Machine Translation of Arabic Dialects

Rabih Zbib; Erika Malchiodi; Jacob Devlin; David Stallard; Spyros Matsoukas; Richard Schwartz; John Makhoul; Omar F. Zaidan; Chris Callison-Burch
Abstract

Arabic Dialects present many challenges for machine translation, not least of which is the lack of data resources. We use crowdsourcing to cheaply and quickly build Levantine-English and Egyptian-English parallel corpora, consisting of 1.1M words and 380k words, respectively. The dialectal sentences are selected from a large corpus of Arabic web text, and translated using Amazon's Mechanical Turk. We use this data to build Dialectal Arabic MT systems, and find that small amounts of dialectal data have a dramatic impact on translation quality. When translating Egyptian and Levantine test sets, our Dialectal Arabic MT system performs 6.3 and 7.0 BLEU points higher than a Modern Standard Arabic MT system trained on a 150M-word Arabic-English parallel corpus.

Publication Information

Basic Information

Title: Machine Translation of Arabic Dialects
Type: article (Crossref type: proceedings-article)
Publication date: 2012-06-03
Pages: 49-59
Language: en
OpenAlex ID: https://openalex.org/W137989762
MAG ID: 137989762
DOI: none listed

Access and Citation

Open access: no (OpenAlex status: closed)
Cited by: 163
Field-weighted citation impact (FWCI): 12.815
Citation normalized percentile: 0.976703 (in the top 10%, not in the top 1%)
Cited-by percentile for publication year: 98 to 99
Referenced works: 21
Citations by year: 2012: 5; 2013: 18; 2014: 22; 2015: 16; 2016: 10; 2017: 18; 2018: 13; 2019: 16; 2020: 20; 2021: 9; 2022: 5; 2023: 10
Record created: 2016-06-24; last updated: 2024-09-20

Primary Location

Source: North American Chapter of the Association for Computational Linguistics (conference)
Landing page: https://www.aclweb.org/anthology/N12-1006.pdf

Authors

Rabih Zbib (first author), Raytheon BBN Technologies, Cambridge, MA. ORCID: https://orcid.org/0000-0002-7140-3048
Erika Malchiodi, Raytheon BBN Technologies, Cambridge, MA
Jacob Devlin, Raytheon BBN Technologies, Cambridge, MA
David Stallard, Raytheon BBN Technologies, Cambridge, MA
Spyros Matsoukas, Raytheon BBN Technologies, Cambridge, MA
Richard Schwartz, Raytheon BBN Technologies, Cambridge, MA. ORCID: https://orcid.org/0000-0002-4654-624X
John Makhoul, Raytheon BBN Technologies, Cambridge, MA
Omar F. Zaidan, Microsoft Research, Redmond, WA
Chris Callison-Burch (last author), Johns Hopkins University, Baltimore, MD. ORCID: https://orcid.org/0000-0001-8196-1943

Topics

Statistical Machine Translation and Natural Language Processing (primary topic, score 1.0)
Natural Language Processing (score 0.9981)
Authorship Attribution and User Profiling in Text (score 0.9847)
All three topics fall under subfield Artificial Intelligence, field Computer Science, domain Physical Sciences.

Keywords

Crowdsourcing (0.65138173)
Modern Standard Arabic (0.6069597)
Machine Translation (0.599352)
Neural Machine Translation (0.574264)
Multilingual Neural Machine Translation (0.550569)
Statistical Machine Translation (0.547836)
Syntax-based Translation Models (0.528468)
Parallel corpora (0.5082497)

Related Works

https://openalex.org/W3170253630
https://openalex.org/W2394969460
https://openalex.org/W2251986002
https://openalex.org/W2250816155
https://openalex.org/W2250783208
https://openalex.org/W2250732891
https://openalex.org/W2250414785
https://openalex.org/W2168576900
https://openalex.org/W2164503643
https://openalex.org/W2160802179
https://openalex.org/W2156985047
https://openalex.org/W2156554947
https://openalex.org/W2147272182
https://openalex.org/W2124807415
https://openalex.org/W2109704865
https://openalex.org/W2101105183
https://openalex.org/W2100976324
https://openalex.org/W1957504465
https://openalex.org/W1544567521
https://openalex.org/W131663347