Refining Word Alignment with Discriminative Training

Nadi Tomeh; Alexandre Allauzen; François Yvon; Guillaume Wisniewski
Venue: Conference of the Association for Machine Translation in the Americas (AMTA), 2010
Type: proceedings article
Landing page: https://aclanthology.org/2010.amta-papers.18/
OpenAlex record: https://openalex.org/W3170138813 (6 citations as of the 2024-09-15 record update; 23 referenced works)

Affiliations as listed in the record:
Nadi Tomeh: Xerox (France), Saint-Denis, France
Alexandre Allauzen: Université Paris-Sud, Orsay, France
François Yvon: Centre National de la Recherche Scientifique, Paris, France (ORCID: https://orcid.org/0000-0002-7972-7442)
Guillaume Wisniewski: no affiliation listed (ORCID: https://orcid.org/0000-0002-4445-080X)

Primary topic: Statistical Machine Translation and Natural Language Processing

Abstract

The quality of statistical machine translation systems depends on the quality of the word alignments that are computed during the translation model training phase. IBM alignment models, as implemented in the GIZA++ toolkit, constitute the de facto standard for performing these computations. The resulting alignments and translation models are however very noisy, and several authors have tried to improve them. In this work, we propose a simple and effective approach, which considers alignment as a series of independent binary classification problems in the alignment matrix. Through extensive feature engineering and the use of stacking techniques, we were able to obtain alignments much closer to manually defined references than those obtained by the IBM models. These alignments also yield better translation models, delivering improved performance in a large scale Arabic to English translation task.
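
The abstract frames word alignment as a series of independent binary classification problems over the cells of the alignment matrix, with first-stage alignments fed back in through stacking. A minimal sketch of that framing might look as follows. The two features (distance from the diagonal and a first-stage link indicator), the logistic-regression learner, the toy data, and all function names are illustrative assumptions; they are not the feature set or classifier reported in the paper.

```python
import math

def cell_features(i, j, base_links):
    """Features for one cell (i, j) of the alignment matrix (hypothetical choices)."""
    distortion = float(abs(i - j))                  # distance from the diagonal
    base = 1.0 if (i, j) in base_links else 0.0     # stacking input: first-stage aligner's vote
    return [1.0, distortion, base]                  # leading 1.0 acts as the bias term

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(w, x):
    return sigmoid(sum(wk * xk for wk, xk in zip(w, x)))

def train(examples, epochs=200, lr=0.5):
    """Logistic regression fitted by stochastic gradient ascent on the log-likelihood."""
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for x, y in examples:
            p = score(w, x)
            for k in range(len(w)):
                w[k] += lr * (y - p) * x[k]
    return w

def predict_alignment(w, src_len, tgt_len, base_links, threshold=0.5):
    """Classify every cell independently and return the set of predicted links."""
    return {
        (i, j)
        for i in range(src_len)
        for j in range(tgt_len)
        if score(w, cell_features(i, j, base_links)) >= threshold
    }

# Toy sentence pair: the gold alignment is the diagonal; the first-stage
# alignment misses (2, 2) and proposes a spurious link (0, 2).
src_len, tgt_len = 3, 3
gold = {(0, 0), (1, 1), (2, 2)}
base_links = {(0, 0), (1, 1), (0, 2)}

examples = [
    (cell_features(i, j, base_links), 1 if (i, j) in gold else 0)
    for i in range(src_len)
    for j in range(tgt_len)
]
w = train(examples)
predicted = predict_alignment(w, src_len, tgt_len, base_links)
```

Because each cell is decided on its own, nothing in this sketch enforces one-to-one links or any other structural constraint on the matrix; what it buys is the freedom to add arbitrary per-cell features, including the output of other aligners.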