Exploring ASR-free end-to-end modeling to improve spoken language understanding in a cloud-based dialog system

Yao Qian; Rutuja Ubale; Vikram Ramanarayanan; Patrick Lange; David Suendermann-Oeft; Keelan Evanini; Eugene Tsuprun
OpenAlex ID: https://openalex.org/W2786839803
DOI: https://doi.org/10.1109/asru.2017.8268987
Venue: 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Type: proceedings article
Publication date: 2017-12-01
Language: en
Open access: closed
Author ORCIDs: Yao Qian, https://orcid.org/0000-0003-1855-9630; Patrick Lange, https://orcid.org/0000-0003-3935-663X; Keelan Evanini, https://orcid.org/0000-0003-4243-3376
Cited by: 64 (2018: 2; 2019: 9; 2020: 19; 2021: 24; 2022: 7; 2023: 2)
FWCI: 6.186
Referenced works: 44
Primary topic: Dialogue Act Modeling for Spoken Language Systems (subfield: Artificial Intelligence; field: Computer Science)
Other topics: Speech Recognition Technology; Natural Language Processing
Keywords: Dialog system; Spoken language; Natural language understanding; Spoken Dialogue Systems; End-to-End Speech Recognition; Automatic Speech Recognition; Statistical Language Modeling; Acoustic Modeling
Sustainable Development Goal: Quality education (score 0.82)
Record created: 2018-02-23; updated: 2024-09-18

Abstract: Spoken language understanding (SLU) in dialog systems is generally performed using a natural language understanding (NLU) model based on the hypotheses produced by an automatic speech recognition (ASR) system. However, when new spoken dialog applications are built from scratch in real user environments that often have sub-optimal audio characteristics, ASR performance can suffer due to factors such as the paucity of training data or a mismatch between the training and test data. To address this issue, this paper proposes an ASR-free, end-to-end (E2E) modeling approach to SLU for a cloud-based, modular spoken dialog system (SDS). We evaluate the effectiveness of our approach on crowdsourced data collected from non-native English speakers interacting with a conversational language learning application. Experimental results show that our approach is particularly promising in situations with low ASR accuracy. It can further improve the performance of a sophisticated CNN-based SLU system with more accurate ASR hypotheses by fusing the scores from E2E system, i.e., the overall accuracy of SLU is improved from 85.6% to 86.5%.
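
The abstract reports that fusing the scores of the E2E system with those of the CNN-based SLU system raised overall SLU accuracy from 85.6% to 86.5%, but it does not say how the scores were combined. The sketch below is only an illustration, assuming a simple weighted linear interpolation of per-class scores. The function names, the weight, and the example scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_scores(
    asr_based: Sequence[float],
    e2e: Sequence[float],
    weight: float = 0.5,
) -> List[float]:
    """Linearly interpolate two per-class score vectors.

    This fusion rule is an assumption made for illustration; the abstract
    does not describe the rule actually used.
    """
    if len(asr_based) != len(e2e):
        raise ValueError("score vectors must cover the same classes")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * a + (1.0 - weight) * b for a, b in zip(asr_based, e2e)]


def predict(scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Made-up scores over three hypothetical classes for a single utterance.
asr_based_scores = [0.50, 0.40, 0.10]  # ASR + NLU pipeline (assumed values)
e2e_scores = [0.20, 0.70, 0.10]        # ASR-free E2E model (assumed values)

fused = fuse_scores(asr_based_scores, e2e_scores, weight=0.5)
print(predict(asr_based_scores), predict(fused))
```

In this toy case the ASR-based system alone picks class 0, while the fused scores pick class 1, which shows how a second score source can change a decision. In practice the interpolation weight would presumably be tuned on held-out data.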