First steps towards statistical modeling of dialogue to predict the speech act type of the next utterance

Masaaki Nagata; Tsuyoshi Morimoto
Journal: Speech Communication (Elsevier BV), volume 15, issue 3-4, pages 193-203
Published: 1994-12-01
DOI: https://doi.org/10.1016/0167-6393(94)90071-x
OpenAlex ID: https://openalex.org/W2016971098
Type: journal article (closed access)
Affiliations: NTT Information and Communication Systems Laboratories, 1–2356 Take, Yokosuka-shi, Kanagawa, 238-03 Japan (Nagata); ATR Interpreting Telecommunications Research Laboratories, 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-02 Japan (Morimoto)
Cited by: 72 works (OpenAlex record updated 2024-08-31)
Referenced works: 11
Topics: Dialogue Act Modeling for Spoken Language Systems; Natural Language Processing; Speech Recognition Technology
Keywords: Perplexity; Bigram; Utterance; Statistical Language Modeling; Topic Modeling; Spoken Dialogue Systems; Acoustic Modeling; Word Representation

Abstract

We propose a statistical dialogue modeling method based on the information theory and the speech act theory. The dialogue model consists of a trigram of utterances classified by their speech act. It can be used to rule out erroneous speech recognition candidates that are syntactically and semantically correct, but contextually incorrect, by examining whether the utterance candidates form a natural local discourse in terms of speech act sequencing. Since it is based on the information theory, we can define objective measures for the quality of the dialogue model, such as discourse perplexity. We show that the dialogue model can predict the speech act type of the next utterance by experiments on 100 keyboard dialogues, that include 2,722 utterances and 38,954 words. It achieves 39.7% prediction accuracy for the top candidate and 61.7% for the top three candidates, when 90 dialogues were used for training and the remaining 10 dialogues were used for testing. We also show that we can make a better language model by combining the dialogue model with a sentence model. The word perplexity of word bigram with speech act type trigram is 7.27, while that of simple word bigram is 11.6, when the word perplexity of the language models is computed using the 100 keyboard dialogues.
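
The abstract describes a trigram model over speech act types that ranks candidates for the next utterance's speech act and is scored by perplexity. The following is a minimal sketch of that idea, not the authors' implementation: the speech act labels, the toy dialogues, the `<s>` padding symbol, and the add-one smoothing are all assumptions made for illustration, since the record does not state the paper's label set or estimation method.

```python
import math
from collections import Counter, defaultdict

BOS = "<s>"  # hypothetical padding symbol marking the start of a dialogue


def train_trigram(dialogues):
    """Count speech-act trigrams; each dialogue is a list of speech-act labels."""
    counts = defaultdict(Counter)
    labels = set()
    for acts in dialogues:
        padded = [BOS, BOS] + list(acts)
        for a, b, c in zip(padded, padded[1:], padded[2:]):
            counts[(a, b)][c] += 1
            labels.add(c)
    return counts, sorted(labels)


def prob(counts, labels, history, act):
    """P(act | two previous acts) with add-one smoothing (an illustrative choice)."""
    ctx = counts.get(history, Counter())
    return (ctx[act] + 1) / (sum(ctx.values()) + len(labels))


def predict_top_k(counts, labels, history, k=3):
    """Return the k most probable speech-act types for the next utterance."""
    ranked = sorted(labels, key=lambda act: -prob(counts, labels, history, act))
    return ranked[:k]


def perplexity(counts, labels, dialogues):
    """Perplexity of the speech-act sequences: 2 ** (average negative log2 probability)."""
    log_sum, n = 0.0, 0
    for acts in dialogues:
        padded = [BOS, BOS] + list(acts)
        for a, b, c in zip(padded, padded[1:], padded[2:]):
            log_sum += math.log2(prob(counts, labels, (a, b), c))
            n += 1
    return 2 ** (-log_sum / n)


# Toy data with hypothetical speech-act labels (not taken from the paper's corpus).
train_dialogues = [
    ["greeting", "request", "inform", "acknowledge", "closing"],
    ["greeting", "question", "inform", "acknowledge", "closing"],
    ["greeting", "request", "inform", "question", "inform", "closing"],
]
held_out = [["greeting", "request", "inform", "closing"]]

counts, labels = train_trigram(train_dialogues)
print(predict_top_k(counts, labels, ("greeting", "request"), k=3))
print(perplexity(counts, labels, held_out))
```

Top-1 and top-3 accuracy, as reported in the abstract, would correspond to checking whether the true next label appears in `predict_top_k(..., k=1)` or `predict_top_k(..., k=3)` over held-out dialogues.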