Handling Real-Word Errors of Hindi Language using N-gram and Confusion Set

Shashank Singh; Shailendra Singh
Type: proceedings article
Published: 2019-02-01
DOI: https://doi.org/10.1109/aicai.2019.8701394
OpenAlex: https://openalex.org/W2943596339
Affiliation (both authors): Department of CSE, Punjab Engineering College, Chandigarh, India
ORCID: Shashank Singh, https://orcid.org/0000-0002-7305-673X; Shailendra Singh, https://orcid.org/0000-0002-1761-5648
Access: closed
Cited by: 4 (record updated 2024-09-18)
Referenced works: 12
Primary topic: Statistical Machine Translation and Natural Language Processing
Keywords: Bigram; n-gram; Complex Word Identification

Abstract

Non-word errors and real-word errors are the two major classes of typographic error in any language. Non-word errors have been studied extensively, while real-word errors have received far less attention. This paper proposes a technique for identifying real-word errors in Hindi that combines bigram, trigram and confusion set (CS) methods. For each Hindi test word, the left bigram pairs the word with its immediate left neighbour, the right bigram pairs it with its immediate right neighbour, and the trigram spans the immediate left word, the test word and the immediate right word. A confusion set of the most confusable words is created using the Levenshtein edit distance. A composite score is then calculated for every member of the CS from its bigram and trigram probabilities, and this score is used to prepare the suggestion list for the erroneous word. The method was evaluated on a Hindi text file of 2000 words, giving a precision of 0.70-0.75, a recall of 0.80-0.85 and an F-score of 0.70-0.80. Future work could improve these results by considering the whole sentence at once.
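
The pipeline described in the abstract (a Levenshtein-based confusion set, followed by left-bigram, right-bigram and trigram scoring) can be sketched roughly as follows. This is a minimal Python illustration and not the authors' implementation. The abstract does not give the composite-score formula, the edit-distance threshold or the probability estimates, so the plain sum of three relative-frequency estimates, the `max_distance=1` cutoff, the three-sentence toy corpus and all function names here are assumptions.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance over Unicode code points (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def build_counts(sentences):
    """Count unigrams, bigrams and trigrams in tokenized sentences."""
    uni, bi, tri = Counter(), Counter(), Counter()
    for tokens in sentences:
        uni.update(tokens)
        bi.update(zip(tokens, tokens[1:]))
        tri.update(zip(tokens, tokens[1:], tokens[2:]))
    return uni, bi, tri


def ratio(num: int, den: int) -> float:
    """Relative frequency, defined as 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def suggest(left, word, right, counts, max_distance=1):
    """Rank confusion-set members for `word` given its two neighbours."""
    uni, bi, tri = counts
    # Confusion set: vocabulary words within max_distance edits (assumed cutoff).
    confusion_set = [v for v in uni if levenshtein(word, v) <= max_distance]
    scored = []
    for cand in confusion_set:
        left_bigram = ratio(bi[(left, cand)], uni[left])
        right_bigram = ratio(bi[(cand, right)], uni[cand])
        trigram = ratio(tri[(left, cand, right)], bi[(left, cand)])
        # Assumed composite score: unweighted sum of the three estimates.
        scored.append((cand, left_bigram + right_bigram + trigram))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy corpus (hypothetical); a real system would train on a large Hindi corpus.
corpus = [
    "आज का दिन अच्छा है".split(),
    "कल का दिन अच्छा था".split(),
    "वह दीन आदमी है".split(),
]
counts = build_counts(corpus)

# The typed sentence is "आज का दीन अच्छा है"; the test word is दीन.
suggestions = suggest("का", "दीन", "अच्छा", counts)
print(suggestions)
```

In this toy corpus the candidate दिन ("day") ranks above the typed word दीन ("poor"), because only दिन has been seen between का and अच्छा. Keeping the typed word in its own confusion set lets the sketch leave a word unchanged when it already scores best. A practical version would likely need smoothing for unseen n-grams and care with Unicode normalization of Devanagari text.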