Simple and Effective Speech Enhancement for Visual Microphone

Juhyun Ahn; Dai‐Jin Kim
Affiliation: POSTECH (Pohang University of Science and Technology), Pohang, Korea
DOI: https://doi.org/10.1109/acpr.2017.41
Published: 2017-11-01 (proceedings article, closed access)
OpenAlex record: https://openalex.org/W2905188205 (cited by 6 as of 2024-09-19)
Primary topic: Speech Enhancement Techniques (Signal Processing)

Abstract

Visual microphone is a technique that recovers sound from a silent video. The simplest way to improve the sound recovery performance of the visual microphone is to apply traditional speech enhancement algorithms, which are based on complicated filter designs or sound models. This paper proposes a simple and effective speech enhancement method for the visual microphone (SEVM) that suppresses spectrum components whose amplitude is smaller than a predefined threshold value. It exploits a unique property of the visual microphone: the recovered sound spectrum is relatively high, while the noise spectrum generated by motion estimation error and damped oscillation is relatively low. The proposed SEVM method can also be easily extended to the multichannel case, in which multiple speech signals are recovered from multiple cameras. Experimental results show that the proposed SEVM method achieves better performance than traditional speech enhancement algorithms in terms of log-likelihood ratio (LLR), signal-to-noise ratio (SNR), segmental SNR (SegSNR), and cepstral distance (CEP). From these results, we conclude that the SEVM method, which is adapted to the visual microphone, is simpler and more effective than traditional speech enhancement methods that are merely applied to the visual microphone as post-processing.
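
The abstract describes SEVM as suppressing spectrum components whose amplitude falls below a predefined threshold. The following is a minimal sketch of that idea only, not the authors' implementation. Several things here are assumptions that do not come from the source: the threshold is taken as a fraction (`threshold_ratio`) of the largest bin magnitude, the whole signal is treated as a single frame, the function names are hypothetical, and the test signal is synthetic. A naive DFT is used so that the example needs only the Python standard library.

```python
import cmath
import math
import random


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]


def suppress_small_components(signal, threshold_ratio=0.2):
    """Zero every spectral bin whose magnitude is below a threshold.

    ASSUMPTION: the threshold is threshold_ratio times the largest bin
    magnitude. The source only says "a predefined threshold value".
    """
    spectrum = dft(signal)
    threshold = threshold_ratio * max(abs(c) for c in spectrum)
    kept = [c if abs(c) >= threshold else 0j for c in spectrum]
    return idft(kept)


def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB of an estimate against a clean reference."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((s - e) ** 2 for s, e in zip(clean, estimate))
    return 10.0 * math.log10(signal_power / noise_power)


# Toy data (not from the paper): a strong tone plus weak broadband noise,
# mimicking "high" signal spectrum and "low" noise spectrum.
random.seed(0)
N = 128
clean = [math.sin(2 * math.pi * 8 * t / N) for t in range(N)]
noisy = [s + random.gauss(0.0, 0.1) for s in clean]
enhanced = suppress_small_components(noisy, threshold_ratio=0.2)

print("SNR before (dB):", snr_db(clean, noisy))
print("SNR after  (dB):", snr_db(clean, enhanced))
```

On this toy signal the tone occupies two strong bins while the noise is spread thinly over all bins, so hard thresholding removes most of the noise energy. Real recovered speech would normally be processed frame by frame, and the threshold would need tuning for the recording setup.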