Improving the Shuffle of Hadoop MapReduce

Jingui Li; Xuelian Lin; Xiaolong Cui; Yue Ye
DOI: https://doi.org/10.1109/cloudcom.2013.42
OpenAlex ID: https://openalex.org/W2077714512
Type: proceedings article
Publication date: 2013-12-01
Language: English
Affiliation (all authors): School of Computer Science and Engineering, Beihang University, Beijing, China
ORCID: Jingui Li, 0000-0002-5636-6699; Xuelian Lin, 0000-0002-9877-7162; Xiaolong Cui, 0000-0003-2235-0602
Open access: no (closed)
Cited by: 12 works (OpenAlex record updated 2024-08-13)
Referenced works: 13
Topics: Cloud Computing and Big Data Technologies; Distributed Grid Computing Systems; Distributed Storage Systems and Network Coding
Keywords: MapReduce; Hadoop; Task Scheduling; Parallel Computing

Abstract

As an efficient parallel computing system based on the MapReduce model, Hadoop is widely used for large-scale data analysis such as data mining, machine learning, and scientific simulation. However, MapReduce still has performance problems, especially in the shuffle phase. To address these problems, this paper proposes a lightweight, individual shuffle service component with a more efficient I/O policy to replace the existing shuffle phase in MapReduce. We also describe how to implement the shuffle service in three steps: extract the shuffle from the reduce task as a shuffle task, reconstruct the shuffle task as a service, and improve the I/O scheduling policy on the map side. Furthermore, both simulated experiments and comparative studies of MapReduce jobs are conducted to evaluate the performance of our improvements. The results show that our approach can decrease a job's overall execution time and make full use of cluster resources.
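
For readers unfamiliar with the shuffle phase that the abstract refers to, the following is a minimal, generic sketch of what a MapReduce shuffle does: it routes each map output pair to a reducer by hashing the key, then groups and sorts the values per key. It is not the shuffle service proposed in the paper, whose implementation is not described on this page. The function names `partition` and `shuffle`, the sample data, and the use of CRC32 as a stable hash are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def partition(key: str, num_reducers: int) -> int:
    """Pick a reducer for a key using a stable hash (CRC32 is an illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(map_outputs, num_reducers):
    """Group (key, value) pairs from all map tasks by destination reducer, then by key."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for task_output in map_outputs:
        for key, value in task_output:
            buckets[partition(key, num_reducers)][key].append(value)
    # Each reducer receives its keys in sorted order, mimicking the sort/merge step.
    return [sorted(bucket.items()) for bucket in buckets]


# Hypothetical output of two map tasks in a word-count job.
map_outputs = [
    [("hadoop", 1), ("shuffle", 1)],
    [("shuffle", 1), ("reduce", 1)],
]

reducer_inputs = shuffle(map_outputs, num_reducers=2)

# A trivial reduce step: sum the grouped values for each key.
counts = {key: sum(values) for part in reducer_inputs for key, values in part}
```

In this sketch every value for a given key ends up at the same reducer, which is the property the shuffle has to guarantee. In a real cluster the same routing involves disk and network I/O between map and reduce nodes, which is the cost the paper targets.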