Title: Scaling number of cores in GPGPU: A comparative performance analysis
Abstract: Graphics Processing Units (GPUs), based on the Single Instruction Multiple Thread (SIMT) architecture, are emerging as more efficient than Multiple Instruction Multiple Data (MIMD) architectures in exploiting parallelism. A GPU has numerous shader cores and thousands of simultaneously active fine-grained threads. These threads are grouped into Cooperative Thread Arrays (CTAs), and the threads within a CTA are further grouped into warps. Although all warps of a CTA are scheduled for execution on the same core, hardware constraints allow only one warp to execute at a time. A GPU therefore also exploits parallelism by employing multiple shader cores to execute multiple warps simultaneously. We explore this latter form of parallelism by increasing the number of cores and studying its impact on different types of applications. We first categorize a number of general-purpose GPU workloads into those that consume less DRAM bandwidth (type-L) and those with heavier bandwidth requirements (type-H). We observe that type-L workloads gain performance as the number of cores increases, whereas type-H workloads suffer performance degradation. The maximum performance gain, in terms of instructions per cycle (IPC), is 2.03x for type-L workloads. We then examine the impact of scaling on the percentage of good cycles for all workloads. Our results show that the additional bandwidth pressure caused by scaling the number of shader cores is detrimental to type-H workloads but boosts type-L workloads, at the cost of a reduced percentage of good cycles in both types.
Publication Year: 2015
Publication Date: 2015-08-01
Language: en
Type: article
Indexed In: ['crossref']