An efficient k-means clustering algorithm: analysis and implementation

Tapas Kanungo; David M. Mount; Nathan S. Netanyahu; Christine Piatko; Ruth Silverman; Angela Y. Wu
DOI: https://doi.org/10.1109/tpami.2002.1017616
OpenAlex ID: https://openalex.org/W2161160262 (MAG: 2161160262)
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881-892
Publisher: IEEE Computer Society (Institute of Electrical and Electronics Engineers)
ISSN: 0162-8828 (also 1939-3539, 2160-9292)
Publication date: 2002-07-01
Type: journal article
Language: en
Open access: closed (no open-access location listed)
Affiliation on record: Tapas Kanungo, Almaden Res. Center, San Jose, CA, USA (IBM, United States)
ORCID: David M. Mount 0000-0002-3290-8932; Nathan S. Netanyahu 0000-0001-6648-9441; Christine Piatko 0000-0002-5295-9112; Angela Y. Wu 0000-0002-1993-284X
Cited by: 5450 (OpenAlex record updated 2025-01-18); FWCI 26.682; in the top 1 percent by normalized citation percentile
Citations by year: 2024: 229; 2023: 295; 2022: 351; 2021: 426; 2020: 413; 2019: 456; 2018: 434; 2017: 373; 2016: 339; 2015: 316; 2014: 315; 2013: 292; 2012: 284
Referenced works: 58
Primary topic: Data Management and Algorithms (subfield: Signal Processing; field: Computer Science; domain: Physical Sciences)
Other topics: Advanced Clustering Algorithms Research; Advanced Data Compression Techniques
Keywords: Data stream clustering; Algorithm design; Data point
Concepts: Cluster analysis; Computer science; Algorithm; Canopy clustering algorithm; Data stream clustering; CURE data clustering algorithm; Ramer–Douglas–Peucker algorithm; Data compression; Algorithm design; Data point; Fuzzy clustering; Artificial intelligence

Abstract

In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. A popular heuristic for k-means clustering is Lloyd's (1982) algorithm. We present a simple and efficient implementation of Lloyd's k-means clustering algorithm, which we call the filtering algorithm. This algorithm is easy to implement, requiring a kd-tree as the only major data structure. We establish the practical efficiency of the filtering algorithm in two ways. First, we present a data-sensitive analysis of the algorithm's running time, which shows that the algorithm runs faster as the separation between clusters increases. Second, we present a number of empirical studies both on synthetically generated data and on real data sets from applications in color quantization, data compression, and image segmentation.
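
The abstract states the k-means objective (minimize the mean squared distance from each data point to its nearest center) and names Lloyd's algorithm as the heuristic that the paper's filtering algorithm speeds up. As a rough illustration only, the Python sketch below runs plain Lloyd iterations with a brute-force nearest-center search. It is not the authors' kd-tree filtering implementation. The names `lloyd_kmeans`, `mean_squared_distance`, `initial_centers`, `max_iterations`, and `seed` are assumptions made for this example and do not come from the record.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_squared_distance(points, centers):
    """The k-means objective: mean squared distance to the nearest center."""
    total = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return total / len(points)


def lloyd_kmeans(points, k, initial_centers=None, max_iterations=100, seed=0):
    """Plain Lloyd iterations (brute force, no kd-tree filtering).

    points: list of equal-length tuples of floats.
    initial_centers: optional list of k tuples; if omitted, k data points
    are sampled at random (an assumption of this sketch).
    """
    if initial_centers is None:
        rng = random.Random(seed)
        centers = [tuple(c) for c in rng.sample(points, k)]
    else:
        centers = [tuple(c) for c in initial_centers]

    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centers[j]))
            clusters[nearest].append(p)

        # Update step: move each center to the centroid of its cluster.
        new_centers = []
        for j, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                centroid = tuple(
                    sum(m[i] for m in members) / len(members) for i in range(dim)
                )
                new_centers.append(centroid)
            else:
                # Keep the old center if no point was assigned to it.
                new_centers.append(centers[j])

        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    found = lloyd_kmeans(data, 2, initial_centers=[(0.0, 0.0), (10.0, 10.0)])
    print(found, mean_squared_distance(data, found))
```

Each iteration of this baseline compares every point with every center, so it costs on the order of n·k distance computations. According to the abstract, the filtering algorithm avoids much of that work by using a kd-tree as its only major data structure.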