SEED SELECTION BASED WEB CRAWLER FOR WEB PAGE CLASSIFICATION: A SURVEY

Vikash Kumar; Yogadhar Pandey
Journal: International Journal of Innovations in Engineering Research and Technology, vol. 8, issue 09, pp. 140-149
Publication date: 2021-09-18
Type: journal article (language: en)
DOI: https://doi.org/10.17605/osf.io/2nm7r
Landing page: https://repo.ijiert.org/index.php/ijiert/article/view/2893
OpenAlex ID: https://openalex.org/W3201526811
Open access: closed
Affiliations: Vikash Kumar, M. Tech Scholar, Computer Science & Engineering, Technocrats Institute of Technology, Bhopal (M.P.), India; Yogadhar Pandey (ORCID: https://orcid.org/0000-0003-2770-8845), Professor, Computer Science & Engineering, Technocrats Institute of Technology, Bhopal (M.P.), India
Topics: Web Data Mining and Analysis (score 0.9998); Spam and Phishing Detection (score 0.9817); Advanced Malware Detection Techniques (score 0.9695)
Keywords: Web crawler; Focused crawler
Referenced works: 13
Cited by: 0 (record updated 2024-12-06)

Abstract: A web search engine is a three-phase method, of which the first phase is web crawling. A web crawler collects data for search engines. It collects pages by following the hyperlinks that exist on web pages, starting from a set of seed URLs defined in the initial stage of the crawling process. The search engine runs its ranking algorithm on the set of web pages covered by the crawlers, so the search engine's results typically depend on web crawler coverage. Much research has already been done in the web crawler area. Different web crawling strategies, resulting from different ways of ordering the URLs in the frontier, can explore the web in different ways. Researchers have also studied the issues and challenges that web crawlers face. According to the literature, web crawler performance depends on the seed URLs. This paper gives a bird's-eye view of web crawling methods, recent research on them, and possible research gaps.
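The abstract's two central claims, that the ordering of URLs in the frontier changes how the web is explored and that coverage depends on the seed URLs, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `LINKS` graph is a hypothetical in-memory stand-in for the web (no network access), and the function names and the `score` heuristic are invented.

```python
import heapq
from collections import deque

# Hypothetical link graph standing in for the web: page -> outgoing links.
LINKS = {
    "seed-a": ["a1", "a2"],
    "seed-b": ["b1"],
    "a1": ["a3"],
    "a2": [],
    "a3": [],
    "b1": ["a3", "b2"],
    "b2": [],
}


def crawl_fifo(seeds, limit):
    """Breadth-first crawl: the frontier is a FIFO queue."""
    frontier = deque(seeds)
    seen = set(seeds)
    visited = []
    while frontier and len(visited) < limit:
        url = frontier.popleft()
        visited.append(url)
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


def crawl_priority(seeds, limit, score):
    """Best-first crawl: the frontier is a priority queue ordered by score."""
    frontier = []
    seen = set(seeds)
    counter = 0  # tie-breaker so equal scores pop in insertion order
    for url in seeds:
        heapq.heappush(frontier, (-score(url), counter, url))
        counter += 1
    visited = []
    while frontier and len(visited) < limit:
        _, _, url = heapq.heappop(frontier)
        visited.append(url)
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score(link), counter, link))
                counter += 1
    return visited


def prefers_b(url):
    """Hypothetical relevance score: favour URLs containing the letter 'b'."""
    return 1 if "b" in url else 0


if __name__ == "__main__":
    seeds = ["seed-a", "seed-b"]
    print(crawl_fifo(seeds, 5))                 # same seeds, FIFO ordering
    print(crawl_priority(seeds, 5, prefers_b))  # same seeds, scored ordering
    print(crawl_fifo(["seed-a"], 5))            # fewer seeds, smaller coverage
```

With the same seeds and the same page budget, the two orderings visit different pages, and dropping `seed-b` leaves the `b` pages unreachable in this toy graph. A real crawler would additionally need fetching, parsing, politeness rules, and URL normalisation, none of which are sketched here.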