A reinforcement learning approach to cooperative problem solving

T. Yoshida; Hirokazu Hori; Shinichi Nakasuka
DOI: https://doi.org/10.1109/icmas.1998.699295
OpenAlex ID: https://openalex.org/W1809265768
Type: proceedings article
Publication date (as indexed by OpenAlex): 2002-11-27
Language: English
Open access: no (closed)
Affiliation (T. Yoshida): Graduate School of Engineering Science, Osaka University, Japan
Citations: 1 (recorded in 2022)
Referenced works: 3
Topics: Application of Genetic Programming in Machine Learning; Reinforcement Learning Algorithms; Methods and Techniques for Agent-Based Modeling
Keywords: Heuristics; Reinforcement Learning; Agent-Based Modeling; Multi-Agent Systems; Dialectical Argumentation; Simulations; Temporal difference learning

Abstract

We propose an extension of reinforcement learning methods to cooperative problem solving in multi-agent systems. Exploiting multiple agents for complex problems is promising; however, learning is necessary because complete domain knowledge is rarely available. The temporal difference algorithm is applied in each agent to learn a heuristic evaluation of states. Besides the reward for solutions produced by agents, we define a reward for the coherence of the system as a whole, and exploit both rewards to facilitate cooperation among agents for global problem solving. We evaluate the method by experiments on the satellite design problem. The results show that our method enables agents to learn to cooperate as well as to learn individual heuristics within one framework. In particular, the agents themselves learn to strike an appropriate balance between exploration and exploitation in problem solving, which is known to greatly affect performance. The results also suggest the possibility of controlling the global behavior of multi-agent systems via rewards in reinforcement learning.
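
The abstract describes each agent applying temporal difference learning to a heuristic evaluation of states, with a reward for its own solutions plus a reward for overall coherence. The record does not give the paper's actual update rule or reward definitions, so the following is only a minimal sketch under assumptions: it uses standard tabular TD(0), and the names `TDAgent` and `combined_reward`, the additive weighting, and all numeric values are hypothetical illustrations rather than the authors' formulation.

```python
from collections import defaultdict


class TDAgent:
    """One agent learning a heuristic state evaluation V(s) with tabular TD(0)."""

    def __init__(self, alpha=0.1, gamma=0.9):
        self.alpha = alpha               # learning rate (assumed value)
        self.gamma = gamma               # discount factor (assumed value)
        self.value = defaultdict(float)  # V(s), defaults to 0.0 for unseen states

    def update(self, state, next_state, reward, terminal=False):
        """Apply one TD(0) update to V(state) and return the TD error."""
        bootstrap = 0.0 if terminal else self.gamma * self.value[next_state]
        td_error = reward + bootstrap - self.value[state]
        self.value[state] += self.alpha * td_error
        return td_error


def combined_reward(solution_reward, coherence_reward, weight=0.5):
    """Hypothetical mix of an agent's own reward and a shared coherence reward."""
    return solution_reward + weight * coherence_reward


# Two agents each finish one episode step; both receive the same coherence reward.
agents = [TDAgent(), TDAgent()]
solution_rewards = [1.0, 0.0]  # made-up per-agent solution rewards
coherence = 2.0                # made-up reward for the joint result

for agent, own_reward in zip(agents, solution_rewards):
    agent.update("s0", "s1", combined_reward(own_reward, coherence), terminal=True)

print([agent.value["s0"] for agent in agents])
```

In this sketch the shared coherence term raises the value estimate of both agents, including the one whose own solution earned nothing, which is one simple way a global reward could steer individual learners toward cooperation. How the two rewards are actually combined in the paper is not stated in this record.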