Title: Partially observable Markov decision processes with reward information
Abstract: In a partially observable Markov decision process (POMDP), if the reward can be observed at each step, then the observed reward history contains information about the unknown state. This information, in addition to the information contained in the observation history, can be used to update the state probability distribution. The policy thus obtained is called a reward-information policy (RI-policy); an optimal RI-policy performs no worse than any standard optimal policy that depends only on the observation history. This observation leads to four different problem formulations for POMDPs, depending on whether the reward function is known and whether the reward at each step is observable.
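The abstract describes folding the observed reward into the usual POMDP belief update. Below is a minimal sketch of one such update step, assuming a deterministic reward function R[s, a]; the array names (T, O, R) and the function itself are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def belief_update_with_reward(b, a, o, r, T, O, R, tol=1e-9):
    """One belief-update step that also uses the observed reward (a sketch).

    b : (S,)        current belief over hidden states
    a : int         action taken
    o : int         observation received
    r : float       reward observed at this step
    T : (S, A, S)   transition probabilities T[s, a, s']
    O : (S, A, Z)   observation probabilities O[s', a, o]
    R : (S, A)      assumed deterministic reward R[s, a]
    """
    # Reward information: keep only predecessor states whose reward
    # under action a matches the observed reward r.
    reward_mask = np.isclose(R[:, a], r, atol=tol).astype(float)
    filtered = b * reward_mask              # b(s) * 1[R(s, a) = r]
    # Standard POMDP prediction and observation correction.
    predicted = filtered @ T[:, a, :]       # sum_s b(s) T(s' | s, a)
    unnorm = predicted * O[:, a, o]         # times O(o | s', a)
    z = unnorm.sum()
    if z == 0.0:
        raise ValueError("Observed (o, r) has zero probability under the model")
    return unnorm / z
```

An ordinary (non-RI) belief update is the same computation with the reward mask removed; the mask is where the extra state information from the reward history enters, which is why an optimal RI-policy can do no worse than an optimal policy using observations alone.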
Publication Year: 2004
Publication Date: 2004-01-01
Language: en
Type: article
Indexed In: Crossref
Access and Citation
Cited By Count: 6