Solving H-horizon, stationary Markov decision problems in time proportional to log(H)

Paul Tseng
Operations Research Letters, vol. 9, no. 5, pp. 287-297, published 1990-09-01 (Elsevier BV)
DOI: https://doi.org/10.1016/0167-6377(90)90022-w
OpenAlex: https://openalex.org/W2155640779
Open-access PDF (submitted version): http://dspace.mit.edu/bitstream/1721.1/3070/1/P-1793-19477231.pdf
Affiliation: Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Cited by: 78 (record updated 2024-09-18)

Abstract

We consider the H-horizon, stationary Markov decision problem. For the discounted case, we give an ε-approximation algorithm whose time is proportional to log(1/ε), log(H) and 1/(1 − α). For problems where α is bounded away from 1, we obtain, respectively, a fully polynomial approximation scheme and a polynomial-time algorithm. For the undiscounted case, by refining a weighted maximum norm contraction result of Hoffman, we derive analogous results under the assumption that all stationary policies are proper.
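
The dependence on log(1/ε) and 1/(1 − α) stated in the abstract for the discounted case can be motivated by the standard contraction property of value iteration. The sketch below is not the algorithm from the paper. It only illustrates, on a made-up two-state problem, that after k Bellman backups the gap to the H-horizon optimal cost is at most α^k · g_max / (1 − α), where g_max bounds the absolute one-stage cost, so the number of backups needed for accuracy ε does not grow with H. The function names, the transition probabilities P, the costs g, and the discount factor alpha are all hypothetical choices made for illustration, and a zero terminal cost is assumed.

```python
import math


def bellman_backup(J, P, g, alpha):
    """One backup: (TJ)(i) = min_u [ g[i][u] + alpha * sum_j P[i][u][j] * J[j] ]."""
    n = len(J)
    return [
        min(
            g[i][u] + alpha * sum(P[i][u][j] * J[j] for j in range(n))
            for u in range(len(g[i]))
        )
        for i in range(n)
    ]


def horizon_cost(H, P, g, alpha):
    """Optimal H-horizon cost with zero terminal cost, computed by H backups."""
    J = [0.0] * len(g)
    for _ in range(H):
        J = bellman_backup(J, P, g, alpha)
    return J


def iterations_for_accuracy(eps, alpha, g_max):
    """Smallest k such that alpha**k * g_max / (1 - alpha) <= eps."""
    bound = g_max / (1.0 - alpha)
    if bound <= eps:
        return 0
    return math.ceil(math.log(bound / eps) / math.log(1.0 / alpha))


# Hypothetical two-state, two-action problem (illustration only).
# P[i][u][j] is the probability of moving from state i to state j under action u.
P = [
    [[0.5, 0.5], [0.9, 0.1]],
    [[0.2, 0.8], [0.0, 1.0]],
]
# g[i][u] is the one-stage cost of action u in state i; here g_max = 2.0.
g = [[1.0, 2.0], [0.5, 1.5]]
alpha = 0.9

if __name__ == "__main__":
    H = 200
    k = iterations_for_accuracy(1e-6, alpha, g_max=2.0)
    exact = horizon_cost(H, P, g, alpha)
    truncated = horizon_cost(k, P, g, alpha)
    gap = max(abs(a - b) for a, b in zip(exact, truncated))
    print(f"backups used: {k} of {H}, max gap: {gap:.3e}")
```

Because log(1/α) ≥ 1 − α, the value returned by `iterations_for_accuracy` is at most log(g_max / ((1 − α) ε)) / (1 − α), which is where a bound of this shape picks up its 1/(1 − α) and log(1/ε) factors.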