ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback

Jiacheng Ye; Jiahui Gao; Zhiyong Wu; Jiangtao Feng; Tao Yu; Lingpeng Kong
Affiliations: Shanghai AI Laboratory; University of Washington; The University of Hong Kong
Venue: Findings of the Association for Computational Linguistics: EMNLP 2022 (proceedings article)
DOI: https://doi.org/10.18653/v1/2022.findings-emnlp.269
PDF: https://aclanthology.org/2022.findings-emnlp.269.pdf
arXiv: https://arxiv.org/abs/2210.12329
License: CC BY (open access)
OpenAlex record: https://openalex.org/W4385567101 (last updated 2024-09-24)
Citations: 10 (2023: 9; 2024: 1); referenced works: 49
Topics (OpenAlex): Natural Language Processing; Statistical Machine Translation and Natural Language Processing; Speech Recognition Technology
Keywords (OpenAlex): Baseline; Training set; Sequence-to-Sequence Learning

Abstract

Recently, dataset-generation-based zero-shot learning has shown promising results by training a task-specific model with a dataset synthesized from large pre-trained language models (PLMs). The final task-specific model often achieves comparable or even better performance than PLMs under the zero-shot setting, with orders of magnitude fewer parameters.

However, synthetic datasets have their drawbacks. They have long suffered from low quality (e.g., low informativeness, redundancy). This explains why massive synthetic data does not lead to better performance, a scenario we would expect with human-labeled data. To improve the quality of dataset synthesis, we propose a progressive zero-shot dataset generation framework, ProGen, which leverages feedback from the task-specific model to guide the generation of new training data via in-context examples.

Extensive experiments on five text classification datasets demonstrate the effectiveness of the proposed approach. We also show that ProGen achieves on-par or superior performance with only 1% of the synthetic dataset size, compared with baseline methods without in-context feedback.
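
The abstract describes a loop in which feedback from the task-specific model steers later rounds of generation through in-context examples. The sketch below is only a rough illustration of that control flow, not the authors' implementation: the function names (`progressive_generate`, `toy_generate`, `toy_train`), the parameters, and the scoring rule are all hypothetical stand-ins, and the abstract does not say how examples are actually scored or selected.

```python
import random
from typing import Callable, List, Tuple

Example = Tuple[str, int]  # (text, label); a hypothetical representation


def progressive_generate(
    generate: Callable[[List[Example], int], List[Example]],
    train: Callable[[List[Example]], Callable[[Example], float]],
    rounds: int,
    batch_size: int,
    num_in_context: int,
) -> List[Example]:
    """Grow a synthetic dataset round by round, feeding back high-scoring examples."""
    dataset: List[Example] = []
    in_context: List[Example] = []  # empty in round one: plain zero-shot prompting
    for _ in range(rounds):
        # 1. Ask the generator for new examples, conditioned on the current feedback.
        dataset.extend(generate(in_context, batch_size))
        # 2. Train the task-specific model on everything generated so far;
        #    `train` returns a function that scores each example (assumed interface).
        score = train(dataset)
        # 3. Feed the highest-scoring examples back as in-context demonstrations.
        in_context = sorted(dataset, key=score, reverse=True)[:num_in_context]
    return dataset


# Toy stand-ins so the sketch runs without any language model.
def toy_generate(in_context: List[Example], n: int) -> List[Example]:
    rng = random.Random(len(in_context))
    return [(f"sample-{rng.randint(0, 999)}", rng.randint(0, 1)) for _ in range(n)]


def toy_train(dataset: List[Example]) -> Callable[[Example], float]:
    # Placeholder "quality" score: text length. A real scorer would use the trained model.
    return lambda example: float(len(example[0]))


synthetic = progressive_generate(
    toy_generate, toy_train, rounds=3, batch_size=4, num_in_context=2
)
print(len(synthetic))  # 3 rounds x 4 examples = 12
```

In a real setting, `generate` would prompt a PLM with the selected examples and `train` would fit a small classifier; both are passed in as callables here so the loop itself stays independent of any particular model or library.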