MetaReflection: Learning Instructions for Language Agents using Past Reflections. Gupta, P., Kirtania, S., Singha, A., Gulwani, S., Radhakrishna, A., Soares, G., & Shi, S. In Al-Onaizan, Y., Bansal, M., & Chen, Y.-N., editors, Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 8369–8385, Miami, Florida, USA, November 2024. Association for Computational Linguistics.
The popularity of Large Language Models (LLMs) has unleashed a new age of Language Agents for solving a diverse range of tasks. While contemporary frontier LLMs are capable enough to power reasonably good Language Agents, the closed-API model makes it hard to improve them in cases where they perform sub-optimally. To address this, recent works have explored techniques to improve their performance using self-reflection and prompt optimization. While techniques like self-reflection work well in an online setup, contemporary prompt optimization techniques are designed to work on simpler tasks. To bridge this gap, we introduce METAREFLECTION, a novel offline reinforcement learning technique that enhances the performance of Language Agents by augmenting a semantic memory based on experiential learnings from past trials. We demonstrate the efficacy of METAREFLECTION by evaluating across multiple domains, including complex logical reasoning, biomedical semantic similarity, open-world question answering, and vulnerability threat detection in Infrastructure-as-Code, with different agent designs. METAREFLECTION boosts Language Agents' performance by 4% to 16.82% over the raw GPT-4 baseline and performs on par with existing state-of-the-art prompt optimization techniques while requiring fewer LLM calls.
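As a rough illustration of the loop the abstract describes, below is a minimal Python sketch of offline reflection distillation: run the agent over training tasks, collect LLM-generated reflections on failed trials, and fold them into a natural-language instruction memory that the agent reuses at test time. All names here (agent, reflect, distill, outcome fields) and the loop structure are hypothetical assumptions for illustration, not the authors' implementation.

def learn_instructions(agent, reflect, distill, train_tasks, epochs=3):
    """Offline loop: distill reflections from failed trials into a
    semantic memory of instructions the agent reuses at test time."""
    memory = []                                # learned instructions so far
    for _ in range(epochs):
        reflections = []
        for task in train_tasks:
            outcome = agent(task, memory)      # agent conditioned on current memory
            if not outcome.success:            # only failed trials yield new lessons
                reflections.append(reflect(task, outcome.trace))
        if not reflections:                    # nothing left to learn from; stop early
            break
        memory = distill(memory, reflections)  # merge into concise, general instructions
    return memory

At evaluation time the learned memory would simply be concatenated into the agent's prompt, consistent with the abstract's point that the closed-API model rules out improving the underlying weights.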
@inproceedings{gupta_metareflection_2024,
	address = {Miami, Florida, USA},
	title = {{MetaReflection}: {Learning} {Instructions} for {Language} {Agents} using {Past} {Reflections}},
	shorttitle = {{MetaReflection}},
	url = {https://aclanthology.org/2024.emnlp-main.477/},
	doi = {10.18653/v1/2024.emnlp-main.477},
	abstract = {The popularity of Large Language Models (LLMs) has unleashed a new age of Language Agents for solving a diverse range of tasks. While contemporary frontier LLMs are capable enough to power reasonably good Language Agents, the closed-API model makes it hard to improve them in cases where they perform sub-optimally. To address this, recent works have explored techniques to improve their performance using self-reflection and prompt optimization. While techniques like self-reflection work well in an online setup, contemporary prompt optimization techniques are designed to work on simpler tasks. To bridge this gap, we introduce METAREFLECTION, a novel offline reinforcement learning technique that enhances the performance of Language Agents by augmenting a semantic memory based on experiential learnings from past trials. We demonstrate the efficacy of METAREFLECTION by evaluating across multiple domains, including complex logical reasoning, biomedical semantic similarity, open-world question answering, and vulnerability threat detection in Infrastructure-as-Code, with different agent designs. METAREFLECTION boosts Language Agents' performance by 4\% to 16.82\% over the raw GPT-4 baseline and performs on par with existing state-of-the-art prompt optimization techniques while requiring fewer LLM calls.},
	urldate = {2025-02-06},
	booktitle = {Proceedings of the 2024 {Conference} on {Empirical} {Methods} in {Natural} {Language} {Processing}},
	publisher = {Association for Computational Linguistics},
	author = {Gupta, Priyanshu and Kirtania, Shashank and Singha, Ananya and Gulwani, Sumit and Radhakrishna, Arjun and Soares, Gustavo and Shi, Sherry},
	editor = {Al-Onaizan, Yaser and Bansal, Mohit and Chen, Yun-Nung},
	month = nov,
	year = {2024},
	keywords = {Agents},
	pages = {8369--8385},
}
