UZH_CLyp at SemEval-2023 Task 9: Head-First Fine-Tuning and ChatGPT Data Generation for Cross-Lingual Learning in Tweet Intimacy Prediction. Michail, A., Konstantinou, S., & Clematide, S. arXiv.org, March 2023.
This paper describes the submission of UZH_CLyp for the SemEval 2023 Task 9 "Multilingual Tweet Intimacy Analysis". We achieved second-best results in all 10 languages according to the official Pearson's correlation regression evaluation measure. Our cross-lingual transfer learning approach explores the benefits of using a Head-First Fine-Tuning method (HeFiT) that first updates only the regression head parameters and then also updates the pre-trained transformer encoder parameters at a reduced learning rate. Additionally, we study the impact of using a small set of automatically generated examples (in our case, from ChatGPT) for low-resource settings where no human-labeled data is available. Our study shows that HeFiT stabilizes training and consistently improves results for pre-trained models that lack domain adaptation to tweets. Our study also shows a noticeable performance increase in cross-lingual learning when synthetic data is used, confirming the usefulness of current text generation systems to improve zero-shot baseline results. Finally, we examine how possible inconsistencies in the annotated data contribute to cross-lingual interference issues.
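
The two-phase HeFiT procedure described in the abstract can be summarized in a short sketch. The following Python snippet is a minimal illustration assuming a Hugging Face transformers regression setup; the backbone (xlm-roberta-base), learning rates, and the elided training loops are illustrative assumptions, not the authors' reported configuration.

	# Minimal sketch of Head-First Fine-Tuning (HeFiT) as described in the abstract.
	# Backbone, learning rates, and epoch counts are illustrative assumptions.
	import torch
	from transformers import AutoModelForSequenceClassification

	model = AutoModelForSequenceClassification.from_pretrained(
	    "xlm-roberta-base",  # assumed multilingual encoder; the paper's backbone may differ
	    num_labels=1,        # single scalar output for intimacy regression
	    problem_type="regression",
	)

	head_params = [p for n, p in model.named_parameters() if "classifier" in n]
	encoder_params = [p for n, p in model.named_parameters() if "classifier" not in n]

	# Phase 1: update only the regression head parameters; keep the encoder frozen.
	for p in encoder_params:
	    p.requires_grad = False
	head_optimizer = torch.optim.AdamW(head_params, lr=1e-3)  # assumed head-only LR
	# ... run a few epochs of standard training with head_optimizer ...

	# Phase 2: unfreeze the encoder and continue training all parameters,
	# with the pre-trained encoder at a reduced learning rate relative to the head.
	for p in encoder_params:
	    p.requires_grad = True
	full_optimizer = torch.optim.AdamW([
	    {"params": head_params, "lr": 1e-4},
	    {"params": encoder_params, "lr": 2e-5},  # reduced LR for pre-trained weights
	])
	# ... continue training with full_optimizer ...

The design intuition, per the abstract, is that training the randomly initialized head first avoids large, destabilizing gradients flowing into the pre-trained encoder at the start of fine-tuning.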
@article{andrianos_michail_uzh_clyp_2023,
	title = {{UZH}\_CLyp at {SemEval}-2023 {Task} 9: {Head}-{First} {Fine}-{Tuning} and {ChatGPT} {Data} {Generation} for {Cross}-{Lingual} {Learning} in {Tweet} {Intimacy} {Prediction}},
	url = {https://www.proquest.com/working-papers/uzh-clyp-at-semeval-2023-task-9-head-first-fine/docview/2782019528/se-2},
	abstract = {This paper describes the submission of UZH\_CLyp for the SemEval 2023 Task 9 "Multilingual Tweet Intimacy Analysis". We achieved second-best results in all 10 languages according to the official Pearson's correlation regression evaluation measure. Our cross-lingual transfer learning approach explores the benefits of using a Head-First Fine-Tuning method (HeFiT) that first updates only the regression head parameters and then also updates the pre-trained transformer encoder parameters at a reduced learning rate. Additionally, we study the impact of using a small set of automatically generated examples (in our case, from ChatGPT) for low-resource settings where no human-labeled data is available. Our study shows that HeFiT stabilizes training and consistently improves results for pre-trained models that lack domain adaptation to tweets. Our study also shows a noticeable performance increase in cross-lingual learning when synthetic data is used, confirming the usefulness of current text generation systems to improve zero-shot baseline results. Finally, we examine how possible inconsistencies in the annotated data contribute to cross-lingual interference issues.},
	language = {English},
	journal = {arXiv.org},
	author = {Michail, Andrianos and Konstantinou, Stefanos and Clematide, Simon},
	month = mar,
	year = {2023},
	keywords = {Artificial intelligence, Chatbots, Learning, Computation and Language, Coders, Parameters},
	annote = {Copyright - © 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. Last updated - 2023-03-04},
}
