Challenges in Context-Aware Neural Machine Translation.
Jin, L., He, J., May, J., & Ma, X.
In Bouamor, H., Pino, J., & Bali, K., editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 15246–15263, Singapore, December 2023. Association for Computational Linguistics.
@inproceedings{jin-etal-2023-challenges,
    title = "Challenges in Context-Aware Neural Machine Translation",
    author = "Jin, Linghao and
      He, Jacqueline and
      May, Jonathan and
      Ma, Xuezhe",
    editor = "Bouamor, Houda and
      Pino, Juan and
      Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.943",
    doi = "10.18653/v1/2023.emnlp-main.943",
    pages = "15246--15263",
    abstract = "Context-aware neural machine translation, a paradigm that involves leveraging information beyond sentence-level context to resolve inter-sentential discourse dependencies and improve document-level translation quality, has given rise to a number of recent techniques. However, despite well-reasoned intuitions, most context-aware translation models show only modest improvements over sentence-level systems. In this work, we investigate and present several core challenges that impede progress within the field, relating to discourse phenomena, context usage, model architectures, and document-level evaluation. To address these problems, we propose a more realistic setting for document-level translation, called paragraph-to-paragraph (PARA2PARA) translation, and collect a new dataset of Chinese-English novels to promote future research.",
}

Continual Dialogue State Tracking via Example-Guided Question Answering.
Cho, H., Madotto, A., Lin, Z., Chandu, K., Kottur, S., Xu, J., May, J., & Sankar, C.
In Bouamor, H., Pino, J., & Bali, K., editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 3873–3886, Singapore, December 2023. Association for Computational Linguistics.
@inproceedings{cho-etal-2023-continual,
    title = "Continual Dialogue State Tracking via Example-Guided Question Answering",
    author = "Cho, Hyundong and
      Madotto, Andrea and
      Lin, Zhaojiang and
      Chandu, Khyathi and
      Kottur, Satwik and
      Xu, Jing and
      May, Jonathan and
      Sankar, Chinnadhurai",
    editor = "Bouamor, Houda and
      Pino, Juan and
      Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.235",
    doi = "10.18653/v1/2023.emnlp-main.235",
    pages = "3873--3886",
    abstract = "Dialogue systems are frequently updated to accommodate new services, but naively updating them by continually training with data for new services results in diminishing performance on previously learnt services. Motivated by the insight that dialogue state tracking (DST), a crucial component of dialogue systems that estimates the user{'}s goal as a conversation proceeds, is a simple natural language understanding task, we propose reformulating it as a bundle of granular example-guided question answering tasks to minimize the task shift between services and thus benefit continual learning. Our approach alleviates service-specific memorization and teaches a model to contextualize the given question and example to extract the necessary information from the conversation. We find that a model with just 60M parameters can achieve a significant boost by learning to learn from in-context examples retrieved by a retriever trained to identify turns with similar dialogue state changes. Combining our method with dialogue-level memory replay, our approach attains state of the art performance on DST continual learning metrics without relying on any complex regularization or parameter expansion methods.",
}

Analyzing Norm Violations in Live-Stream Chat.
Moon, J., Lee, D., Cho, H., Jin, W., Park, C., Kim, M., May, J., Pujara, J., & Park, S.
In Bouamor, H., Pino, J., & Bali, K., editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 852–868, Singapore, December 2023. Association for Computational Linguistics.
@inproceedings{moon-etal-2023-analyzing,
    title = "Analyzing Norm Violations in Live-Stream Chat",
    author = "Moon, Jihyung and
      Lee, Dong-Ho and
      Cho, Hyundong and
      Jin, Woojeong and
      Park, Chan and
      Kim, Minwoo and
      May, Jonathan and
      Pujara, Jay and
      Park, Sungjoon",
    editor = "Bouamor, Houda and
      Pino, Juan and
      Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.55",
    doi = "10.18653/v1/2023.emnlp-main.55",
    pages = "852--868",
    abstract = "Toxic language, such as hate speech, can deter users from participating in online communities and enjoying popular platforms. Previous approaches to detecting toxic language and norm violations have been primarily concerned with conversations from online forums and social media, such as Reddit and Twitter. These approaches are less effective when applied to conversations on live-streaming platforms, such as Twitch and YouTube Live, as each comment is only visible for a limited time and lacks a thread structure that establishes its relationship with other comments. In this work, we share the first NLP study dedicated to detecting norm violations in conversations on live-streaming platforms. We define norm violation categories in live-stream chats and annotate 4,583 moderated comments from Twitch. We articulate several facets of live-stream data that differ from other forums, and demonstrate that existing models perform poorly in this setting. By conducting a user study, we identify the informational context humans use in live-stream moderation, and train models leveraging context to identify norm violations. Our results show that appropriate contextual information can boost moderation performance by 35{\%}.",
}

Identifying Informational Sources in News Articles.
Spangher, A., Peng, N., Ferrara, E., & May, J.
In Bouamor, H., Pino, J., & Bali, K., editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 3626–3639, Singapore, December 2023. Association for Computational Linguistics.
@inproceedings{spangher-etal-2023-identifying,
    title = "Identifying Informational Sources in News Articles",
    author = "Spangher, Alexander and
      Peng, Nanyun and
      Ferrara, Emilio and
      May, Jonathan",
    editor = "Bouamor, Houda and
      Pino, Juan and
      Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.221",
    doi = "10.18653/v1/2023.emnlp-main.221",
    pages = "3626--3639",
    abstract = "News articles are driven by the informational sources journalists use in reporting. Modeling when, how and why sources get used together in stories can help us better understand the information we consume and even help journalists with the task of producing it. In this work, we take steps toward this goal by constructing the largest and widest-ranging annotated dataset, to date, of informational sources used in news writing. We first show that our dataset can be used to train high-performing models for information detection and source attribution. Then, we introduce a novel task, source prediction, to study the compositionality of sources in news articles {--} i.e. how they are chosen to complement each other. We show good modeling performance on this task, indicating that there is a pattern to the way different sources are used \textit{together} in news storytelling. This insight opens the door for a focus on sources in narrative science (i.e. planning-based language generation) and computational journalism (i.e. a source-recommendation system to aid journalists writing stories). All data and model code can be found at https://github.com/alex2awesome/source-exploration.",
}

Feedback Loops and Complex Dynamics of Harmful Speech in Online Discussions.
Chang, R., May, J., & Lerman, K.
In Social, Cultural, and Behavioral Modeling: 16th International Conference, SBP-BRiMS 2023, Pittsburgh, PA, USA, September 20–22, 2023, Proceedings, pages 85–94, Berlin, Heidelberg, 2023. Springer-Verlag.
@inproceedings{10.1007/978-3-031-43129-6_9,
    author = {Chang, Rong-Ching and May, Jonathan and Lerman, Kristina},
    title = {Feedback Loops and Complex Dynamics of Harmful Speech in Online Discussions},
    year = {2023},
    isbn = {978-3-031-43128-9},
    publisher = {Springer-Verlag},
    address = {Berlin, Heidelberg},
    url = {https://doi.org/10.1007/978-3-031-43129-6_9},
    doi = {10.1007/978-3-031-43129-6_9},
    abstract = {Harmful and toxic speech contribute to an unwelcoming online environment that suppresses participation and conversation. Efforts have focused on detecting and mitigating harmful speech; however, the mechanisms by which toxicity degrades online discussions are not well understood. This paper makes two contributions. First, to comprehensively model harmful comments, we introduce a multilingual misogyny and sexist speech detection model (). Second, we model the complex dynamics of online discussions as feedback loops in which harmful comments lead to negative emotions which prompt even more harmful comments. To quantify the feedback loops, we use a combination of mutual Granger causality and regression to analyze discussions on two political forums on Reddit: the moderated political forum r/Politics and the moderated neutral political forum r/NeutralPolitics. Our results suggest that harmful comments and negative emotions create self-reinforcing feedback loops in forums. Contrarily, moderation with neutral discussion appears to tip interactions into self-extinguishing feedback loops that reduce harmful speech and negative emotions. Our study sheds more light on the complex dynamics of harmful speech and the role of moderation and neutral discussion in mitigating these dynamics.},
    booktitle = {Social, Cultural, and Behavioral Modeling: 16th International Conference, SBP-BRiMS 2023, Pittsburgh, PA, USA, September 20–22, 2023, Proceedings},
    pages = {85--94},
    numpages = {10},
    keywords = {Feedback Loop, Moderation, Granger Causality},
    location = {Pittsburgh, PA, USA}
}

First Steps Towards a Source Recommendation Engine: Investigating How Sources Are Used in News Articles.
In Proc. The Joint Computation + Journalism European Data & Computational Journalism Conference, Zurich, Switzerland, June 2023.
@Proceedings{spangher23.djc,
    title = {First Steps Towards a Source Recommendation Engine: Investigating How Sources Are Used in News Articles},
    year = 2023,
    url = {https://www.datajconf.com/papers/CJ_DataJConf_2023_paper_74.pdf},
    booktitle = {Proc. The Joint Computation + Journalism European Data \& Computational Journalism Conference},
    address = {Zurich, Switzerland},
    month = {June}
}

Blend and Match: Distilling Semantic Search Models with Different Inductive Biases and Model Architectures.
Bonab, H., Joshi, A., Bhatia, R., Gandhi, A., Huddar, V., Naik, J., Al-Darabsah, M., Teo, C. H., May, J., Agarwal, T., & Petricek, V.
In Companion Proceedings of the ACM Web Conference 2023 (WWW '23 Companion), pages 869–877, New York, NY, USA, 2023. Association for Computing Machinery.
@inproceedings{10.1145/3543873.3587629,
    author = {Bonab, Hamed and Joshi, Ashutosh and Bhatia, Ravi and Gandhi, Ankit and Huddar, Vijay and Naik, Juhi and Al-Darabsah, Mutasem and Teo, Choon Hui and May, Jonathan and Agarwal, Tarun and Petricek, Vaclav},
    title = {Blend and Match: Distilling Semantic Search Models with Different Inductive Biases and Model Architectures},
    year = {2023},
    isbn = {9781450394192},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3543873.3587629},
    doi = {10.1145/3543873.3587629},
    abstract = {Commercial search engines use different semantic models to augment lexical matches. These models provide candidate items for a user’s query from a target space of millions to billions of items. Models with different inductive biases provide relatively different predictions, making it desirable to launch multiple semantic models in production. However, latency and resource constraints make simultaneously deploying multiple models impractical. In this paper, we introduce a distillation approach, called Blend and Match (BM), to unify two different semantic search models into a single model. We use a Bi-encoder semantic matching model as our primary model and propose a novel loss function to incorporate eXtreme Multi-label Classification (XMC) predictions as the secondary model. Our experiments conducted on two large-scale datasets, collected from a popular e-commerce store, show that our proposed approach significantly improves the recall of the primary Bi-encoder model by 11\% to 17\% with a minimal loss in precision. We show that traditional knowledge distillation approaches result in a sub-optimal performance for our problem setting, and our BM approach yields comparable rankings with strong Rank Fusion (RF) methods used only if one could deploy multiple models.},
    booktitle = {Companion Proceedings of the ACM Web Conference 2023},
    pages = {869--877},
    numpages = {9},
    keywords = {Semantic Search, Ranking Distillation, Product Search, Model Blending},
    location = {Austin, TX, USA},
    series = {WWW '23 Companion}
}

Bridging the Gap between Native Text and Translated Text through Adversarial Learning: A Case Study on Cross-Lingual Event Extraction.
Yu, P., May, J., & Ji, H.
In Findings of the Association for Computational Linguistics: EACL 2023, pages 754–769, Dubrovnik, Croatia, May 2023. Association for Computational Linguistics.
@inproceedings{yu-etal-2023-bridging,
    title = "Bridging the Gap between Native Text and Translated Text through Adversarial Learning: A Case Study on Cross-Lingual Event Extraction",
    author = "Yu, Pengfei and
      May, Jonathan and
      Ji, Heng",
    booktitle = "Findings of the Association for Computational Linguistics: EACL 2023",
    month = may,
    year = "2023",
    address = "Dubrovnik, Croatia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-eacl.57",
    doi = "10.18653/v1/2023.findings-eacl.57",
    pages = "754--769",
    abstract = "Recent research in cross-lingual learning has found that combining large-scale pretrained multilingual language models with machine translation can yield good performance. We explore this idea for cross-lingual event extraction with a new model architecture that jointly encodes a source language input sentence with its translation to the target language during training, and takes a target language sentence with its translation back to the source language as input during evaluation. However, we observe a significant representational gap between the native source language texts during training and the texts translated into source language during evaluation, as well as the texts translated into target language during training and the native target language texts during evaluation. This representational gap undermines the effectiveness of cross-lingual transfer learning for event extraction with machine-translated data. In order to mitigate this problem, we propose an adversarial training framework that encourages the language model to produce more similar representations for the translated text and the native text. To be specific, we train the language model such that its hidden representations are able to fool a jointly trained discriminator that distinguishes translated texts{'} representations from native texts{'} representations. We conduct experiments on cross-lingual event extraction across three languages. Results demonstrate that our proposed adversarial training can effectively incorporate machine translation to improve event extraction, while simply adding machine-translated data yields unstable performance due to the representational gap.",
}

RECAP: Retrieval-Enhanced Context-Aware Prefix Encoder for Personalized Dialogue Response Generation.
Liu, S., Cho, H., Freedman, M., Ma, X., & May, J.
In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8404–8419, Toronto, Canada, July 2023. Association for Computational Linguistics.
@inproceedings{liu-etal-2023-recap,
    title = "{RECAP}: Retrieval-Enhanced Context-Aware Prefix Encoder for Personalized Dialogue Response Generation",
    author = "Liu, Shuai and
      Cho, Hyundong and
      Freedman, Marjorie and
      Ma, Xuezhe and
      May, Jonathan",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-long.468",
    doi = "10.18653/v1/2023.acl-long.468",
    pages = "8404--8419",
    abstract = "Endowing chatbots with a consistent persona is essential to an engaging conversation, yet it remains an unresolved challenge. In this work, we propose a new retrieval-enhanced approach for personalized response generation. Specifically, we design a hierarchical transformer retriever trained on dialogue domain data to perform personalized retrieval and a context-aware prefix encoder that fuses the retrieved information to the decoder more effectively. Extensive experiments on a real-world dataset demonstrate the effectiveness of our model at generating more fluent and personalized responses. We quantitatively evaluate our model{'}s performance under a suite of human and automatic metrics and find it to be superior compared to state-of-the-art baselines on English Reddit conversations.",
}

Anger Breeds Controversy: Analyzing Controversy and Emotions on Reddit.
Chen, K., He, Z., Chang, R., May, J., & Lerman, K.
In Thomson, R., Al-khateeb, S., Burger, A., Park, P., & A. Pyke, A., editors, Social, Cultural, and Behavioral Modeling, pages 44–53, Cham, 2023. Springer Nature Switzerland.
@InProceedings{10.1007/978-3-031-43129-6_5,
    author = "Chen, Kai and He, Zihao and Chang, Rong-Ching and May, Jonathan and Lerman, Kristina",
    editor = "Thomson, Robert and Al-khateeb, Samer and Burger, Annetta and Park, Patrick and A. Pyke, Aryn",
    title = "Anger Breeds Controversy: Analyzing Controversy and Emotions on Reddit",
    booktitle = "Social, Cultural, and Behavioral Modeling",
    year = "2023",
    publisher = "Springer Nature Switzerland",
    address = "Cham",
    pages = "44--53",
    url = {https://arxiv.org/abs/2212.00339},
    abstract = "Emotions play an important role in interpersonal interactions and social conflict, yet their function in the development of controversy and disagreement in online conversations has not been fully explored. To address this gap, we study controversy on Reddit, a popular network of online discussion forums. We collect discussions from various topical forums and use emotion detection to recognize a range of emotions from text, including anger, fear, joy, admiration, etc. (Code and dataset are publicly available at https://github.com/ChenK7166/controversy-emotion). We find controversial comments express more anger and less admiration, joy, and optimism than non-controversial comments. Moreover, controversial comments affect emotions of downstream comments, resulting in a long-term increase in anger and a decrease in positive emotions. The magnitude and direction of emotional change differ by forum. Finally, we show that emotions help better predict which comments will become controversial. Understanding the dynamics of emotions in online discussions can help communities to manage conversations better.",
    isbn = "978-3-031-43129-6"
}

Cross-lingual Continual Learning.
M'hamdi, M., Ren, X., & May, J.
In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3908–3943, Toronto, Canada, July 2023. Association for Computational Linguistics
@inproceedings{mhamdi-etal-2023-cross,
  title = "Cross-lingual Continual Learning",
  author = "M{'}hamdi, Meryem and Ren, Xiang and May, Jonathan",
  booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
  month = jul,
  year = "2023",
  address = "Toronto, Canada",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2023.acl-long.217",
  doi = "10.18653/v1/2023.acl-long.217",
  pages = "3908--3943",
  abstract = "The longstanding goal of multi-lingual learning has been to develop a universal cross-lingual model that can withstand the changes in multi-lingual data distributions. There has been a large amount of work to adapt such multi-lingual models to unseen target languages. However, the majority of work in this direction focuses on the standard one-hop transfer learning pipeline from source to target languages, whereas in realistic scenarios, new languages can be incorporated at any time in a sequential manner. In this paper, we present a principled Cross-lingual Continual Learning (CCL) evaluation paradigm, where we analyze different categories of approaches used to continually adapt to emerging data from different languages. We provide insights into what makes multilingual sequential learning particularly challenging. To surmount such challenges, we benchmark a representative set of cross-lingual continual learning algorithms and analyze their knowledge preservation, accumulation, and generalization capabilities compared to baselines on carefully curated datastreams. The implications of this analysis include a recipe for how to measure and balance different cross-lingual continual learning desiderata, which go beyond conventional transfer learning.",
}