Hambardzumyan, K., Khachatrian, H., & May, J. WARP: Word-level Adversarial ReProgramming. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 4921–4933, Online, August 2021. Association for Computational Linguistics.

Abstract: Transfer learning from pretrained language models recently became the dominant approach for solving many NLP tasks. A common approach to transfer learning for multiple tasks that maximize parameter sharing trains one or more task-specific layers on top of the language model. In this paper, we present an alternative approach based on adversarial reprogramming, which extends earlier work on automatic prompt generation. Adversarial reprogramming attempts to learn task-specific word embeddings that, when concatenated to the input text, instruct the language model to solve the specified task. Using up to 25K trainable parameters per task, this approach outperforms all existing methods with up to 25M trainable parameters on the public leaderboard of the GLUE benchmark. Our method, initialized with task-specific human-readable prompts, also works in a few-shot setting, outperforming GPT-3 on two SuperGLUE tasks with just 32 training samples.
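The core idea in the abstract, learning a small set of task-specific prompt embeddings that are concatenated to the input of a frozen model, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's implementation: a frozen linear scorer stands in for the pretrained language model, and all names (`token_embed`, `w_frozen`, `train_prompt`) are hypothetical.

```python
import numpy as np

# Toy sketch of WARP-style prompt tuning. A frozen linear scorer replaces
# the real pretrained transformer; only the prompt embeddings are trained.
rng = np.random.default_rng(0)

D = 8          # embedding dimension
N_PROMPT = 2   # trainable prompt embeddings per task (N_PROMPT * D params)

# Frozen components: a token embedding and a scoring head. Neither is
# updated during training, mirroring how WARP keeps the language model fixed.
token_embed = rng.normal(size=D)
w_frozen = np.linspace(-1.0, 1.0, D)

def forward(prompt, token):
    """Concatenate the trainable prompt rows to the frozen token embedding,
    mean-pool, and score with the frozen head (sigmoid output)."""
    seq = np.vstack([prompt, token[None, :]])
    logit = seq.mean(axis=0) @ w_frozen
    return 1.0 / (1.0 + np.exp(-logit))

def train_prompt(target, steps=2000, lr=0.5):
    """Learn a task-specific prompt that steers the frozen model toward
    `target` on the same input token."""
    prompt = 0.1 * rng.normal(size=(N_PROMPT, D))
    seq_len = N_PROMPT + 1
    for _ in range(steps):
        p = forward(prompt, token_embed)
        # Binary cross-entropy gradient w.r.t. each prompt row: each row
        # contributes 1/seq_len to the mean-pooled vector.
        grad_row = (p - target) * w_frozen / seq_len
        prompt -= lr * grad_row  # same gradient broadcast to every row
    return prompt

# Two "tasks" reprogram the identical frozen model toward opposite answers,
# each using only N_PROMPT * D = 16 trainable parameters.
prompt_a = train_prompt(target=1.0)
prompt_b = train_prompt(target=0.0)
```

The design choice mirrors the paper's parameter budget: everything inside the model stays frozen, so the per-task footprint is just the prompt matrix, which is why WARP needs only up to 25K parameters per GLUE task.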
@inproceedings{hambardzumyan-etal-2021-warp,
title = "{WARP}: {W}ord-level {A}dversarial {R}e{P}rogramming",
author = "Hambardzumyan, Karen and
Khachatrian, Hrant and
May, Jonathan",
booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
month = aug,
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.acl-long.381",
pages = "4921--4933",
abstract = "Transfer learning from pretrained language models recently became the dominant approach for solving many NLP tasks. A common approach to transfer learning for multiple tasks that maximize parameter sharing trains one or more task-specific layers on top of the language model. In this paper, we present an alternative approach based on adversarial reprogramming, which extends earlier work on automatic prompt generation. Adversarial reprogramming attempts to learn task-specific word embeddings that, when concatenated to the input text, instruct the language model to solve the specified task. Using up to 25K trainable parameters per task, this approach outperforms all existing methods with up to 25M trainable parameters on the public leaderboard of the GLUE benchmark. Our method, initialized with task-specific human-readable prompts, also works in a few-shot setting, outperforming GPT-3 on two SuperGLUE tasks with just 32 training samples.",
}