Dealing with Metonymic Readings of Named Entities

Dealing with Metonymic Readings of Named Entities. Poibeau, T. ArXiv, 2006.

Thierry Poibeau (thierry.poibeau@lipn.univ-paris13.fr)
Laboratoire d’Informatique de Paris-Nord, Universite Paris 13 and UMR CNRS 7030
99, avenue Jean-Baptiste Clement, 93430 Villetaneuse, France

Abstract

The aim of this paper is to propose a method for tagging named entities (NE), using natural language processing techniques. Beyond their literal meaning, named entities are frequently subject to metonymy. We show the limits of current NE type hierarchies and detail a new proposal aiming at dynamically capturing the semantics of entities in context. This model can analyze complex linguistic phenomena like metonymy, which are known to be difficult for natural language processing but crucial for most applications. We present an implementation and some tests using the French ESTER corpus and give significant results.

Keywords: Metonymy; Named Entities; Categorization; Semantics; Natural Language Processing.

Introduction

Categorization has been a key question in science and philosophy at least since Aristotle. Many research efforts have been devoted to this issue in linguistics, since text understanding and, more generally, reasoning or inferring largely require a precise identification of the objects referred to in discourse. Lexical semantics has attracted the major part of the research related to these issues in linguistics in the last few years. What is the meaning of an expression? How does it change in context? These are still open questions.

Many research projects have addressed the issue of proper name identification in newspaper texts, especially the Message Understanding Conferences (MUC-6, 1995). In these conferences, the first task is to identify named entities (NE), i.e. proper names and temporal and numerical expressions. This task is generally accomplished according to a pre-defined hierarchy of entity categories. The categorization process relies on the assumption that NEs directly refer to external objects and can thus be easily categorized.

In this paper, we show that this assumption is an over-simplification of the problem: many entities are ambiguous and inter-annotator agreement is dramatically low for some categories. We assume that even if NE tagging achieves good performance (a combined precision and recall above .90 is frequent on journalistic corpora), NEs are intrinsically ambiguous and cause numerous categorization problems. We propose a new dynamic representation framework in which it is possible to specify the meaning of an NE from its context.

In the paper, we report previous work on NE tagging. We then show different cases of polysemous entities in context and some considerations about their referential status. We detail our knowledge representation framework, which allows the semantics of NE sequences to be computed dynamically from their immediate context. Lastly, we present an implementation and some experiments using the French ESTER corpus and showing significant improvements.

Names, categorization and reference

There is some consensus that categorization and reference of linguistic expressions are related to discrete-continuous space interplay. Categorization is the ability to select parts of the environment and classify them as instances of concepts. The process of attention is then the ability to specifically focus on a part of the observation space that is relevant in a given context (Cruse and Croft, 2004). Selected parts of the observation space are said to be salient.

Two important linguistic phenomena are based on a shift in the meaning profile of a word: the highlighting of its different facets and the phenomenon of metonymy (Nunberg, 1995) (Fass, 1997). A metonymy denotes a different concept than the “literal” denotation of a word, whereas the notion of facet only means focusing on a specific aspect of a concept (different parts of the meaning space of a word or “different ways of looking at the same thing”). However, both phenomena correspond to a semantic shift in interpretation (“profile shift”) that appears to be a function of salience (Cruse and Croft, 2004).

In this section, we examine different theories concerning this topic, especially the model proposed by Pustejovsky (1995). We then discuss the case of NEs and examine previous work dealing with related questions using Natural Language Processing techniques.

Pustejovsky’s Generative lexicon (1995)

Pustejovsky developed an interesting model for sense selection in context (1995). His proposal, the Generative Lexicon, is based on Davidson's logic model and a strictly typed theory developed in Pustejovsky (1995) and more recently in Asher and Pustejovsky (1999). Words like book are called dot objects: “dot” is a function that encodes two facets of a given word. A book is by default a physical object, but some verbs like read or enjoy might activate specific features that coerce the initial type: book then no longer refers to a physical object but to its content (through its “telic role”, encoded in a complex structure called the qualia structure). Moreover, complex operations related to the same process explain why John enjoyed his book is interpreted as an ellipsis and implies reading a book.
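
The facet selection and type coercion described for book can be illustrated with a small Python sketch. This is only a toy reading of the idea, not Pustejovsky's formalism and not the implementation reported in the paper: the names LexicalEntry, Reading, VERB_EXPECTS and interpret, the facet labels, and the verb table are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class LexicalEntry:
    """Toy stand-in for a dot object: a word with two facets and a telic role."""
    lemma: str
    facets: Tuple[str, ...]   # the facets joined by the "dot"
    default_facet: str        # reading used when the context selects nothing
    telic: str                # typical activity, taken from the qualia structure


@dataclass(frozen=True)
class Reading:
    """Result of interpreting a noun in the context of a verb."""
    lemma: str
    facet: str
    implied_event: Optional[str] = None  # event recovered by coercion, if any


# Hypothetical selection preferences: the semantic type each verb expects.
VERB_EXPECTS = {
    "carry": "physical_object",
    "read": "information",
    "enjoy": "event",
}

BOOK = LexicalEntry(
    lemma="book",
    facets=("physical_object", "information"),
    default_facet="physical_object",
    telic="read",
)


def interpret(verb: str, noun: LexicalEntry) -> Reading:
    """Pick a facet of `noun` from the type expected by `verb`."""
    expected = VERB_EXPECTS.get(verb)

    # Facet selection: the verb directly asks for one of the noun's facets.
    if expected in noun.facets:
        return Reading(noun.lemma, expected)

    # Coercion: the verb wants an event, so recover the telic event
    # ("enjoy the book" is read as "enjoy reading the book") and use
    # the facet that this event selects.
    if expected == "event":
        facet = VERB_EXPECTS.get(noun.telic, noun.default_facet)
        if facet not in noun.facets:
            facet = noun.default_facet
        return Reading(noun.lemma, facet, implied_event=noun.telic)

    # No information from the verb: fall back to the default facet.
    return Reading(noun.lemma, noun.default_facet)
```

In this sketch, interpret("carry", BOOK) keeps the physical-object reading, interpret("read", BOOK) selects the content facet, and interpret("enjoy", BOOK) additionally records the implied reading event, which mirrors the ellipsis analysis of John enjoyed his book.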

@article{poibeau_dealing_2006,
title = {Dealing with {Metonymic} {Readings} of {Named} {Entities}},
url = {https://arxiv.org/pdf/cs/0607052},
abstract = {The aim of this paper is to propose a method for tagging named entities (NE), using natural language processing techniques. Beyond their literal meaning, named entities are frequently subject to metonymy. We show the limits of current NE type hierarchies and detail a new proposal aiming at dynamically capturing the semantics of entities in context. This model can analyze complex linguistic phenomena like metonymy, which are known to be difficult for natural language processing but crucial for most applications. We present an implementation and some tests using the French ESTER corpus and give significant results.},
urldate = {2025-01-26},
journal = {ArXiv},
author = {Poibeau, Thierry},
year = {2006},
pages = {1--6},
}

