Mixed-depth representations for natural language text. Hirst, G. & Ryan, M. In Jacobs, P. S. (editor), Text-based intelligent systems, pages 59–82. Lawrence Erlbaum Associates, Hillsdale, NJ, 1992.

Intelligent text-based systems will vary as to the degree of difficulty of the texts they deal with. Some may have a relatively easy time with texts for which fairly superficial processes will get useful results, such as, say, The New York Times or Julia Child's Favorite Recipes. But many systems will have to work on more difficult texts. Often, it is the complexity of the text that makes the system desirable in the first place. It is for such systems that we need to think about making the deeper methods that are already studied in AI and computational linguistics more robust and suitable for processing long texts without interactive human help. The dilemma is that on one hand, we have the limitations of raw text databases and superficial processing methods; on the other we have the difficulty of deeper methods and conceptual representations. Our proposal here is to have the best of both, and accordingly we develop the notion of a heterogeneous, or mixed, type of representation.
In our model, a text base permits two parallel representations of meaning: the text itself, for presentation to human users, and a conceptual encoding of the text, for use by intelligent components of the system. The two representations are stored in parallel; that is, there are links between each unit of text (a sentence or paragraph in most cases) and the corresponding conceptual encoding. This encoding could be created en masse when the text was entered into the system. But if it is expected that only a small fraction of the text base will ever be looked at by processes that need the conceptual representations, then the encoding could be performed on each part of the text as necessary for inference and understanding to answer some particular request. The results could then be stored so that they don't have to be redone if the same area of the text is searched again. Thus, a text would gradually grow its encoding as it continues to be used. (And the work will never be done for texts or parts of texts that are never used.)
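The encode-on-demand scheme described here amounts to memoization over text units. Below is a minimal sketch in Python, not the authors' implementation: parse_to_concepts() is a hypothetical stand-in for whatever deep analyser a real system would use, and all names are my own.

    # Hypothetical sketch of lazy conceptual encoding with caching;
    # parse_to_concepts() is a stand-in for a real deep analyser.

    def parse_to_concepts(text):
        # Placeholder analyser: returns a trivial shallow encoding
        # so the sketch runs as written.
        return {"surface": text}

    class TextBase:
        """Text units plus lazily built, cached conceptual encodings."""

        def __init__(self, units):
            self.units = list(units)   # sentences or paragraphs, shown to human users
            self.encodings = {}        # unit index -> cached conceptual encoding

        def encoding_for(self, i):
            # Encode a unit only when some inference process first needs it,
            # then keep the result so the same text is never re-analysed.
            if i not in self.encodings:
                self.encodings[i] = parse_to_concepts(self.units[i])
            return self.encodings[i]

    tb = TextBase(["Mary gave the book to John.", "He thanked her."])
    tb.encoding_for(0)   # analysed and cached on first request
    tb.encoding_for(0)   # second request hits the cache; no re-analysis

The cache also captures the point in the parenthetical remark: units that no process ever asks about are simply never encoded.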
So far, this is straightforward. But we can go one step further. The encoding itself may be deep or shallow at different places, depending on what happened to be necessary at the time it was generated—or on what was possible. Or, to put it a different way, we can view natural-language text and AI-style knowledge representations as two ends of a spectrum.
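To make the spectrum concrete, here is a toy mixed-depth encoding of my own devising (not a structure from the paper): each unit's entry can sit anywhere between verbatim text at the shallow end and a frame-like analysis at the deep end.

    # Toy mixed-depth encoding: deep where analysis succeeded or was
    # needed, shallow (the raw text itself) everywhere else.
    mixed_encoding = [
        ("deep",    {"pred": "give", "agent": "Mary",
                     "theme": "book", "recipient": "John"}),
        ("shallow", "He thanked her."),   # unanalysed text kept verbatim
    ]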
@InBook{ hirst92,
author = {Graeme Hirst and Mark Ryan},
chapter = {Mixed-depth representations for natural language text},
editor = {Paul S. Jacobs},
title = {Text-based intelligent systems},
address = {Hillsdale, NJ},
publisher = {Lawrence Erlbaum Associates},
year = {1992},
pages = {59--82},
abstract = {Intelligent text-based systems will vary as to the degree
            of difficulty of the texts they deal with. Some may have a
            relatively easy time with texts for which fairly
            superficial processes will get useful results, such as,
            say, \emph{The New York Times} or \emph{Julia Child's
            Favorite Recipes}. But many systems will have to work on
            more difficult texts. Often, it is the complexity of the
            text that makes the system desirable in the first place.
            It is for such systems that we need to think about making
            the deeper methods that are already studied in AI and
            computational linguistics more robust and suitable for
            processing long texts without interactive human help. The
            dilemma is that on one hand, we have the limitations of
            raw text databases and superficial processing methods; on
            the other we have the difficulty of deeper methods and
            conceptual representations. Our proposal here is to have
            the best of both, and accordingly we develop the notion of
            a heterogeneous, or mixed, type of representation.

            In our model, a text base permits two parallel
            representations of meaning: the text itself, for
            presentation to human users, and a \emph{conceptual
            encoding} of the text, for use by intelligent components
            of the system. The two representations are stored in
            parallel; that is, there are links between each unit of
            text (a sentence or paragraph in most cases) and the
            corresponding conceptual encoding. This encoding could be
            created en masse when the text was entered into the
            system. But if it is expected that only a small fraction
            of the text base will ever be looked at by processes that
            need the conceptual representations, then the encoding
            could be performed on each part of the text as necessary
            for inference and understanding to answer some particular
            request. The results could then be stored so that they
            don't have to be redone if the same area of the text is
            searched again. Thus, a text would gradually \emph{grow}
            its encoding as it continues to be used. (And the work
            will never be done for texts or parts of texts that are
            never used.)

            So far, this is straightforward. But we can go one step
            further. The encoding itself may be deep or shallow at
            different places, depending on what happened to be
            necessary at the time it was generated---or on what was
            possible. Or, to put it a different way, we can view
            natural-language text and AI-style knowledge
            representations as two ends of a spectrum.},
download = {http://ftp.cs.toronto.edu/pub/gh/Hirst+Ryan-92.pdf}
}
{"_id":{"_str":"521afb58aa2f288d1f000b11"},"__v":2,"authorIDs":[],"author_short":["Hirst, G.","Ryan, M."],"bibbaseid":"hirst-ryan-textbasedintelligentsystems-1992","bibdata":{"bibtype":"inbook","type":"inbook","author":[{"firstnames":["Graeme"],"propositions":[],"lastnames":["Hirst"],"suffixes":[]},{"firstnames":["Mark"],"propositions":[],"lastnames":["Ryan"],"suffixes":[]}],"chapter":"Mixed-depth representations for natural language text","editor":[{"firstnames":["Paul","S."],"propositions":[],"lastnames":["Jacobs"],"suffixes":[]}],"title":"Text-based intelligent systems","address":"Hillsdale, NJ","publisher":"Lawrence Erlbaum Associates","year":"1992","pages":"59–82","abstract":"<P> Intelligent text-based systems will vary as to the degree of difficulty of the texts they deal with. Some may have a relatively easy time with texts for which fairly superficial processes will get useful results, such as, say, <I>The New York Times</I> or <I>Julia Child's Favorite Recipes</I>. But many systems will have to work on more difficult texts. Often, it is the complexity of the text that makes the system desirable in the first place. It is for such systems that we need to think about making the deeper methods that are already studied in AI and computational linguistics more robust and suitable for processing long texts without interactive human help. The dilemma is that on one hand, we have the limitations of raw text databases and superficial processing methods; on the other we have the difficulty of deeper methods and conceptual representations. Our proposal here is to have the best of both, and accordingly we develop the notion of a heterogeneous, or mixed, type of representation.</p> <P>In our model, a text base permits two parallel representations of meaning: the text itself, for presentation to human users, and a <I>conceptual encoding</I> of the text, for use by intelligent components of the system. The two representations are stored in parallel; that is, there are links between each unit of text (a sentence or paragraph in most cases) and the corresponding conceptual encoding. This encoding could be created en masse when the text was entered into the system. But if it is expected that only a small fraction of the text base will ever be looked at by processes that need the conceptual representations, then the encoding could be performed on each part of the text as necessary for inference and understanding to answer some particular request. The results could then be stored so that they don't have to be redone if the same area of the text is searched again. Thus, a text would gradually <I>grow</I> its encoding as it continues to be used. (And the work will never be done for texts or parts of texts that are never used.)</p> <p>So far, this is straightforward. But we can go one step further. The encoding itself may be deep or shallow at different places, depending on what happened to be necessary at the time it was generated—or on what was possible. Or, to put it a different way, we can view natural-language text and AI-style knowledge representations as two ends of a spectrum.</p>","download":"http://ftp.cs.toronto.edu/pub/gh/Hirst+Ryan-92.pdf","bibtex":"@InBook{\t hirst22,\n author\t= {Graeme Hirst and Mark Ryan},\n chapter\t= {Mixed-depth representations for natural language text},\n editor\t= {Paul S. 
Jacobs},\n title\t\t= {Text-based intelligent systems},\n address\t= {Hillsdale, NJ},\n publisher\t= {Lawrence Erlbaum Associates},\n year\t\t= {1992},\n pages\t\t= {59--82},\n abstract\t= {<P> Intelligent text-based systems will vary as to the\n\t\t degree of difficulty of the texts they deal with. Some may\n\t\t have a relatively easy time with texts for which fairly\n\t\t superficial processes will get useful results, such as,\n\t\t say, <I>The New York Times</I> or <I>Julia Child's Favorite\n\t\t Recipes</I>. But many systems will have to work on more\n\t\t difficult texts. Often, it is the complexity of the text\n\t\t that makes the system desirable in the first place. It is\n\t\t for such systems that we need to think about making the\n\t\t deeper methods that are already studied in AI and\n\t\t computational linguistics more robust and suitable for\n\t\t processing long texts without interactive human help. The\n\t\t dilemma is that on one hand, we have the limitations of raw\n\t\t text databases and superficial processing methods; on the\n\t\t other we have the difficulty of deeper methods and\n\t\t conceptual representations. Our proposal here is to have\n\t\t the best of both, and accordingly we develop the notion of\n\t\t a heterogeneous, or mixed, type of representation.</p>\n\t\t <P>In our model, a text base permits two parallel\n\t\t representations of meaning: the text itself, for\n\t\t presentation to human users, and a <I>conceptual\n\t\t encoding</I> of the text, for use by intelligent components\n\t\t of the system. The two representations are stored in\n\t\t parallel; that is, there are links between each unit of\n\t\t text (a sentence or paragraph in most cases) and the\n\t\t corresponding conceptual encoding. This encoding could be\n\t\t created en masse when the text was entered into the system.\n\t\t But if it is expected that only a small fraction of the\n\t\t text base will ever be looked at by processes that need the\n\t\t conceptual representations, then the encoding could be\n\t\t performed on each part of the text as necessary for\n\t\t inference and understanding to answer some particular\n\t\t request. The results could then be stored so that they\n\t\t don't have to be redone if the same area of the text is\n\t\t searched again. Thus, a text would gradually <I>grow</I>\n\t\t its encoding as it continues to be used. (And the work will\n\t\t never be done for texts or parts of texts that are never\n\t\t used.)</p> <p>So far, this is straightforward. But we can\n\t\t go one step further. The encoding itself may be deep or\n\t\t shallow at different places, depending on what happened to\n\t\t be necessary at the time it was generated---or on what was\n\t\t possible. Or, to put it a different way, we can view\n\t\t natural-language text and AI-style knowledge\n\t\t representations as two ends of a spectrum.</p>},\n download\t= {http://ftp.cs.toronto.edu/pub/gh/Hirst+Ryan-92.pdf}\n}\n\n","author_short":["Hirst, G.","Ryan, M."],"editor_short":["Jacobs, P. S."],"key":"hirst22","id":"hirst22","bibbaseid":"hirst-ryan-textbasedintelligentsystems-1992","role":"author","urls":{},"metadata":{"authorlinks":{}}},"bibtype":"inbook","biburl":"www.cs.toronto.edu/~fritz/tmp/compling.bib","downloads":0,"keywords":[],"search_terms":["text","based","intelligent","systems","hirst","ryan"],"title":"Text-based intelligent systems","title_words":["text","based","intelligent","systems"],"year":1992,"dataSources":["n8jB5BJxaeSmH6mtR","6b6A9kbkw4CsEGnRX"]}