The rhetorical parsing of unrestricted texts: A surface-based approach. Marcu, D. Computational Linguistics, 26(3):395–448, September 2000.
Coherent texts are not just simple sequences of clauses and sentences, but rather complex artifacts that have highly elaborate rhetorical structure. This paper explores the extent to which well-formed rhetorical structures can be automatically derived by means of surface-form-based algorithms. These algorithms identify discourse usages of cue phrases and break sentences into clauses, hypothesize rhetorical relations that hold among textual units, and produce valid rhetorical structure trees for unrestricted natural language texts. The algorithms are empirically grounded in a corpus analysis of cue phrases and rely on a first-order formalization of rhetorical structure trees.
The algorithms are evaluated both intrinsically and extrinsically. The intrinsic evaluation assesses the resemblance between automatically and manually constructed rhetorical structure trees. The extrinsic evaluation shows that automatically derived rhetorical structures can be successfully exploited in the context of text summarization.
@Article{ marcu3,
author = {Daniel Marcu},
title = {The rhetorical parsing of unrestricted texts: A
surface-based approach},
journal = {Computational Linguistics},
volume = {26},
number = {3},
month = {September},
year = {2000},
pages = {395--448},
abstract = {Coherent texts are not just simple sequences of
clauses and sentences, but rather complex artifacts that
have highly elaborate rhetorical structure. This paper
explores the extent to which well-formed rhetorical
structures can be automatically derived by means of
surface-form-based algorithms. These algorithms identify
discourse usages of cue phrases and break sentences into
clauses, hypothesize rhetorical relations that hold among
textual units, and produce valid rhetorical structure trees
for unrestricted natural language texts. The algorithms are
empirically grounded in a corpus analysis of cue phrases
and rely on a first-order formalization of rhetorical
structure trees. The algorithms are evaluated both
intrinsically and extrinsically. The intrinsic evaluation
assesses the resemblance between automatically and manually
constructed rhetorical structure trees. The extrinsic
evaluation shows that automatically derived rhetorical
structures can be successfully exploited in the context of
text summarization.},
download = {http://ftp.cs.toronto.edu/pub/gh/Marcu-2000c.pdf}
}