Automatic Detection of Authorship Changes within Single Documents. Graham, N. Master's thesis, Department of Computer Science, University of Toronto, January, 2000. Published as technical report CSRG-406.
One of the most difficult tasks facing anyone who must compile or maintain any large, collaboratively-written document is to foster a consistent style throughout. In this thesis, we explore whether it is possible to identify stylistic inconsistencies within documents even in principle, given our understanding of how style can be captured statistically.
We carry out this investigation by computing stylistic statistics on very small samples of text comprising a set of synthetic collaboratively-written documents, and using these statistics to train and test a series of neural networks. We are able to show that this method does allow us to recover the boundaries of authors' contributions. We find that time-delay neural networks, hitherto ignored in this field, are especially effective in this regard. Along the way, we observe that statistics characterizing the syntactic style of a passage appear to hold much more information for small text samples than those concerned with lexical choice or complexity.
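As a rough illustration of the pipeline the abstract describes (per-sample stylistic statistics fed to a network that sees a window of neighbouring samples), here is a minimal sketch. It is not the thesis's implementation: the function names `style_features` and `time_delay_windows`, the two statistics, and the `delay` parameter are all assumptions made for this example. The statistics used here are simple lexical ones chosen for brevity, whereas the abstract reports that syntactic statistics carried more information on small samples.

```python
import re


def style_features(sample):
    """Toy stylistic statistics for one small text sample.

    Hypothetical feature choice: mean word length and type-token ratio.
    """
    words = re.findall(r"[A-Za-z']+", sample.lower())
    if not words:
        return [0.0, 0.0]
    mean_word_len = sum(len(w) for w in words) / len(words)
    type_token_ratio = len(set(words)) / len(words)
    return [mean_word_len, type_token_ratio]


def time_delay_windows(features, delay=1):
    """Build one input vector per interior sample by concatenating its
    features with those of `delay` neighbours on each side.

    This mimics the kind of context window a time-delay network receives;
    the network itself is left out of the sketch.
    """
    windows = []
    for i in range(delay, len(features) - delay):
        window = []
        for j in range(i - delay, i + delay + 1):
            window.extend(features[j])
        windows.append(window)
    return windows


# Three tiny consecutive samples standing in for passages of a document.
samples = ["a bb", "ccc ccc", "dddd e"]
features = [style_features(s) for s in samples]
windows = time_delay_windows(features, delay=1)
```

Each row of `windows` could then be passed to a classifier trained to say whether an authorship boundary falls at the centre sample; that training step is omitted because the abstract gives no detail about it.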
@MastersThesis{ graham3,
author = {Neil Graham},
title = {Automatic Detection of Authorship Changes within Single
Documents},
school = {Department of Computer Science, University of Toronto},
month = {January},
year = {2000},
note = {Published as technical report CSRG-406},
abstract = {One of the most difficult tasks facing anyone who must
compile or maintain any large, collaboratively-written
document is to foster a consistent style throughout. In
this thesis, we explore whether it is possible to identify
stylistic inconsistencies within documents even in
principle, given our understanding of how style can be
captured statistically.

We carry out this
investigation by computing stylistic statistics on very
small samples of text comprising a set of synthetic
collaboratively-written documents, and using these
statistics to train and test a series of neural networks.
We are able to show that this method does allow us to
recover the boundaries of authors' contributions. We find
that time-delay neural networks, hitherto ignored in this
field, are especially effective in this regard. Along the
way, we observe that statistics characterizing the
syntactic style of a passage appear to hold much more
information for small text samples than those concerned
with lexical choice or complexity.},
download = {http://ftp.cs.toronto.edu/pub/gh/Graham-thesis.pdf}
}