A disfluency study for cleaning spontaneous speech automatic transcripts and improving speech language models

A disfluency study for cleaning spontaneous speech automatic transcripts and improving speech language models. Adda-Decker, M., Habert, B., Barras, C., Adda, G., Boula de Mareüil, P., & Paroubek, P. In DiSS 2003. Proceedings of the ISCA Tutorial and Research Workshop Disfluency in Spontaneous Speech, pages 67–70, 2003.


The aim of this study is to elaborate a disfluent speech model by comparing different types of audio transcripts. The study makes use of 10 hours of French radio interview archives, involving journalists and personalities from political or civil society. A first type of transcripts is press-oriented where most disfluencies are discarded. For 10% of the corpus, we produced exact audio transcripts: all audible phenomena and overlapping speech segments are transcribed manually. In these transcripts about 14% of the words correspond to disfluencies and discourse markers. The audio corpus has then been transcribed using the LIMSI speech recognizer. With 8% of the corpus the disfluency words explain 12% of the overall error rate. This shows that disfluencies have no major effect on neighboring speech segments. Restarts are the most error-prone, with a 36.9% within-class error rate.

@inproceedings{adda-decker_disfluency_2003,
	Author = {Adda-Decker, Martine and Habert, Benoît and Barras, Claude and Adda, Gilles and Boula de Mareüil, Philippe and Paroubek, Patrick},
	Booktitle = {DiSS 2003. Proceedings of the ISCA Tutorial and Research Workshop Disfluency in Spontaneous Speech},
	Date = {2003},
	Date-Modified = {2018-05-13 21:46:06 +0000},
	Editor = {Eklund, Robert},
	Eventdate = {2003-09-05/2003-09-08},
	Keywords = {conversation, disfluencies, French, mass media, phonetics, radio, restarts, speaking styles, speech recognition, speech technology, spontaneous speech, transcription},
	Location = {Göteborg, Sweden},
	Pages = {67--70},
	Title = {A disfluency study for cleaning spontaneous speech automatic transcripts and improving speech language models},
	Url = {http://www.isca-speech.org/archive_open/diss_03/dis3_067.html},
	Year = {2003},
	Abstract = {The aim of this study is to elaborate a disfluent speech model by comparing different types of audio transcripts. The study makes use of 10 hours of French radio interview archives, involving journalists and personalities from political or civil society. A first type of transcripts is press-oriented where most disfluencies are discarded. For 10\% of the corpus, we produced exact audio transcripts: all audible phenomena and overlapping speech segments are transcribed manually. In these transcripts about 14\% of the words correspond to disfluencies and discourse markers. The audio corpus has then been transcribed using the LIMSI speech recognizer. With 8\% of the corpus the disfluency words explain 12\% of the overall error rate. This shows that disfluencies have no major effect on neighboring speech segments. Restarts are the most error-prone, with a 36.9\% within-class error rate.},
	Bdsk-Url-1 = {http://www.isca-speech.org/archive_open/diss_03/dis3_067.html}}

