Annotation and analysis of disfluencies in a spontaneous speech corpus in Spanish. Rodríguez Fuentes, L. J., Torres, M. I., & Varona, A. In DiSS 2001. Proceedings of the ISCA Tutorial and Research Workshop Disfluency in Spontaneous Speech, pages 1–4, 2001.
A new database consisting of 227 dialogues in Spanish was annotated with disfluencies. Then a detailed analysis of the annotations was carried out. The database had been recorded according to the well-known Wizard of Oz paradigm. Seventy-five speakers were each given three different scenarios in which to make queries about timetables, prices and other conditions of train travel between two Spanish cities. The notion of disfluency was relaxed to include any acoustic, lexical or syntactic feature that distinguishes spontaneous from read speech. A specific XML annotation scheme was developed. A simple text editor was used to insert marks, and a specific parser was implemented to find errors in annotations. The analysis of annotations revealed that disfluencies were not uniformly distributed among either user turns or speakers. Most disfluencies were grouped into certain user turns, especially the first one. On the other hand, some speakers were remarkably more prone to hesitate, repeat or correct fragments of speech than others.
@inproceedings{rodriguez_fuentes_annotation_2001,
	Author = {Rodríguez Fuentes, Luis Javier and Torres, María Inés and Varona, Amparo},
	Booktitle = {DiSS 2001. Proceedings of the ISCA Tutorial and Research Workshop Disfluency in Spontaneous Speech},
	Date = {2001},
	Date-Modified = {2018-07-20 11:18:06 +0000},
	Eventdate = {2001-08-29/2001-08-31},
	Keywords = {labelling and annotation, language resources, disfluencies, phonetics, Spanish, speaking styles, speech corpus, spontaneous speech, transcription},
	Location = {Edinburgh, Scotland, UK},
	Pages = {1--4},
	Title = {Annotation and analysis of disfluencies in a spontaneous speech corpus in Spanish},
	Url = {http://www.isca-speech.org/archive_open/diss_01/dis1_001.html},
	Year = {2001},
	Abstract = {A new database consisting of 227 dialogues in Spanish was annotated with disfluencies. Then a detailed analysis of the annotations was carried out. The database had been recorded according to the well-known Wizard of Oz paradigm. Seventy-five speakers were each given three different scenarios in which to make queries about timetables, prices and other conditions of train travel between two Spanish cities. The notion of disfluency was relaxed to include any acoustic, lexical or syntactic feature that distinguishes spontaneous from read speech. A specific XML annotation scheme was developed. A simple text editor was used to insert marks, and a specific parser was implemented to find errors in annotations. The analysis of annotations revealed that disfluencies were not uniformly distributed among either user turns or speakers. Most disfluencies were grouped into certain user turns, especially the first one. On the other hand, some speakers were remarkably more prone to hesitate, repeat or correct fragments of speech than others.},
	Bdsk-Url-1 = {http://www.isca-speech.org/archive_open/diss_01/dis1_001.html}}
