Information extraction from broadcast news. Gotoh, Y. & Renals, S. Phil Trans R Soc Lond A, 358(1769):12, The Royal Society, 2000.
This paper discusses the development of trainable statistical models for extracting content from television and radio news broadcasts. In particular, we concentrate on statistical finite-state models for identifying proper names and other named entities in broadcast speech. Two models are presented: the first represents name class information as a word attribute; the second represents both word-word and class-class transitions explicitly. A common n-gram-based formulation is used for both models. The task of named-entity identification is characterized by relatively sparse training data, and issues related to smoothing are discussed. Experiments are reported using the DARPA/NIST Hub-4E evaluation for North American broadcast news.
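The first model described above, in which the name class is treated as an attribute of each word, can be illustrated with a small sketch. This is not the authors' implementation: the toy training sentences, the class labels, the interpolation weight `lam`, the add-k constant `k`, and the particular back-off (class transition times word-given-class) are all assumptions chosen only to show how a bigram over (word, class) pairs might be smoothed when training data are sparse.

```python
import math
from collections import Counter

# Hypothetical toy data: each token is a (word, class) pair; "O" marks non-entities.
TRAIN = [
    [("mr", "O"), ("clinton", "PERSON"), ("visited", "O"), ("boston", "LOCATION")],
    [("reporters", "O"), ("in", "O"), ("boston", "LOCATION"), ("met", "O"), ("clinton", "PERSON")],
]
CLASSES = ["O", "PERSON", "LOCATION"]
START = ("<s>", "<s>")

pair_bigrams, pair_context = Counter(), Counter()
class_bigrams, class_context = Counter(), Counter()
emissions, class_totals = Counter(), Counter()
vocab = set()

for sentence in TRAIN:
    prev = START
    for word, cls in sentence:
        pair_bigrams[(prev, (word, cls))] += 1   # bigram over (word, class) pairs
        pair_context[prev] += 1
        class_bigrams[(prev[1], cls)] += 1       # class-class counts for back-off
        class_context[prev[1]] += 1
        emissions[(cls, word)] += 1              # word-given-class counts for back-off
        class_totals[cls] += 1
        vocab.add(word)
        prev = (word, cls)


def prob(prev, cur, lam=0.7, k=0.1):
    """Smoothed estimate of P(cur | prev), where both are (word, class) pairs."""
    # Back-off estimate: add-k class transition times add-k word-given-class.
    # The "+ 1" reserves one vocabulary slot for unseen words.
    trans = (class_bigrams[(prev[1], cur[1])] + k) / (class_context[prev[1]] + k * len(CLASSES))
    emit = (emissions[(cur[1], cur[0])] + k) / (class_totals[cur[1]] + k * (len(vocab) + 1))
    backoff = trans * emit
    if pair_context[prev] == 0:
        return backoff  # context never seen: rely on the back-off alone
    ml = pair_bigrams[(prev, cur)] / pair_context[prev]
    return lam * ml + (1 - lam) * backoff


def tag(words):
    """Viterbi search for the most likely class sequence for a word list."""
    # best[c] = (log score, class path) over paths ending in class c
    best = {c: (math.log(prob(START, (words[0], c))), [c]) for c in CLASSES}
    for prev_word, word in zip(words, words[1:]):
        step = {}
        for c in CLASSES:
            step[c] = max(
                (score + math.log(prob((prev_word, p), (word, c))), path + [c])
                for p, (score, path) in best.items()
            )
        best = step
    return max(best.values())[1]
```

On this toy data, `tag(["clinton", "visited", "boston"])` yields `["PERSON", "O", "LOCATION"]`. The second model in the abstract, with explicit word-word and class-class transitions, would factor the probability differently and is not shown here.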
