BBN PLUM System as Used for MUC-4. Ayuso, D., Boisen, S., Fox, H., J., Gish, H., Ingria, B., & Weischedel, R. In Proceedings of the Fourth Message Understanding Conference MUC4, 1992.
Traditional approaches to the problem of extracting data from texts have emphasized hand-crafted linguistic knowledge. In contrast, BBN's PLUM system (Probabilistic Language Understanding Model) was developed as part of a DARPA-funded research effort on integrating probabilistic language models with more traditional linguistic techniques. Our research and development goals are
• more rapid development of new applications,
• the ability to train (and re-train) systems based on user markings of correct and incorrect output,
• more accurate selection among interpretations when more than one is found, and
• more robust partial interpretation when no complete interpretation can be found.
A central assumption of our approach is that in processing unrestricted text for data extraction, a non-trivial amount of the text will not be understood. As a result, all components of PLUM are designed to operate on partially understood input, taking advantage of information when available, and not failing when information is unavailable. We had previously performed experiments on components of the system with texts from the Wall Street Journal; however, the MUC-3 task was the first end-to-end application of PLUM. Very little hand-tuning of knowledge bases was done for MUC-4; since MUC-3, the system architecture as depicted in figure 1 has remained essentially the same. In addition to participating in MUC-4, since MUC-3 we focused on porting to new domains and a new language, and on performing various experiments designed to control recall/precision tradeoffs. To support these goals, the preprocessing component and the fragment combiner were made declarative; the semantics component was generalized to use probabilities on word senses; we expanded our treatment of reference; we enlarged the set of system parameters at all levels; and we created a new probabilistic classifier for text relevance which filters discourse events.
