PyrEval: An Automated Method for Summary Content Analysis.
Gao, Y.; Warner, A.; and Passonneau, R. J.
In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May 2018. European Language Resource Association.
@inproceedings{gao_pyreval_2018,
  address = {Miyazaki, Japan},
  title = {{PyrEval}: {An} {Automated} {Method} for {Summary} {Content} {Analysis}},
  url = {http://www.lrec-conf.org/proceedings/lrec2018/pdf/1096.pdf},
  booktitle = {Proceedings of the {Eleventh} {International} {Conference} on {Language} {Resources} and {Evaluation} ({LREC} 2018)},
  publisher = {European Language Resource Association},
  author = {Gao, Yanjun and Warner, Andrew and Passonneau, Rebecca J.},
  month = may,
  year = {2018},
}
Abstract: The pyramid method is a content analysis approach in automatic summarization evaluation for manual construction of a content model from reference summaries, and manual scoring of unseen summaries with the pyramid model. PyrEval automates the manual pyramid method. PyrEval uses low-dimension distributional semantics to represent phrase meanings, and a new algorithm, EDUA (Emergent Discovery of Units of Attraction), to solve a set cover problem to construct the content model from vectorized phrases. Because the vectors are pretrained, and EDUA is an efficient greedy algorithm, PyrEval can apply pyramid content evaluation with no retraining, and in excellent time. Moreover, PyrEval has been tested on many datasets derived from humans and machine generated summaries, and shown good performance on both.
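The EDUA algorithm itself is not reproduced here, but the following minimal Python sketch illustrates the general idea the abstract describes: greedily grouping pretrained phrase vectors from several reference summaries into content units and weighting each unit by how many distinct summaries contribute to it. The similarity threshold, data layout, and function names are assumptions for illustration, not PyrEval's implementation.

    # Minimal sketch of building a weighted content model from vectorized phrases.
    # NOT the EDUA algorithm from the paper; threshold and data are invented.
    import numpy as np

    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

    def build_content_model(phrases, threshold=0.7):
        """phrases: list of (summary_id, vector); returns (member_indices, weight) per unit."""
        units = []  # each unit: centroid vector, member indices, contributing summary ids
        for i, (sid, vec) in enumerate(phrases):
            best, best_sim = None, threshold
            for unit in units:
                sim = cosine(vec, unit["centroid"])
                if sim > best_sim:
                    best, best_sim = unit, sim
            if best is None:
                units.append({"centroid": vec.copy(), "members": [i], "summaries": {sid}})
            else:
                best["members"].append(i)
                best["summaries"].add(sid)
                # recompute the centroid as the mean of member vectors
                best["centroid"] = np.stack([phrases[j][1] for j in best["members"]]).mean(axis=0)
        # rank units by weight (number of contributing summaries), pyramid-style
        units.sort(key=lambda u: len(u["summaries"]), reverse=True)
        return [(u["members"], len(u["summaries"])) for u in units]

    phrases = [(0, np.array([1.0, 0.0])), (1, np.array([0.9, 0.1])), (2, np.array([0.0, 1.0]))]
    print(build_content_model(phrases))  # -> [([0, 1], 2), ([2], 1)]

A candidate summary would then be scored by the weights of the units it matches, which is the kind of content evaluation the pyramid method performs.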
Automated Content Analysis: A Case Study of Computer Science Student Summaries.
Gao, Y.; Davies, P. M.; and Passonneau, R. J.
In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 264–272, New Orleans, Louisiana, 2018. Association for Computational Linguistics.
@inproceedings{gao_automated_2018,
  address = {New Orleans, Louisiana},
  title = {Automated {Content} {Analysis}: {A} {Case} {Study} of {Computer} {Science} {Student} {Summaries}},
  shorttitle = {Automated {Content} {Analysis}},
  url = {http://aclweb.org/anthology/W18-0531},
  doi = {10.18653/v1/W18-0531},
  language = {en},
  urldate = {2022-04-19},
  booktitle = {Proceedings of the {Thirteenth} {Workshop} on {Innovative} {Use} of {NLP} for {Building} {Educational} {Applications}},
  publisher = {Association for Computational Linguistics},
  author = {Gao, Yanjun and Davies, Patricia M. and Passonneau, Rebecca J.},
  year = {2018},
  pages = {264--272},
}
Testing a Knowledge Inquiry System on Question Answering Tasks.
Zafeiroudi, K. D.; Eckman, L.; and Passonneau, R. J.
In Joint Proceedings of ISWC 2018 Workshops SemDeep-4 and NLIWoD-4, Monterey, CA, October 2018.
Best Paper.
@inproceedings{zafeiroudi_testing_2018,
  address = {Monterey, CA},
  title = {Testing a {Knowledge} {Inquiry} {System} on Question {Answering} {Tasks}},
  booktitle = {Joint {Proceedings} of {ISWC} 2018 {Workshops} {SemDeep}-4 and {NLIWoD}-4},
  author = {Zafeiroudi, Kyriaki D. and Eckman, Leah and Passonneau, Rebecca J.},
  month = oct,
  year = {2018},
  note = {Best Paper.},
}
Abstract: Question-Answering systems enable users to retrieve answers to factual questions from various kinds of knowledge sources, but do not address how to respond cooperatively. We present InK, an initial inquiry system for RDF knowledge graphs that aims to return relevant responses, even when an answer cannot be found. It assembles knowledge relevant to the entities mentioned in the question without translating the input question into a query language. A user study indicates responses are found to be intelligible and relevant. Evaluation of questions with known answers gives high recall of 0.70 averaged on three QA datasets.
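As a rough illustration of assembling entity-relevant knowledge from an RDF graph without translating the question into a query language, the sketch below collects every triple in which a mentioned entity appears as subject or object, using rdflib. The graph file, entity URI, and function name are hypothetical; this is not the InK system itself.

    # Sketch: gather RDF triples relevant to entities mentioned in a question,
    # without constructing a SPARQL query. File path and URIs are invented examples.
    from rdflib import Graph, URIRef

    def relevant_triples(graph, entity_uris):
        """Return every triple whose subject or object is one of the given entities."""
        found = []
        for uri in entity_uris:
            e = URIRef(uri)
            found.extend(graph.triples((e, None, None)))   # entity as subject
            found.extend(graph.triples((None, None, e)))   # entity as object
        return found

    g = Graph()
    g.parse("knowledge.ttl", format="turtle")              # hypothetical local KG dump
    for s, p, o in relevant_triples(g, ["http://dbpedia.org/resource/Alan_Turing"]):
        print(s, p, o)

A response generator could then verbalize these triples, which is one way to stay relevant even when no exact answer is found.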
Wise Crowd Content Assessment and Educational Rubrics.
Passonneau, R. J.; Poddar, A.; Gite, G.; Krivokapic, A.; Yang, Q.; and Perin, D.
International Journal of Artificial Intelligence in Education, 28(1): 29–55. March 2018.
@article{passonneau_wise_2018,
  series = {Special {Issue} on {Multidisciplinary} {Approaches} to {AI} and {Education} for {Reading} and {Writing}},
  title = {Wise {Crowd} {Content} {Assessment} and {Educational} {Rubrics}},
  volume = {28},
  issn = {1560-4292},
  shorttitle = {Wise {Crowd} {Content} {Assessment}},
  url = {https://link.springer.com/article/10.1007/s40593-016-0128-6#citeas},
  doi = {10.1007/s40593-016-0128-6},
  number = {1},
  journal = {International Journal of Artificial Intelligence in Education},
  author = {Passonneau, Rebecca J. and Poddar, Ananya and Gite, Gaurav and Krivokapic, Alisa and Yang, Qian and Perin, Dolores},
  month = mar,
  year = {2018},
  pages = {29--55},
}
Abstract: Development of reliable rubrics for educational intervention studies that address reading and writing skills is labor-intensive, and could benefit from an automated approach. We compare a main ideas rubric used in a successful writing intervention study to a highly reliable wise-crowd content assessment method developed to evaluate machine-generated summaries. The ideas in the educational rubric were extracted from a source text that students were asked to summarize. The wise-crowd content assessment model is derived from summaries written by an independent group of proficient students who read the same source text, and followed the same instructions to write their summaries. The resulting content model includes a ranking over the derived content units. All main ideas in the rubric appear prominently in the wise-crowd content model. We present two methods that automate the content assessment. Scores based on the wise-crowd content assessment, both manual and automated, have high correlations with the main ideas rubric. The automated content assessment methods have several advantages over related methods, including high correlations with corresponding manual scores, a need for only half a dozen models instead of hundreds, and interpretable scores that independently assess content quality and coverage.
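To make the scoring idea concrete, here is a small hypothetical sketch of pyramid-style scoring against a wise-crowd content model: each content unit carries a weight (how many model summaries express it), a student summary earns the weights of the units it matches, and the raw score can be normalized by the total available weight (a coverage-style score) or by the best score achievable with the same number of units (a quality-style score). The unit names, weights, and normalizations are illustrative assumptions, not the papers' exact procedures.

    # Hypothetical sketch of pyramid-style scoring against a wise-crowd content model.
    # unit_weights: weight of each content unit = number of model summaries expressing it.
    # matched: ids of the units found in the student summary (by a human or an automated matcher).

    def score_summary(unit_weights, matched):
        raw = sum(unit_weights[u] for u in matched)
        # coverage-style score: fraction of all available content weight expressed
        coverage = raw / sum(unit_weights.values())
        # quality-style score: raw score relative to the best achievable with the same
        # number of units (the k heaviest units)
        k = len(matched)
        best_k = sum(sorted(unit_weights.values(), reverse=True)[:k]) or 1
        quality = raw / best_k
        return coverage, quality

    unit_weights = {"main_claim": 5, "key_evidence": 4, "counterargument": 3, "minor_detail": 1}
    print(score_summary(unit_weights, matched={"main_claim", "minor_detail"}))

Keeping the two normalizations separate is what lets content coverage and content quality be reported as independent, interpretable scores.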
Prediction of a hotspot pattern in keyword search results.
Gao, J.; Radeva, A.; Shen, C.; Wang, S.; Wang, Q.; and Passonneau, R. J.
Computer Speech & Language, 48: 80–102. March 2018.
@article{gao_prediction_2018,
  title = {Prediction of a hotspot pattern in keyword search results},
  volume = {48},
  doi = {10.1016/j.csl.2017.10.005},
  journal = {Computer Speech \& Language},
  author = {Gao, Jie and Radeva, Axinia and Shen, Chuyao and Wang, Shiqi and Wang, Qianbo and Passonneau, Rebecca J.},
  month = mar,
  year = {2018},
  pages = {80--102},
}
Abstract: This paper identifies and models a phenomenon observed across low-resource languages in keyword search results from speech retrieval systems where the speech recognition has high error rate, due to very limited training data. High confidence correct detections (hccds) of keywords are rare, yet often succeed one another closely in time. We refer to these close sequences of hccds as keyword hotspots. The ability to predict keyword hotspots could support speech retrieval, and provide new insights into the behavior of speech recognition systems. We treat hotspot prediction as a binary classification task on all word-sized time intervals in an audio file of a telephone conversation, using prosodic features as predictors. Rare events that follow this pattern are often modeled as a self-exciting point process (sepp), meaning the occurrence of a rare event excites a following one. To label successive points in time as occurring within a hotspot or not, we fit a sepp function to the distribution of hccds in the keyword search output. Two major learning challenges are that the size of the positive class is very small, and the training and test data have dissimilar distributions. To address these challenges, we develop a novel data selection framework that chooses training data with good generalization properties. Results exhibit superior generalization performance.
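To illustrate the self-exciting structure the abstract refers to, the sketch below evaluates a Hawkes-style (self-exciting) intensity over hypothetical keyword-detection times and labels time points as hotspot candidates when the intensity exceeds a threshold. The parameter values, threshold, and labeling rule are invented for illustration and are not the model fitted in the paper.

    # Sketch of a self-exciting (Hawkes-style) intensity over keyword detection times,
    # used to label time points as lying inside a "hotspot" or not.
    # All parameters (mu, alpha, beta, threshold) are invented for illustration.
    import math

    def hawkes_intensity(t, event_times, mu=0.02, alpha=0.5, beta=1.0):
        """lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i))."""
        return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in event_times if ti < t)

    def label_hotspots(grid_times, event_times, threshold=0.2):
        """Binary labels over a grid of time points: 1 if the intensity exceeds the threshold."""
        return [1 if hawkes_intensity(t, event_times) > threshold else 0 for t in grid_times]

    detections = [3.0, 3.4, 3.9, 12.5]        # hypothetical hccd times, in seconds
    grid = [i * 0.5 for i in range(30)]       # time points to label
    print(label_hotspots(grid, detections))

The labels produced this way would then serve as the target for the binary classifier over word-sized intervals that the abstract describes, with prosodic features as predictors.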