BENTO: A Visual Platform for Building Clinical NLP Pipelines Based on CodaLab. Jin, Y., Li, F., & Yu, H. In 2020 Annual Conference of the Association for Computational Linguistics (ACL), pages 95–100, July, 2020. NIHMSID: NIHMS1644629
CodaLab is an open-source web-based platform for collaborative computational research. Although CodaLab has gained popularity in the research community, its interface has limited support for creating reusable tools that can be easily applied to new datasets and composed into pipelines. In clinical domain, natural language processing (NLP) on medical notes generally involves multiple steps, like tokenization, named entity recognition, etc. Since these steps require different tools which are usually scattered in different publications, it is not easy for researchers to use them to process their own datasets. In this paper, we present BENTO, a workflow management platform with a graphic user interface (GUI) that is built on top of CodaLab, to facilitate the process of building clinical NLP pipelines. BENTO comes with a number of clinical NLP tools that have been pre-trained using medical notes and expert annotations and can be readily used for various clinical NLP tasks. It also allows researchers and developers to create their custom tools (e.g., pre-trained NLP models) and use them in a controlled and reproducible way. In addition, the GUI interface enables researchers with limited computer background to compose tools into NLP pipelines and then apply the pipelines on their own datasets in a "what you see is what you get" (WYSIWYG) way. Although BENTO is designed for clinical NLP applications, the underlying architecture is flexible to be tailored to any other domains.
@inproceedings{jin_bento_2020,
	title = {{BENTO}: {A} {Visual} {Platform} for {Building} {Clinical} {NLP} {Pipelines} {Based} on {CodaLab}},
	doi = {10.18653/v1/2020.acl-demos.13},
	abstract = {CodaLab is an open-source web-based platform for collaborative computational research. Although CodaLab has gained popularity in the research community, its interface has limited support for creating reusable tools that can be easily applied to new datasets and composed into pipelines. In clinical domain, natural language processing (NLP) on medical notes generally involves multiple steps, like tokenization, named entity recognition, etc. Since these steps require different tools which are usually scattered in different publications, it is not easy for researchers to use them to process their own datasets. In this paper, we present BENTO, a workflow management platform with a graphic user interface (GUI) that is built on top of CodaLab, to facilitate the process of building clinical NLP pipelines. BENTO comes with a number of clinical NLP tools that have been pre-trained using medical notes and expert annotations and can be readily used for various clinical NLP tasks. It also allows researchers and developers to create their custom tools (e.g., pre-trained NLP models) and use them in a controlled and reproducible way. In addition, the GUI interface enables researchers with limited computer background to compose tools into NLP pipelines and then apply the pipelines on their own datasets in a "what you see is what you get" (WYSIWYG) way. Although BENTO is designed for clinical NLP applications, the underlying architecture is flexible to be tailored to any other domains.},
	booktitle = {2020 {Annual} {Conference} of the {Association} for {Computational} {Linguistics} ({ACL})},
	author = {Jin, Yonghao and Li, Fei and Yu, Hong},
	month = jul,
	year = {2020},
	pmcid = {PMC7679080},
	pmid = {33223604},
	note = {NIHMSID: NIHMS1644629},
	pages = {95--100},
}
