Development of an information retrieval tool for biomedical patents. Alves, T., Rodrigues, R., Costa, H., & Rocha, M. Computer Methods and Programs in Biomedicine, 159:125–134, June 2018.
Methods: The pipeline was developed within @Note2, an open-source computational framework for BioTM, adding a number of modules to the core libraries, including patent metadata and full text retrieval, PDF to text conversion and optical character recognition. Also, user interfaces were developed for the main operations materialized in a new @Note2 plug-in. Results: The integration of these tools in @Note2 opens opportunities to run BioTM tools over patent texts, including tasks from Information Extraction, such as Named Entity Recognition or Relation Extraction. We demonstrated the pipeline’s main functions with a case study, using an available benchmark dataset from BioCreative challenges. Also, we show the use of the plug-in with a user query related to the production of vanillin. Conclusions: This work makes available all the relevant content from patents to the scientific community, decreasing drastically the time required for this task, and provides graphical interfaces to ease the use of these tools.
@article{alves_development_2018,
	title = {Development of an information retrieval tool for biomedical patents},
	volume = {159},
	issn = {0169-2607},
	url = {https://linkinghub.elsevier.com/retrieve/pii/S0169260717310568},
	doi = {10.1016/j.cmpb.2018.03.012},
	abstract = {Methods: The pipeline was developed within @Note2, an open-source computational framework for BioTM, adding a number of modules to the core libraries, including patent metadata and full text retrieval, PDF to text conversion and optical character recognition. Also, user interfaces were developed for the main operations materialized in a new @Note2 plug-in.
Results: The integration of these tools in @Note2 opens opportunities to run BioTM tools over patent texts, including tasks from Information Extraction, such as Named Entity Recognition or Relation Extraction. We demonstrated the pipeline’s main functions with a case study, using an available benchmark dataset from BioCreative challenges. Also, we show the use of the plug-in with a user query related to the production of vanillin.
Conclusions: This work makes available all the relevant content from patents to the scientific community, decreasing drastically the time required for this task, and provides graphical interfaces to ease the use of these tools.},
	language = {en},
	urldate = {2018-11-26},
	journal = {Computer Methods and Programs in Biomedicine},
	author = {Alves, Tiago and Rodrigues, Rúben and Costa, Hugo and Rocha, Miguel},
	month = jun,
	year = {2018},
	pages = {125--134},
}
