In *Proceedings of the First Workshop on Scholarly Document Processing (SDP@EMNLP)*, pages 138–147, Online, 2020. ACL. Core Rank A

A large amount of scientific knowledge is represented within mixed forms of natural language texts and mathematical formulae. Therefore, a collaboration of natural language processing and formula analyses, so-called mathematical language processing, is necessary to enable computers to understand and retrieve information from the documents. However, as we will show in this project, a mathematical notation can change its meaning even within the scope of a single paragraph. This flexibility makes it difficult to extract the exact meaning of a mathematical formula. In this project, we will propose a new task direction for grounding mathematical formulae. Particularly, we are addressing the widespread misconception of various research projects in mathematical information retrieval, which presume that mathematical notations have a fixed meaning within a single document. We manually annotated a long scientific paper to illustrate the task concept. Our high inter-annotator agreement shows that the task is well understood for humans. Our results indicate that it is worthwhile to grow the techniques for the proposed task to contribute to the further progress of mathematical language processing.

@inproceedings{BibbaseAsakuraGAM20,
  address = {Online},
  title = {Towards {Grounding} of {Formulae}},
  url = {https://www.aclweb.org/anthology/2020.sdp-1.16},
  doi = {10/gjzg2r},
  abstract = {A large amount of scientific knowledge is represented within mixed forms of natural language texts and mathematical formulae. Therefore, a collaboration of natural language processing and formula analyses, so-called mathematical language processing, is necessary to enable computers to understand and retrieve information from the documents. However, as we will show in this project, a mathematical notation can change its meaning even within the scope of a single paragraph. This flexibility makes it difficult to extract the exact meaning of a mathematical formula. In this project, we will propose a new task direction for grounding mathematical formulae. Particularly, we are addressing the widespread misconception of various research projects in mathematical information retrieval, which presume that mathematical notations have a fixed meaning within a single document. We manually annotated a long scientific paper to illustrate the task concept. Our high inter-annotator agreement shows that the task is well understood for humans. Our results indicate that it is worthwhile to grow the techniques for the proposed task to contribute to the further progress of mathematical language processing.},
  language = {en},
  urldate = {2021-08-02},
  booktitle = {Proceedings of the {First} {Workshop} on {Scholarly} {Document} {Processing} ({SDP}@{EMNLP})},
  publisher = {ACL},
  author = {Asakura, Takuto and Greiner-Petter, André and Aizawa, Akiko and Miyao, Yusuke},
  year = {2020},
  note = {Core Rank A},
  pages = {138--147},
}