Scientific Discovery as Link Prediction in Influence and Citation Graphs

Scientific Discovery as Link Prediction in Influence and Citation Graphs. Luo, F., Valenzuela-Escarcega, M. A., Hahn-Powell, G., & Surdeanu, M. In TextGraphs: 12th Workshop on Graph-Based Natural Language Processing, 2018. NAACL.

We introduce a machine learning approach for the identification of "white spaces" in scientific knowledge. Our approach addresses this task as link prediction over a graph that contains over 2M influence statements such as "CTCF activates FOXA1", which were automatically extracted using open-domain machine reading. We model this prediction task using graph-based features extracted from this influence graph, as well as from a citation graph that captures scientific communities. We evaluated the proposed approach through backtesting. Although the data is heavily unbalanced (50 times more negative examples than positives), our approach predicts which influence links will be discovered in the "near future" with an F1 score of 27 points and a mean average precision of 68%.
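The abstract does not list the specific graph-based features or the classifier, so the following is only a minimal sketch of what backtested link prediction can look like, not the authors' method. It assumes an undirected toy graph with hypothetical node names (`A` to `E`), swaps in a single Jaccard neighbor-overlap score in place of the paper's feature set, and reports average precision for one ranking (mean average precision would average this quantity over several rankings).

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs (an assumption:
    the influence statements in the paper are directed)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard(adj, u, v):
    """Neighbor-overlap score for a candidate link; a stand-in feature."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def candidate_pairs(adj):
    """All node pairs that are not yet linked in the past graph."""
    nodes = sorted(adj)
    return [(u, v) for u, v in combinations(nodes, 2) if v not in adj[u]]


def average_precision(ranked, positives):
    """Average precision of a ranked candidate list against known positives."""
    relevant = positives & set(ranked)
    hits, total = 0, 0.0
    for rank, pair in enumerate(ranked, start=1):
        if pair in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def backtest(past_edges, future_edges):
    """Rank unlinked pairs using only the past graph, then score the
    ranking against links that appear later."""
    adj = build_adjacency(past_edges)
    positives = {tuple(sorted(e)) for e in future_edges}
    ranked = sorted(candidate_pairs(adj),
                    key=lambda p: (-jaccard(adj, *p), p))
    return ranked, average_precision(ranked, positives)


if __name__ == "__main__":
    # Hypothetical toy data: links known before the cutoff, and links
    # "discovered" after it.
    past = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
    future = [("D", "A"), ("C", "E")]
    ranking, ap = backtest(past, future)
    print(ranking)
    print(ap)
```

Because negatives vastly outnumber positives in this setting, a ranking metric such as average precision is often more informative than accuracy, which is consistent with the abstract reporting both F1 and mean average precision.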

@inproceedings{whitespaces-identification2018,
  title={Scientific Discovery as Link Prediction in Influence and Citation Graphs},
  author={Fan Luo and
          Marco A. Valenzuela-Escarcega and
          Gus Hahn-Powell and
          Mihai Surdeanu},
  booktitle = {TextGraphs: 12th Workshop on Graph-Based Natural Language Processing},
  year={2018},
  abstract = {We introduce a machine learning approach for the identification of ``white spaces'' in scientific knowledge. Our approach addresses this task as link prediction over a graph that contains over 2M influence statements such as ``CTCF activates FOXA1'', which were automatically extracted using open-domain machine reading. We model this prediction task using graph-based features extracted from the above influence graph, as well as from a citation graph that captures scientific communities. We evaluated the proposed approach through backtesting. Although the data is heavily unbalanced (50 times more negative examples than positives), our approach predicts which influence links will be discovered in the ``near future'' with a F1 score of 27 points, and a mean average precision of 68\%. },
  organization={NAACL},
  url_Slides={http://clulab.org/papers/TextGraphs.pdf},
  url={http://clulab.org/papers/ScientificDiscoveryasLinkPredictioninInfluenceandCitationGraphs.pdf}
}
