The Knowledge Graph as the Default Data Model for Machine Learning. Wilcke, X., Bloem, P., & De Boer, V. Data Science, IOS Press, October 2017.
In modern machine learning, raw data is the preferred input for our models. Where a decade ago data scientists were still engineering features, manually picking out the details they thought salient, they now prefer the data as raw as possible. As long as we can assume that all relevant and irrelevant information is present in the input data, we can design deep models that build up intermediate representations to sift out relevant features. In some areas, however, we struggle to find this raw form of data. One such area involves heterogeneous knowledge: entities, their attributes and internal relations. The Semantic Web community has invested decades of work on just this problem: how to represent knowledge, in various domains, in as raw and as usable a form as possible, satisfying many use cases. This work has led to the Linked Open Data Cloud, a vast and distributed knowledge graph. If we can develop methods that operate on this raw form of data (the knowledge graph), we can dispense with a great deal of ad-hoc feature engineering and train deep models end-to-end in many more domains. In this position paper, we describe current research in this area and discuss some of the promises and challenges of this approach.
@article{d589859477984c67919e8ae0d5586e4c,
  title     = "The Knowledge Graph as the Default Data Model for Machine Learning",
  abstract  = "In modern machine learning, raw data is the preferred input for our models. Where a decade ago data scientists were still engineering features, manually picking out the details they thought salient, they now prefer the data as raw as possible. As long as we can assume that all relevant and irrelevant information is present in the input data, we can design deep models that build up intermediate representations to sift out relevant features. In some areas, however, we struggle to find this raw form of data. One such area involves heterogeneous knowledge: entities, their attributes and internal relations. The Semantic Web community has invested decades of work on just this problem: how to represent knowledge, in various domains, in as raw and as usable a form as possible, satisfying many use cases. This work has led to the Linked Open Data Cloud, a vast and distributed knowledge graph. If we can develop methods that operate on this raw form of data (the knowledge graph), we can dispense with a great deal of ad-hoc feature engineering and train deep models end-to-end in many more domains. In this position paper, we describe current research in this area and discuss some of the promises and challenges of this approach.",
  keywords  = "End-to-End Learning, Knowledge Graphs, Machine Learning, Position paper, Semantic Web",
  author    = "Xander Wilcke and Peter Bloem and {De Boer}, Victor",
  year      = "2017",
  month     = "10",
  doi       = "10.3233/DS-170007",
  pages     = "1--19",
  journal   = "Data Science",
  issn      = "2451-8484",
  publisher = "IOS Press",
}
