Integrating nested data into knowledge graphs with RML fields
Thomas Delva, Dylan Van Assche, Pieter Heyvaert, Ben De Meester, Anastasia Dimou
To support business decisions or improve operational efficiency, heterogeneous data is often integrated into a knowledge graph. This integration can be achieved with one of the existing declarative mapping languages. However, current mapping languages cannot always integrate data with nested structure, such as JSON or XML files or JSON documents stored in a database column. We designed a backwards-compatible extension of the RDF Mapping Language (RML) that enables it to integrate nested data: RML fields. In this paper, we introduce RML fields, compare them with the state of the art in mapping languages, and validate them on mapping challenges formulated by the Knowledge Graph Construction W3C community group. Our extension addresses several challenges related to nested data that could not previously be solved. With RML fields, more datasets can be integrated into knowledge graphs with all the advantages of a language designed for that purpose. Our extension is currently intended to integrate multiple datasets independently; some use cases require joins or other operations during knowledge graph generation, which we will investigate in the future.
@InProceedings{Delva2021Integratingnesteddata,
author = {Delva, Thomas and Van Assche, Dylan and Heyvaert, Pieter and De Meester, Ben and Dimou, Anastasia},
title = {{Integrating nested data into knowledge graphs with RML fields}},
year = {2021},
pdf = {https://openreview.net/pdf?id=9-uyJRiORwh},
abstract = {To support business decisions or improve operational efficiency, heterogeneous data is often integrated into a knowledge graph. This integration can be achieved with one of the existing declarative mapping languages. However, current mapping languages cannot always integrate data with nested structure, such as JSON or XML files or JSON documents stored in a database column. We designed a backwards-compatible extension of the RDF Mapping Language (RML) that enables it to integrate nested data: RML fields. In this paper, we introduce RML fields, compare them with the state of the art in mapping languages, and validate them on mapping challenges formulated by the Knowledge Graph Construction W3C community group. Our extension addresses several challenges related to nested data that could not previously be solved. With RML fields, more datasets can be integrated into knowledge graphs with all the advantages of a language designed for that purpose. Our extension is currently intended to integrate multiple datasets independently; some use cases require joins or other operations during knowledge graph generation, which we will investigate in the future.},
}