
Integrating nested data into knowledge graphs with RML fields

Thomas Delva, Dylan Van Assche

To support business decisions or improve operational efficiency, heterogeneous data is often integrated into a knowledge graph. This integration can be specified declaratively with one of the existing mapping languages. However, current mapping languages cannot always integrate data with nested structure, such as JSON or XML files, or JSON documents stored in a database column. We designed RML fields, a backwards-compatible extension of the RDF Mapping Language (RML) that enables it to integrate nested data. In this paper, we introduce RML fields, compare them with the state of the art in mapping languages, and validate them on mapping challenges formulated by the W3C Knowledge Graph Construction Community Group. Our extension makes it possible to address several of the challenges related to nested data that could not be addressed before. With RML fields, even more datasets can be integrated into knowledge graphs, with all the advantages of using a language designed specifically for that purpose. The extension is currently intended to integrate multiple datasets independently; some use cases require joins or other operations during knowledge graph generation, which we will investigate in future work.