19/10/2020

Sample driven data mapping for linked data and web APIs

Tobias Zeimetz, Ralf Schenkel

Keywords: relation alignment, schema mapping, data integration

Abstract: Data integration is essential for building an RDF Knowledge Base that is as comprehensive as possible. Many different data sources are used to extend a given dataset or to correct errors in the data. Nowadays, Web APIs (instead of data dumps) are common external data sources, since many data providers make their data publicly available. However, the classic problem of data integration, i.e., determining which parts of the datasets can be mapped to each other, remains. In addition, Web APIs are often more restrictive than data dumps and slower to access due to latencies and other constraints. In this paper we demonstrate the FiLiPo (Finding Linkage Points) system, which automatically finds connections (i.e., linkage points) between Web APIs and local Knowledge Bases in a reasonable amount of time. To this end, we developed a sample-driven schema matching system that models Web API services as parameterized queries. These Web API services return a view definition of their data, which subsequently needs to be connected to the local database schema. Furthermore, our approach is able to find valid input values for Web API services automatically (e.g., IDs) and can determine combined linkage points (e.g., first and last name) despite different structures. Our results on six real-world API services with two local databases show that our linkage point detection algorithm performs well in terms of precision (0.89 up to 1.0) and recall (0.69 up to 1.0).
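
The sample-driven idea in the abstract can be illustrated with a minimal Python sketch. It is not the FiLiPo implementation: the function names (`flatten`, `find_linkage_points`), the mock API, the sample records, and the `min_support` threshold are all hypothetical, and exact string equality after normalization stands in for whatever similarity measure a real system would use. The sketch treats the API as a parameterized query (one input value in, one nested response out), calls it for a few sampled local records, and keeps the (local relation, response path) pairs whose values agree often enough. Combined linkage points such as first plus last name are not covered.

```python
from collections import Counter


def flatten(obj, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested JSON-like object."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(obj, list):
        for item in obj:
            yield from flatten(item, f"{prefix}[*]")
    else:
        yield prefix, str(obj)


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(str(value).lower().split())


def find_linkage_points(samples, call_api, input_relation, min_support=0.5):
    """Return {(local_relation, api_path): support} for pairs that match
    in at least `min_support` of the samples the API answered."""
    votes = Counter()
    answered = 0
    for record in samples:
        # The API is modeled as a parameterized query over one input value.
        response = call_api(record[input_relation])
        if not response:
            continue  # invalid input value for this API, skip the sample
        answered += 1
        paths = list(flatten(response))
        matched = set()
        for relation, local_value in record.items():
            for path, api_value in paths:
                if normalize(local_value) == normalize(api_value):
                    matched.add((relation, path))
        votes.update(matched)  # each sample votes at most once per pair
    if answered == 0:
        return {}
    return {
        pair: count / answered
        for pair, count in votes.items()
        if count / answered >= min_support
    }


# Hypothetical local records and a mock API; no real service is contacted.
LOCAL_KB = [
    {"doi": "10.1/a", "title": "Graph Matching", "year": "2019"},
    {"doi": "10.1/b", "title": "Schema Mapping", "year": "2020"},
    {"doi": "10.1/c", "title": "Data Integration", "year": "2020"},
]
MOCK_API = {
    "10.1/a": {"message": {"DOI": "10.1/a", "name": "graph matching", "issued": 2019}},
    "10.1/b": {"message": {"DOI": "10.1/b", "name": "Schema  Mapping", "issued": 2020}},
}


def mock_call(doi):
    """Stand-in for a Web API request; returns None for unknown inputs."""
    return MOCK_API.get(doi)


links = find_linkage_points(LOCAL_KB, mock_call, "doi")
for (relation, path), support in sorted(links.items()):
    print(f"{relation} <-> {path} (support {support:.2f})")
```

Working on a small sample rather than the full Knowledge Base keeps the number of API calls low, which matters given the latency and access restrictions the abstract mentions.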

The video of this talk cannot be embedded. You can watch it here:
https://dl.acm.org/doi/10.1145/3340531.3417438#sec-supp
Talk and the respective paper are published at the CIKM 2020 virtual conference.
