Motivation

With the continuously increasing number of datasets that are published in the Web of Data and form part of the Linked Open Data Cloud, it becomes more and more essential to identify resources that correspond to the same real-world object, in order to interlink web resources and set the basis for large-scale data integration. This requirement becomes apparent in a multitude of domains, ranging from science (marine research, biology, astronomy, pharmacology) to semantic publishing and the cultural domain. In this context, instance matching (also referred to as record linkage [16], duplicate detection [3], entity resolution [2], and object identification in the context of databases [18]) is of crucial importance.

It is therefore essential to develop, along with instance and entity matching systems, benchmarks that determine the weak and strong points of those systems, as well as their overall quality, in order to support users in deciding which system to use for their needs. Hence, well-defined, good-quality benchmarks are important for comparing the performance of the developed instance matching systems.

In this tutorial we aim at:

  • discussing the state-of-the-art instance matching benchmarks
  • presenting the benchmark design principles
  • providing an analysis of the performance results of instance matching systems for the presented benchmarks
  • presenting the research directions that should be explored for the creation of novel benchmarks that answer the needs of the Linked Data paradigm.



Detailed Description

In this tutorial we will present the instance matching benchmarks that have been developed for Semantic Web data. A benchmark is, generally speaking, a set of tests against which the performance of a system is measured. Benchmarks are used not only to inform users of the strengths and weaknesses of systems, but also to encourage technology vendors to address the drawbacks of their systems and to improve their performance and functionality.

Our objective in this tutorial is to:

  • introduce the attendees to the principles of benchmark design for instance matching systems
  • discuss the dimensions of an instance matching benchmark
  • provide a comprehensive overview of the existing instance matching benchmarks with an analysis along the aforementioned dimensions
  • discuss the advantages and disadvantages of the existing benchmarks, as well as the research directions that should be explored for the creation of novel benchmarks that answer the needs of the Linked Data paradigm

Principles of Benchmark Design

In this tutorial we will discuss the principles a benchmark should adhere to, as introduced in the literature; building on these, we will elaborate, in the concluding remarks of our tutorial, on an extended set of principles that can be used for designing new instance matching benchmarks. In particular, we will present the principles that Jim Gray proposed in his book The Benchmark Handbook for Database and Transaction Systems [10] and those presented by Euzenat and Shvaiko in their book Ontology Matching [15].

Classification Dimensions

In order to provide a comprehensive analysis of instance matching benchmarks, we define a set of dimensions along which we will present the overview. The dimensions we consider are the constituents of an instance matching benchmark: the dataset(s), the gold standard, the test cases/workloads, and the performance metrics.

We will distinguish between benchmarks that consider real datasets and workloads and those that produce synthetic datasets and corresponding workloads. For both kinds of benchmarks we will discuss the ontologies employed, their characteristics, and the interconnections between schemas and their instances. In the case of synthetic benchmarks we will focus in particular on the data generators they define. We will discuss the data distributions followed by the generators and how these are obtained, i.e., whether they are randomly generated or whether reference datasets are used.
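To make the notion of a synthetic generator concrete, the sketch below applies two simple value transformations (a typo and an abbreviation) to the property values of a source instance. The function names, transformations and probabilities are purely illustrative assumptions; they do not reproduce the transformation logic of SWING or of any OAEI benchmark.

```python
import random

def introduce_typo(value: str) -> str:
    """Swap two adjacent characters to simulate a misspelling."""
    if len(value) < 2:
        return value
    i = random.randrange(len(value) - 1)
    return value[:i] + value[i + 1] + value[i] + value[i + 2:]

def abbreviate(value: str) -> str:
    """Keep only word initials, e.g. 'Johann Sebastian Bach' -> 'J. S. B.'"""
    return " ".join(word[0] + "." for word in value.split() if word)

TRANSFORMATIONS = [introduce_typo, abbreviate]

def transform_instance(instance: dict, probability: float = 0.5) -> dict:
    """Return a modified copy of an instance (a property -> value map),
    applying a randomly chosen value transformation with the given probability."""
    modified = {}
    for prop, value in instance.items():
        if isinstance(value, str) and random.random() < probability:
            modified[prop] = random.choice(TRANSFORMATIONS)(value)
        else:
            modified[prop] = value
    return modified

# The (original, transformed) pair forms one positive test case and would be
# recorded in the benchmark's gold standard.
source = {"name": "Johann Sebastian Bach", "birthYear": "1685"}
print(transform_instance(source))
```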

Gold standards are used to judge the completeness and soundness of an instance matching approach. We will discuss how they are created and the form in which they are published (sets of matched instances or mapping files). In both cases, the quality (in terms of completeness and correctness) of the gold standards will be discussed thoroughly in our tutorial.
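As a simple illustration of the "sets of matched instances" form, the sketch below loads a hypothetical tab-separated mapping file (one source/target URI pair per line) into a set of pairs; the file layout is an assumption made for the example and does not correspond to the format of any particular benchmark.

```python
from typing import Set, Tuple

def load_gold_standard(path: str) -> Set[Tuple[str, str]]:
    """Read a gold standard published as matched instance pairs.

    Assumed (hypothetical) layout: one line per match, with the source
    and target instance URIs separated by a tab; '#' starts a comment.
    """
    pairs: Set[Tuple[str, str]] = set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            fields = line.split("\t")
            if len(fields) >= 2:
                pairs.add((fields[0], fields[1]))
    return pairs
```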

The workloads an instance matching benchmark proposes take into account the heterogeneities that can be encountered in a data integration scenario. These concern the values of the matched descriptions (e.g., typographical errors or different formats), their structure (e.g., properties that are split, aggregated or deleted), and their logical/semantic organization (e.g., instances asserted under different, semantically related classes).

For each benchmark we will present which of the aforementioned heterogeneities it covers; specifically, for the synthetic benchmarks, we will discuss the transformation logic employed to obtain datasets that exhibit (parts of) these heterogeneities. Last but not least, we will give an overview of the performance metrics that the existing benchmarks employ to quantify the performance of instance matching systems.
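The metrics most commonly reported are precision, recall and F-measure computed against the gold standard; a minimal sketch of their computation, assuming both the system output and the gold standard are given as sets of (source URI, target URI) pairs:

```python
from typing import Dict, Set, Tuple

Match = Tuple[str, str]  # (source instance URI, target instance URI)

def evaluate(system_matches: Set[Match], gold_standard: Set[Match]) -> Dict[str, float]:
    """Precision, recall and F-measure of a system's matches w.r.t. the gold standard."""
    true_positives = len(system_matches & gold_standard)
    precision = true_positives / len(system_matches) if system_matches else 0.0
    recall = true_positives / len(gold_standard) if gold_standard else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall > 0 else 0.0)
    return {"precision": precision, "recall": recall, "f-measure": f_measure}

# Toy example: two of the three gold pairs are found, plus one spurious match.
gold = {("src:a1", "tgt:b1"), ("src:a2", "tgt:b2"), ("src:a3", "tgt:b3")}
found = {("src:a1", "tgt:b1"), ("src:a2", "tgt:b2"), ("src:a4", "tgt:b4")}
print(evaluate(found, gold))  # precision = recall = f-measure ~ 0.67
```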

Benchmark Overview

We will discuss and compare the existing instance matching benchmarks along the aforementioned dimensions in order to derive a complete assessment thereof.

We will analyze the well-known Ontology Alignment Evaluation Initiative (OAEI), which is, at this point, the most popular framework for testing ontology matching and, specifically, instance matching systems. Since 2005, OAEI has organized an annual campaign aiming at evaluating ontology matching solutions and technologies using a fixed set of benchmarks. In 2009, OAEI introduced the Instance Matching (IM) Track, which focuses on the evaluation of different instance matching techniques and tools for Linked Data. We will discuss the benchmarks proposed by the OAEI Instance Matching Tracks of 2009, 2010, 2011 and 2013, and we will present the performance of the systems that were evaluated with those benchmarks [4], [5], [14], [1], [9].

Apart from the OAEI benchmarks, other instance matching benchmarks have also been introduced: ONTOBI [21] is organized in 16 different test cases that take into account simple and complex transformations applied to the reference ontology. The application-oriented benchmark that we will discuss in our tutorial is the Koninklijke Bibliotheek benchmark [13], which was proposed by the National Library of the Netherlands and the Netherlands Institute for Sound and Vision. In addition to these benchmarks, we will describe the instance matching benchmark generator SWING [8], which has been used in a number of OAEI IIMB benchmarks. Last but not least, other individually created benchmarks, such as the Cora dataset [22][23] and others [24], will be discussed.

Audience

This tutorial is aimed at a broad range of attendees, ranging from senior undergraduate and graduate students to more experienced researchers who are unfamiliar with the existing instance matching benchmarks, as well as to scientists, data producers and data consumers in general, whose applications require instance matching. We assume that the audience has a basic knowledge of the standard W3C languages RDF [6], RDFS [11] and OWL [17] and of the SPARQL query language [20], along with their semantics.

References

[1] J. L. Aguirre, K. Eckert, J. Euzenat, A. Ferrara, W. R. van Hage, L. Hollink, C. Meilicke, A. Nikolov, D. Ritze, F. Scharffe, P. Shvaiko, O. Svab-Zamazal, C. Trojahn, E. Jimenez-Ruiz, B. C. Grau, and B. Zapilko. Results of the Ontology Alignment Evaluation Initiative 2012. In OM, 2012.

[2] I. Bhattacharya and L. Getoor. Entity resolution in graphs. Mining Graph Data. Wiley and Sons, 2006.

[3] A. K. Elmagarmid, P. Ipeirotis, and V. Verykios. Duplicate Record Detection: A Survey. IEEE Transactions on Knowledge and Data Engineering, 19(1), 2007.

[4] J. Euzenat, A. Ferrara, L. Hollink, A. Isaac, C. Joslyn, V. Malaise, C. Meilicke, A. Nikolov, J. Pane, M. Sabou, F. Scharffe, P. Shvaiko, V. Spiliopoulos, H. Stuckenschmidt, O. Svab-Zamazal, V. Svatek, C. Trojahn, G. Vouros, and S. Wang. Results of the Ontology Alignment Evaluation Initiative 2009. In OM, 2009.

[5] J. Euzenat, A. Ferrara, C. Meilicke, J. Pane, F. Scharffe, P. Shvaiko, H. Stuckenschmidt, O. Svab-Zamazal, V. Svatek, and C. Trojahn. Results of the Ontology Alignment Evaluation Initiative 2010. In OM, 2010.

[6] F. Manola, E. Miller, and B. McBride. RDF Primer. www.w3.org/TR/rdf-primer, February 2004.

[7] A. Ferrara, D. Lorusso, S. Montanelli, and G. Varese. Towards a Benchmark for Instance Matching. In OM, 2008.

[8] A. Ferrara, S. Montanelli, J. Noessner, and H. Stuckenschmidt. Benchmarking Matching Applications on the Semantic Web. In ESWC, 2011.

[9] B. C. Grau, Z. Dragisic, K. Eckert, J. Euzenat, A. Ferrara, R. Granada, V. Ivanova, E. Jimenez-Ruiz, A. O. Kempf, P. Lambrix, A. Nikolov, H. Paulheim, D. Ritze, F. Scharffe, P. Shvaiko, C. Trojahn, and O. Zamazal. Results of the Ontology Alignment Evaluation Initiative 2013. In OM, 2013.

[10] J. Gray, editor. The Benchmark Handbook for Database and Transaction Systems. Morgan Kaufmann, 1993.

[11] P. Hayes. RDF Semantics. www.w3.org/TR/rdf-mt, February 2004.

[13] A. Isaac, L. van der Meij, S. Schlobach, and S. Wang. An Empirical Study of Instance-Based Ontology Matching. In ISWC/ASWC, 2007.

[14] J. Euzenat, A. Ferrara, W. R. van Hage, L. Hollink, C. Meilicke, A. Nikolov, D. Ritze, F. Scharffe, P. Shvaiko, H. Stuckenschmidt, O. Svab-Zamazal, and C. Trojahn. Results of the Ontology Alignment Evaluation Initiative 2011. In OM, 2011.

[15] J. Euzenat and P. Shvaiko. Ontology Matching. Springer-Verlag, 2007.

[16] C. Li, L. Jin, and S. Mehrotra. Supporting efficient record linkage for large data sets using mapping techniques. In WWW, 2006.

[17] D. L. McGuinness and F. van Harmelen. OWL Web Ontology Language. http://www.w3.org/TR/owl-features/, 2004.

[18] J. Noessner, M. Niepert, C. Meilicke, and H. Stuckenschmidt. Leveraging Terminological Structure for Object Reconciliation. In ESWC, 2010.

[20] E. Prud'hommeaux and A. Seaborne. SPARQL Query Language for RDF. www.w3.org/TR/rdf-sparql-query, January 2008.

[21] K. Zaiss, S. Conrad, and S. Vater. A Benchmark for Testing Instance-Based Ontology Matching Methods. In KMIS, 2010.

[22] R. Isele and C. Bizer. Learning linkage rules using genetic programming. In OM, 2011.

[23] A. Nikolov, V. Uren, E. Motta, and A. de Roeck. Refining instance coreferencing results using belief propagation. In ASWC, 2008.

[24] A. Jentzsch, J. Zhao, O. Hassanzadeh, K.-H. Cheung, M. Samwald, and B. Andersson. Linking open drug data. In Linking Open Data Triplification Challenge, I-SEMANTICS, 2009.