Linking Graph Entities with Multiplicity and Provenance
Entity linking is a fundamental database problem with applications in data integration, data cleansing, information retrieval, knowledge fusion, and knowledge-base population. It is the task of accurately identifying multiple, differing, and possibly contradicting representations of the same real-world entity in data. In this work, we propose an entity linking system capable of linking entities across different databases and mentioned entities extracted from text data. Our entity linking solution, called Certus, uses a graph model to represent the profiles of entities. The graph model is versatile: it can handle multiple values for an attribute or a relationship, as well as the provenance descriptions of those values. The provenance description of a value provides the settings in which the value holds, such as validity periods, sources, and security requirements. This paper presents the architecture of the entity linking system, the logical, physical, and indexing models used in the system, and the general linking process. Furthermore, we demonstrate the performance of update operations on the physical storage models when the system is implemented in two state-of-the-art database management systems, HBase and Postgres.
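To make the idea of a profile with multi-valued attributes and provenance concrete, the following is a minimal Python sketch. It is not the Certus data model: the class names (`Provenance`, `EntityProfile`), the field names, and the sample values are all hypothetical and chosen only to illustrate that several values of one attribute or relationship can coexist, each carrying its own provenance description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record: the settings under which a value holds."""
    source: str
    valid_from: Optional[str] = None  # ISO date strings, illustrative only
    valid_to: Optional[str] = None
    security: Optional[str] = None


@dataclass
class EntityProfile:
    """Hypothetical entity profile; not the schema used by Certus."""
    entity_id: str
    # attribute name -> list of (value, provenance) pairs; several values may coexist
    attributes: dict = field(default_factory=dict)
    # relationship name -> list of (target entity id, provenance) pairs
    relationships: dict = field(default_factory=dict)

    def add_attribute(self, name, value, provenance):
        self.attributes.setdefault(name, []).append((value, provenance))

    def add_relationship(self, name, target_id, provenance):
        self.relationships.setdefault(name, []).append((target_id, provenance))

    def values(self, name):
        """Return every recorded value of an attribute, in insertion order."""
        return [value for value, _ in self.attributes.get(name, [])]


# Two addresses for the same entity, from different sources and validity periods.
profile = EntityProfile("person-1")
profile.add_attribute("address", "12 Oak St", Provenance("crm_db", "2010-01-01", "2015-06-30"))
profile.add_attribute("address", "7 Elm Ave", Provenance("news_text", "2015-07-01"))
profile.add_relationship("works_for", "org-9", Provenance("crm_db"))
```

Keeping each value paired with its own provenance record, rather than overwriting a single attribute slot, is what allows contradicting representations to be stored side by side until the linking process resolves them.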