The Open Cybernetics & Systemics Journal

2014, 8 : 462-467
Published online 2014 December 31. DOI: 10.2174/1874110X01408010462
Publisher ID: TOCSJ-8-462

A Traceable Data Fusion Based on Data Provenance

Zhao Qiang, Zhang Yongxin, Wang Dequan and Ding Yanhui
School of Mathematical Sciences, Shandong Normal University, Jinan 250014, China.

ABSTRACT

Data fusion is an active topic in data integration and comprises at least two stages: entity resolution and data conflict resolution. However, existing fusion processes are opaque to users, and the fusion stages are treated in isolation. In this paper, we propose a traceable data fusion mechanism based on data provenance that can trace both the data sources of fusion results and the evolutionary process that produced them. The mechanism mainly targets the entity resolution and data conflict resolution stages. We represent the provenance of data origin using PI-CS, which is more accurate because it can record the intermediate process of data evolution. To record the evolution process of data fusion, we propose two transformation provenances: entity resolution provenance and data conflict resolution provenance, which record the evolution process of entity resolution and data conflict resolution, respectively. Finally, we give an example to validate the applicability of the traceable mechanism for data fusion.

Keywords:

Data conflict resolution, data fusion, data provenance, entity resolution.
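
The two-stage traceable fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not model PI-CS. It assumes exact matching on a normalized key as a stand-in for entity resolution and majority voting as a stand-in for conflict resolution. All function names, record identifiers, and sample values are hypothetical.

```python
from collections import Counter, defaultdict


def resolve_entities(records, key):
    """Cluster records whose normalized `key` value matches.

    Assumption: exact match on a lowercased, stripped key stands in for
    a real entity resolution algorithm.
    """
    clusters = defaultdict(list)
    for rec in records:
        clusters[rec["attrs"][key].strip().lower()].append(rec)
    return dict(clusters)


def resolve_conflicts(cluster):
    """Pick one value per attribute by majority vote and record why.

    Assumption: majority vote stands in for a real conflict resolution rule.
    """
    fused, provenance = {}, {}
    attributes = sorted({a for rec in cluster for a in rec["attrs"]})
    for attr in attributes:
        candidates = [(rec["attrs"][attr], rec["id"])
                      for rec in cluster if attr in rec["attrs"]]
        winner, _ = Counter(v for v, _ in candidates).most_common(1)[0]
        fused[attr] = winner
        provenance[attr] = {
            "rule": "majority_vote",
            "supporting": [rid for v, rid in candidates if v == winner],
            "rejected": {rid: v for v, rid in candidates if v != winner},
        }
    return fused, provenance


def fuse(records, key):
    """Run both stages and attach a provenance trail to every fused result."""
    results = []
    for entity_key, cluster in resolve_entities(records, key).items():
        fused, conflict_prov = resolve_conflicts(cluster)
        results.append({
            "entity": entity_key,
            "fused": fused,
            # Which source records were merged into this entity.
            "entity_resolution_provenance": [rec["id"] for rec in cluster],
            # Which records supported or lost each attribute decision.
            "conflict_resolution_provenance": conflict_prov,
        })
    return results


# Hypothetical records from three sources, identified as "source:row".
records = [
    {"id": "s1:1", "attrs": {"name": "Alice Wong", "city": "Jinan"}},
    {"id": "s2:7", "attrs": {"name": "alice wong ", "city": "Jinan"}},
    {"id": "s3:4", "attrs": {"name": "Alice Wong", "city": "Qingdao"}},
    {"id": "s1:2", "attrs": {"name": "Bob Li", "city": "Beijing"}},
]
results = fuse(records, key="name")
```

Each fused record carries two trails: one listing the source records merged during entity resolution, and one recording, per attribute, which records supported the chosen value and which were overruled. Keeping the trails separate mirrors the distinction between the two transformation provenances named in the abstract.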