An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows.
Updated Sep 30, 2024 - Python
Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding records in a dataset that refer to the same entity across different data sources (e.g., data files, books, websites, and databases). It is necessary when joining datasets on entities that may or may not share a common identifier (e.g., database key, URI, national identification number); a shared identifier may be missing because of differences in record shape, storage location, or curator style or preference.
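As a rough illustration of matching records that share no common identifier, here is a minimal sketch using only Python's standard library. The two sources, the field names, the `match` helper, and the 0.8 threshold are all hypothetical, and `difflib.SequenceMatcher` stands in for whatever similarity measure a real workflow would use. It is not the method of any project listed here.

```python
from difflib import SequenceMatcher
from itertools import product

# Two hypothetical sources describing people, with no shared key.
source_a = [
    {"id": "a1", "name": "Jonathan Smith", "city": "Boston"},
    {"id": "a2", "name": "Maria Garcia", "city": "Madrid"},
]
source_b = [
    {"id": "b1", "name": "Jon Smith", "city": "Boston"},
    {"id": "b2", "name": "Mario Rossi", "city": "Rome"},
]


def similarity(x: str, y: str) -> float:
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def match(left, right, threshold=0.8):
    """Compare every cross-source pair and keep those scoring at or above threshold.

    The score is the unweighted mean of name and city similarity
    (an assumption chosen for illustration).
    """
    pairs = []
    for a, b in product(left, right):
        score = (similarity(a["name"], b["name"])
                 + similarity(a["city"], b["city"])) / 2
        if score >= threshold:
            pairs.append((a["id"], b["id"], score))
    return pairs


matches = match(source_a, source_b)
for left_id, right_id, score in matches:
    print(left_id, right_id, f"{score:.2f}")
```

With this toy data, only the `a1`/`b1` pair clears the threshold. Comparing every pair scales quadratically with the number of records, which is why practical systems usually add a blocking step to limit the candidate pairs before scoring them.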
WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing, schema matching, identity resolution, data fusion, and result evaluation.
A Winner-Take-All Hashing-Based Unsupervised Model for Entity Resolution Problems. [B. Sc. Thesis]
🧱 blocking methods for entity resolution
Experimental code for author name and affiliation linking/disambiguation
BibLinkCreator Java library
LIMES linking for data integration (D5.2)
Created by Halbert L. Dunn
Released 1946