MatchUp Object:Incremental Deduping
Latest revision as of 18:43, 25 November 2014
Incremental deduping is usually used for real-time data entry validation, for example in a call center data-entry system where an operator wants to determine whether the caller is an existing customer. At any time, a calling program can pass the contents of a record to the incremental deduping engine; the engine then reports whether the record is a duplicate and, if so, which record or records it matches.
Incremental deduping consists of the following steps:
- The program processes a record and sends the relevant fields (ZIP/Postal Code, Name, Address, etc.) to MatchUp Object.
- MatchUp Object compares that record against the records previously sent to the API and reports whether it matches any of them.
- Optionally, the application can tell MatchUp Object to add this record to its database for consideration in future comparisons.
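The three steps above can be sketched with a small in-memory stand-in. This is only an illustration of the check-then-optionally-add pattern: `ToyIncrementalDeduper`, its `match` and `add` methods, and the field names are hypothetical and are not the MatchUp Object API, and the exact-match key used here is far simpler than a real matching rule.

```python
class ToyIncrementalDeduper:
    """Hypothetical stand-in for an incremental deduping engine (not the MatchUp Object API)."""

    FIELDS = ("zip", "name", "address")  # assumed field names, for illustration only

    def __init__(self):
        self._history = {}   # match key -> ids of previously added records
        self._next_id = 0

    @classmethod
    def _key(cls, record):
        # Normalize case and whitespace so trivially different spellings compare equal.
        return tuple(" ".join(str(record.get(f, "")).upper().split()) for f in cls.FIELDS)

    def match(self, record):
        """Step 2: return the ids of previously added records that match this one."""
        return list(self._history.get(self._key(record), []))

    def add(self, record):
        """Step 3 (optional): remember this record for future comparisons."""
        record_id = self._next_id
        self._next_id += 1
        self._history.setdefault(self._key(record), []).append(record_id)
        return record_id


deduper = ToyIncrementalDeduper()
caller = {"zip": "92688", "name": "John Smith", "address": "1 Main St"}

first_check = deduper.match(caller)      # nothing in the history yet
caller_id = deduper.add(caller)          # keep it for future comparisons
second_check = deduper.match({"zip": "92688", "name": "john  smith", "address": "1 main st"})
```

In this sketch the first check finds no matches, and the second check reports the id returned by `add`, because both records normalize to the same key.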
The Historical Database
The incremental deduping engine relies heavily on a historical database that it maintains. This database can persist for as long as necessary (seconds, days, or even years). MatchUp Object constructs and maintains the database so that it can quickly determine whether an incoming record matches any record it has already seen.
Multi-User/Multi-Thread Considerations
Incremental deduping is unique in that multiple users or multiple processes can access the same historical database simultaneously. The API maintains a locking system to ensure that competing processes don't collide. For two processes to share a database in this way, the initialization function for each process must specify the same historical database (a.k.a. “key file”).
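To see why locking matters, consider two workers that check and update one shared history at the same time. The sketch below is an in-process analogy using threads and `threading.Lock`; it does not show how MatchUp Object locks its key file, and `SharedHistory` and `match_or_add` are hypothetical names.

```python
import threading


class SharedHistory:
    """Hypothetical shared store of match keys, guarded by a single lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._keys = set()

    def match_or_add(self, key):
        """Return True if the key was already present; otherwise add it and return False."""
        # The check and the add happen under one lock, so two workers
        # can never both decide that the same key is new.
        with self._lock:
            if key in self._keys:
                return True
            self._keys.add(key)
            return False


def worker(history, keys, dupes):
    for key in keys:
        if history.match_or_add(key):
            dupes.append(key)


history = SharedHistory()
dupes = []
keys = [f"KEY{i}" for i in range(100)]

# Both workers submit the same 100 keys against the same history.
threads = [threading.Thread(target=worker, args=(history, keys, dupes)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever worker reaches a key first adds it, and the other worker sees it as a duplicate, so each key is reported as a duplicate exactly once regardless of how the threads interleave.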
Transaction-Based Processing
The Incremental deduper interface of MatchUp Object offers optional transaction-based operations on the historical database. This lets an application group multiple calls to the AddRecord function into a single transaction, which speeds up the processing of large lists.
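The general idea of grouping many writes into one transaction can be illustrated with Python's standard `sqlite3` module. SQLite is used here purely as a stand-in: the source does not describe the key file's format, and the `history` table, the `add_batch` helper, and the sample keys are assumptions made for this sketch.

```python
import sqlite3

# In-memory database standing in for a historical database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (match_key TEXT)")


def add_batch(conn, match_keys):
    """Write every key in one transaction: all rows commit together, or none do."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO history (match_key) VALUES (?)",
            ((key,) for key in match_keys),
        )


add_batch(conn, ["92688|SMITH|1 MAIN ST", "92688|JONES|2 OAK AVE"])
count = conn.execute("SELECT COUNT(*) FROM history").fetchone()[0]
```

Committing once per batch rather than once per record avoids repeating the per-commit overhead, which is typically where the time goes when loading a large list.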
Incremental Order of Operations
Using the Incremental deduper is straightforward. This section outlines the basic steps and then shows an example of the programming logic for a typical implementation of the Incremental deduper.
- Initialize the Incremental deduper.
- Create field mappings.
- Read the record from the data source.
- Build a match key for the incoming record.
- Compare the match key to the key file.
- Write new records to the key file.
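As a rough illustration of how these six steps fit together in a processing loop, here is a minimal sketch. It is not written against the MatchUp Object API: the `FIELD_MAPPINGS` table, the `build_key` and `dedupe_stream` helpers, the column names, and the use of a Python `set` in place of the key file are all assumptions made for this example.

```python
import csv
import io

# Step 2: hypothetical field mappings from match-key components to data-source columns.
FIELD_MAPPINGS = {"zip": "PostalCode", "last": "LastName", "addr": "Street"}


def build_key(row, mappings):
    """Step 4: build a normalized match key from the mapped fields."""
    parts = (" ".join(row[column].upper().split()) for column in mappings.values())
    return "|".join(parts)


def dedupe_stream(rows, key_file, mappings):
    """Run steps 3 to 6 for each incoming record; key_file is a set standing in for the key file."""
    results = []
    for row in rows:                        # Step 3: read the record from the data source
        key = build_key(row, mappings)      # Step 4: build a match key
        is_dupe = key in key_file           # Step 5: compare the match key to the key file
        if not is_dupe:
            key_file.add(key)               # Step 6: write new records to the key file
        results.append((row["LastName"], is_dupe))
    return results


source = io.StringIO(
    "PostalCode,LastName,Street\n"
    "92688,Smith,1 Main St\n"
    "92688,Jones,2 Oak Ave\n"
    "92688,SMITH,1  main st\n"
)

key_file = set()                            # Step 1: initialize (an empty key file here)
results = dedupe_stream(csv.DictReader(source), key_file, FIELD_MAPPINGS)
```

Here the third row is flagged as a duplicate of the first, because both normalize to the same key, and only the two distinct keys end up in `key_file`.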