MatchUp Object:Read/Write Deduping
Read/Write deduping is usually used for processing entire lists. It works in a manner similar to the way that the MatchUp software products does. A calling program passes an entire list to the Read/Write deduping engine one record at a time. When the entire list has been passed, the calling program tells the API to process the records. Then, the calling program retrieves each record, along with additional deduplication information, from the Read/Write deduper.
Read/Write deduping consists of the following steps:
- One by one, the program sends a series of record data (ZIP/PC, Name, Address, etc.) to the MatchUp API.
- When completely done (1), the program sends a “process” command to the API.
- The program retrieves the results for each record with deduplication information.
Order of Output Records
The program will send records in a particular sequence, either in record (raw) order, or maybe in a more sophisticated manner (by ZIP/PC, record type, and so on). MatchUp Object will not return the records in the same order. By default, records are output in cluster order. This order will be loosely based on the matchcode. For example, if the matchcode has Zip5 as its first component, output records will be more or less sorted by ZIP Code (but the developer should not count on this). If the application called the SetGroupSorting function, records in the same dupe group will be adjacent. Otherwise, duplicate records may or may not be adjacent (though they usually are near each other).
If a certain sequence is important (for example, records ordered in the same sequence they were input), sort the results after MatchUp Object has processed the data.
A Read/Write deduping session is relatively short-lived. Although the actual action of reading and writing records may take time (hours or days), the process is strictly defined into three distinct steps. The key file does not persist beyond this point. Because of this, Read/Write deduping is not usually the choice for ongoing or online processes.
Because MatchUp Object does not read or write directly to the database, some mechanism must be provided so that the application can match each record back to the original data source. The SetUserInfo function allows the application to pass an unique identifier for each record.
Read/Write Order of Operations
Using the Read/Write deduper is pretty straight forward. This section will outline the basic steps and then show an example of the programming logic for a typical implementation of the Read/Write deduper.
- Initialize the Read/Write deduper.
- Create field mappings.
- Read the records from the database.
- Build a match key for each record.
- Write each match key to the key file.
- Process the keys.
- Loop through the records and read the deduping data for each one.