MatchUp Object:Hybrid:Order of Operations
The Hybrid interface offers more flexibility than the other interfaces because it leaves the storage and management of match keys under your control.
Basic Steps
These are the basic steps of a typical implementation of the Hybrid interface.
- Initialize the Hybrid interface.
- After creating an instance of the Hybrid interface, point the object toward its supporting data file, select a matchcode to use, and initialize these files.
- Create field mappings.
- In order to build keys to compare, the Hybrid interface needs to know which types of data the program will be passing to the interface and in what order.
- Build a master list of keys.
- Each record must have a match key so the Hybrid interface can select a cluster of records or check for duplicates. This consists of passing the data used in record comparison from each record to the interface in the same order used when creating a field mapping. After passing the necessary fields (usually a small subset of the fields from each record) via the AddField function, the Hybrid interface uses this information to generate a match key.
- Build a match key for the new address record.
- Repeat the step above to create a match key for the record to be compared against the cluster.
- Build the cluster list.
- Cycle through the master key list and extract only those records where the first part of the match key equals the first part of the new record's match key.
- Compare the match key to the cluster list.
- Loop through the cluster list and compare each key against the new record's key. The CompareKeys function reports whether the two keys match.
Pseudocode Implementation
This is a common implementation of the Hybrid interface, written in pseudocode for clarity. Working sample programs in several programming languages can be found on the MatchUp Object install disc.
Initialize the Hybrid interface
After creating an instance of the Hybrid interface, point the object toward its supporting data file, select a matchcode and key file to use, and initialize these files.
First, create a new instance of the Hybrid interface.
SET mu = NEW mdMUHybrid
In order to successfully initialize this new instance, point it toward its data files and supply a valid License Key. Also, select a matchcode, by name, before initializing.
    CALL mu.SetLicenseString with LicenseString
    CALL mu.SetPathToMatchUpFiles with DataPath
    CALL mu.SetMatchcodeName with MatchcodeName
If all of the above have been set correctly, calling the InitializeDataFiles function should return a ProgramStatus value of ErrorNone. If it does not, call the GetInitializeErrorString function to determine the reason for the failure to initialize.
    CALL mu.InitializeDataFiles RETURNING ProgramStatus
    IF ProgramStatus is not ErrorNone THEN
        CALL mu.GetInitializeErrorString RETURNING ErrorMsg
        Display ErrorMsg
        Exit Routine
    END IF
If the initialization was successful, call the following functions to display version and expiration information about the instance of MatchUp Object currently in use on the local computer.
    PRINT "Confirming Initialization: " + mu.GetInitializeErrorString
    PRINT "Build Number: " + mu.GetBuildNumber
    PRINT "Database Date: " + mu.GetDatabaseDate
    PRINT "Database Expiration Date: " + mu.GetDatabaseExpirationDate
    PRINT "License Expiration Date: " + mu.GetLicenseExpirationDate
Create field mappings
Field mappings define which types of data the Hybrid interface is expecting. For example, a typical matchcode may include a five-digit ZIP Code, a last name, and a street address. The data coming in, however, may contain the city, state, and ZIP as a single character field and the person’s full name as a single field as well.
As long as MatchUp Object knows what kind of data is being passed to it, the object is smart enough to pull what it needs from the data supplied to it.
CALL mu.ClearMappings
After clearing any mappings from a previous use of the Hybrid interface, call the AddMapping function once for each field being considered.
    CALL mu.AddMapping with mu.Zip9
    CALL mu.AddMapping with mu.First
    CALL mu.AddMapping with mu.Last
    CALL mu.AddMapping with mu.Address
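Because fields must later be passed in the same order as the mappings, it can help to keep that order in one place. The following Python sketch is only an illustration: `MAPPING_ORDER`, `fields_in_mapping_order`, and the dictionary keys are hypothetical names chosen for this example and are not part of MatchUp Object.

```python
# Hypothetical names: this tuple mirrors the order of the AddMapping calls above.
MAPPING_ORDER = ("Zip9", "First", "Last", "Address")


def fields_in_mapping_order(record, mapping_order=MAPPING_ORDER):
    """Return the record's values in mapping order, ready to pass one by one."""
    missing = [name for name in mapping_order if name not in record]
    if missing:
        raise KeyError(f"record is missing mapped fields: {missing}")
    return [record[name] for name in mapping_order]


# Made-up sample record; the source columns may arrive in any order.
record = {"Last": "Lee", "Address": "1 Main St", "Zip9": "000010001", "First": "Ann"}
ordered = fields_in_mapping_order(record)
print(ordered)
```

Each value in `ordered` would then be handed to the interface in sequence, so the mapping calls and the field calls cannot drift apart.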
Create Master Key File
Unlike the Incremental and Read-Write interfaces, the Hybrid interface requires the developer to maintain a list of keys for the deduping operation. In this example, the keys are stored in a text file generated on the fly.
Open KeyFile as text file for writing
Each record is read from the database, converted to a match key and written to the text file.
    FOR EACH Record in Database
        Read Zip9, FirstName, LastName, StreetAddress fields from database
        CALL mu.ClearFields
        CALL mu.AddField with Zip9
        CALL mu.AddField with FirstName
        CALL mu.AddField with LastName
        CALL mu.AddField with StreetAddress
        CALL mu.BuildKey
        CALL mu.GetKey RETURNING Key
        Write Key to KeyFile
    NEXT
    Close KeyFile
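The file handling in this step can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `build_key` is a hypothetical stand-in for the ClearFields, AddField, BuildKey, and GetKey sequence, its fixed-width layout is invented for the example, and the sample records are made up. A real match key is produced by MatchUp Object according to the selected matchcode.

```python
import os
import tempfile

# Hypothetical fixed widths for each key component; a real matchcode defines its own layout.
FIELD_WIDTHS = (9, 8, 10, 20)


def build_key(zip9, first, last, address):
    """Toy stand-in for key generation: upper-case, truncate, and pad each field."""
    fields = (zip9, first, last, address)
    return "".join(
        value.strip().upper()[:width].ljust(width)
        for value, width in zip(fields, FIELD_WIDTHS)
    )


def write_master_keys(records, key_path):
    """Write one key per line, in the same order as the source records."""
    with open(key_path, "w", encoding="utf-8") as key_file:
        for record in records:
            key_file.write(build_key(*record) + "\n")


def read_master_keys(key_path):
    """Read the keys back, stripping only the newline so padding is preserved."""
    with open(key_path, encoding="utf-8") as key_file:
        return [line.rstrip("\n") for line in key_file]


# Made-up sample data standing in for the database.
sample_records = [
    ("000010001", "Ann", "Lee", "1 Main St"),
    ("000010001", "Bob", "Ray", "2 Oak Ave"),
    ("000020002", "Cy", "Moss", "3 Elm Rd"),
]

key_path = os.path.join(tempfile.mkdtemp(), "master_keys.txt")
write_master_keys(sample_records, key_path)
master_keys = read_master_keys(key_path)
print(len(master_keys), "keys written")
```

Writing one key per line keeps the key file aligned with the source records, so a line number can serve as a record index when a match is found.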
Create the Match Key for the Input Data
The next step is to take the record that is to be checked and create a match key for it.
    GET Zip9, FirstName, LastName, StreetAddress from data source
    CALL mu.ClearFields
    CALL mu.AddField with Zip9
    CALL mu.AddField with FirstName
    CALL mu.AddField with LastName
    CALL mu.AddField with StreetAddress
    CALL mu.BuildKey
    CALL mu.GetKey RETURNING Key
Create the Cluster List
Use the key generated in the last step to select only those records where the first part of the match key matches the same part of the match key for the record to be checked. The size of the portion of the match key to be checked is determined by the GetClusterSize function.
    CALL mu.GetKeySize RETURNING KeySize
    CALL mu.GetClusterSize RETURNING ClusterSize
    SET ClusterKey = Left part of Key, size = ClusterSize
ClusterKey is a string of length ClusterSize that holds the first part of the input record's match key. Cycle through the key list and write only those records whose keys begin with ClusterKey to a new text file; these records form the cluster.
    Open KeyFile for reading
    FOR EACH Record in KeyFile
        Read MasterKey
        IF First ClusterSize characters of MasterKey = ClusterKey THEN
            ADD Record to Cluster
        END IF
    NEXT
    Close KeyFile
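The prefix filter in this step amounts to a simple string comparison. The Python sketch below illustrates it with made-up keys; `select_cluster` is a hypothetical helper, and `CLUSTER_SIZE` is an assumed value that would come from GetClusterSize in a real program.

```python
def select_cluster(master_keys, input_key, cluster_size):
    """Keep only the master keys whose leading characters equal the input key's."""
    cluster_key = input_key[:cluster_size]
    return [key for key in master_keys if key[:cluster_size] == cluster_key]


# Assumed value for illustration; a real program would ask the interface for it.
CLUSTER_SIZE = 9

# Made-up keys whose first nine characters act as the cluster portion.
master_keys = ["000010001ANN LEE", "000010001BOB RAY", "000020002CY  MOSS"]
input_key = "000010001ANN LEE"

cluster = select_cluster(master_keys, input_key, CLUSTER_SIZE)
print(cluster)
```

Only the keys in `cluster` need the full comparison in the next step, which is what keeps the approach practical for large key lists.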
Check Input Record Against Cluster List
With the cluster list built, check the whole key for the input record against each line of the cluster list, using the CompareKeys function to determine if there was a match.
    FOR EACH Record in Cluster
        Read MasterKey
        CALL mu.CompareKeys with Key, MasterKey RETURNING IsMatch
        IF IsMatch is True THEN
            PRINT Key matches MasterKey
        END IF
    NEXT
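The final loop can be sketched in Python as shown here. This is an illustration only: `compare_keys` is a hypothetical stand-in that tests for exact equality, whereas the real CompareKeys function applies the rules of the selected matchcode and may treat non-identical keys as matches. The keys are made up.

```python
def compare_keys(key_a, key_b):
    """Hypothetical stand-in for CompareKeys: exact equality only."""
    return key_a == key_b


def find_duplicates(input_key, cluster, compare=compare_keys):
    """Return every cluster key that the comparison reports as a match."""
    return [master_key for master_key in cluster if compare(input_key, master_key)]


# Made-up cluster, as produced by the previous step.
cluster = ["000010001ANN LEE", "000010001BOB RAY"]
input_key = "000010001ANN LEE"

duplicates = find_duplicates(input_key, cluster)
for master_key in duplicates:
    print(f"{input_key!r} matches {master_key!r}")
```

Passing the comparison function as a parameter keeps the loop independent of how matching is decided, so the stand-in could be swapped for a call into the actual interface.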