MatchUp Object:Hybrid:Order of Operations

From Melissa Data Wiki

Using the Hybrid interface allows for greater flexibility than the other interfaces, as it gives you more control to handle storage and management of match keys.

Basic Steps

These are the basic steps of a typical implementation of the Hybrid interface.

  1. Initialize the Hybrid interface.
    After creating an instance of the Hybrid interface, point the object toward its supporting data files, select a matchcode to use, and initialize the data files.
  2. Create field mappings.
    In order to build keys to compare, the Hybrid interface needs to know which types of data the program will be passing to the interface and in what order.
  3. Build a master list of keys.
    Each record must have a match key so the Hybrid interface can select a cluster of records or check for duplicates. This consists of passing the data used in record comparison from each record to the interface in the same order used when creating a field mapping. After passing the necessary fields (usually a small subset of the fields from each record) via the AddField function, the Hybrid interface uses this information to generate a match key.
  4. Build a match key for the new address record.
    Repeat the step above to create a match key for the record to be compared against the cluster.
  5. Build the cluster list.
    Cycle through the master key list and extract only those records where the first part of the match key equals the first part of the match key for the new record.
  6. Compare the match key to the cluster list.
    Loop through the cluster list and pass each key, together with the key for the new record, to the CompareKeys function, which indicates whether the two keys match.

Pseudocode Implementation

This is a common implementation of the Hybrid interface, written in pseudocode for clarity. Working sample programs in several programming languages can be found on the MatchUp Object install disc.

Initialize the Hybrid interface

After creating an instance of the Hybrid interface, point the object toward its supporting data files, select a matchcode to use, and initialize the data files.

First, create a new instance of the Hybrid interface.

SET mu = NEW mdMUHybrid

In order to successfully initialize this new instance, point it toward its data files and supply a valid License Key. Also, select a matchcode, by name, before initializing.

CALL mu.SetLicenseString with LicenseString
CALL mu.SetPathToMatchUpFiles with DataPath
CALL mu.SetMatchcodeName with MatchcodeName

If all of the above have been set correctly, calling the InitializeDataFiles function should return a ProgramStatus value of ErrorNone. If it does not, call the GetInitializeErrorString function to determine the reason for the failure to initialize.

CALL mu.InitializeDataFiles RETURNING ProgramStatus

IF ProgramStatus is not ErrorNone THEN
  CALL mu.GetInitializeErrorString RETURNING ErrorMsg
  Display ErrorMsg
  Exit Routine
END IF

If the initialization was successful, call the following functions to display version and expiration information about the instance of MatchUp Object currently in use on the local computer.

PRINT "Confirming Initialization: " + mu.GetInitializeErrorString
PRINT "Build Number: " + mu.GetBuildNumber
PRINT "Database Date: " + mu.GetDatabaseDate
PRINT "Database Expiration Date: " + mu.GetDatabaseExpirationDate
PRINT "License Expiration Date: " + mu.GetLicenseExpirationDate

Create field mappings

Field mappings define which types of data the Hybrid interface is expecting. For example, a typical matchcode may include a five-digit ZIP Code, a last name, and a street address. The data coming in, however, may contain the city, state, and ZIP as a single character field and the person’s full name as a single field as well.

As long as MatchUp Object knows what kind of data is being passed to it, the object extracts the components it needs from the data supplied.

CALL mu.ClearMappings

After clearing any mappings from a previous use of the Hybrid interface, call the AddMapping function once for each field being considered.

CALL mu.AddMapping with mu.Zip9
CALL mu.AddMapping with mu.First
CALL mu.AddMapping with mu.Last
CALL mu.AddMapping with mu.Address
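
Because the fields passed later via AddField must follow the same order as the AddMapping calls, it can help to keep that order in one place. The following Python sketch does not call MatchUp Object; the names MAPPING_ORDER and fields_in_mapping_order, and the dictionary layout of the record, are assumptions made for illustration only.

```python
from typing import Dict, List

# Same order as the AddMapping calls above (Zip9, First, Last, Address).
MAPPING_ORDER = ["Zip9", "First", "Last", "Address"]


def fields_in_mapping_order(record: Dict[str, str]) -> List[str]:
    """Return the record's values in mapping order, ready to pass to AddField."""
    missing = [name for name in MAPPING_ORDER if name not in record]
    if missing:
        raise KeyError(f"record is missing mapped fields: {missing}")
    return [record[name] for name in MAPPING_ORDER]


# Illustrative record; the dictionary order deliberately differs from the mapping order.
record = {
    "Last": "Smith",
    "Zip9": "926882112",
    "Address": "22382 Avenida Empresa",
    "First": "John",
}
ordered = fields_in_mapping_order(record)
```

Each value in ordered would then be passed to AddField in sequence, so the field order cannot drift from the mapping order.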

Create Master Key File

Unlike the Incremental and Read-Write interfaces, the Hybrid interface requires the developer to maintain a list of keys for the deduping operation. In this example, the keys are stored in a text file generated on the fly.

Open KeyFile as text file for writing

Each record is read from the database, converted to a match key and written to the text file.

FOR EACH Record in Database
  Read Zip9, FirstName, LastName, StreetAddress fields from database
  CALL mu.ClearFields
  CALL mu.AddField with Zip9
  CALL mu.AddField with FirstName
  CALL mu.AddField with LastName
  CALL mu.AddField with StreetAddress
  CALL mu.BuildKey
  CALL mu.GetKey RETURNING Key
  Write Key to KeyFile
NEXT

Close KeyFile
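
The loop above can be sketched in Python as follows. This is a minimal sketch, not the MatchUp Object API: toy_build_key is a hypothetical stand-in for the ClearFields, AddField, BuildKey, and GetKey sequence, and its fixed-width layout is invented for illustration. The real key layout is determined by the matchcode.

```python
import os
import tempfile
from typing import Iterable, Sequence


def toy_build_key(fields: Sequence[str]) -> str:
    """Hypothetical stand-in for ClearFields/AddField/BuildKey/GetKey.

    Upper-cases each field and pads or truncates it to 10 characters.
    """
    return "".join(f.upper()[:10].ljust(10) for f in fields)


def write_key_file(records: Iterable[Sequence[str]], path: str) -> int:
    """Write one key per line and return the number of keys written."""
    count = 0
    with open(path, "w", encoding="utf-8") as key_file:
        for record in records:
            key_file.write(toy_build_key(record) + "\n")
            count += 1
    return count


# Illustrative records: Zip9, FirstName, LastName, StreetAddress.
records = [
    ("926882112", "John", "Smith", "22382 Avenida Empresa"),
    ("021390001", "Mary", "Jones", "77 Main St"),
]
key_path = os.path.join(tempfile.mkdtemp(), "keys.txt")
written = write_key_file(records, key_path)
```

Storing one key per line keeps the later cluster scan simple, at the cost of reading the whole file for every lookup.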

Create the Match Key for the Input Data

The next step is to take the record that is to be checked and create a match key for it.

GET Zip9, FirstName, LastName, StreetAddress from data source

CALL mu.ClearFields
CALL mu.AddField with Zip9
CALL mu.AddField with FirstName
CALL mu.AddField with LastName
CALL mu.AddField with StreetAddress
CALL mu.BuildKey
CALL mu.GetKey RETURNING Key

Create the Cluster List

Use the key generated in the last step to select only those records where the first part of the match key matches the same part of the match key for the record to be checked. The size of the portion of the match key to be checked is determined by the GetClusterSize function.

CALL mu.GetKeySize RETURNING KeySize
CALL mu.GetClusterSize RETURNING ClusterSize
SET ClusterKey = Left part of Key, size = ClusterSize

ClusterKey is a string whose length equals ClusterSize; it holds the first part of the match key for the input record. Cycle through the key list and write only those keys whose first ClusterSize characters equal ClusterKey to a new text file, which becomes the cluster list.

Open KeyFile for reading

FOR EACH Record in KeyFile
  Read MasterKey
  IF First ClusterSize characters of MasterKey = ClusterKey THEN
    ADD Record to Cluster
  END IF
NEXT

Close KeyFile
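
The cluster selection is a plain prefix filter, so it can be sketched without the MatchUp library. The Python below assumes the cluster size has already been obtained from GetClusterSize; the function name select_cluster and the sample keys are made up for illustration and do not reflect the real key format.

```python
from typing import Iterable, List


def select_cluster(master_keys: Iterable[str], key: str, cluster_size: int) -> List[str]:
    """Return the master keys whose first cluster_size characters match those of key."""
    cluster_key = key[:cluster_size]
    return [k for k in master_keys if k[:cluster_size] == cluster_key]


# Made-up keys for illustration only; real match keys come from GetKey.
master = ["92688SMITH   22382", "92688JONES   100  ", "02139SMITH   77   "]
cluster = select_cluster(master, "92688SMYTH   22382", cluster_size=5)
print(cluster)
```

When the keys are read from a text file, strip the trailing newline from each line before filtering.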

Check Input Record Against Cluster List

With the cluster list built, check the whole key for the input record against each line of the cluster list, using the CompareKeys function to determine if there was a match.

FOR EACH Record in Cluster
  Read MasterKey
  CALL mu.CompareKeys with Key, MasterKey RETURNING IsMatch

  IF IsMatch is True THEN
    PRINT Key matches MasterKey
  END IF
NEXT
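
The comparison loop can be sketched in Python with the comparison passed in as a function. The names find_matches and exact_compare are hypothetical. exact_compare is only a placeholder that tests string equality, whereas the real CompareKeys function applies the rules of the selected matchcode.

```python
from typing import Callable, Iterable, List


def find_matches(
    cluster: Iterable[str],
    key: str,
    compare: Callable[[str, str], bool],
) -> List[str]:
    """Return every cluster key that compare() reports as matching key."""
    return [master_key for master_key in cluster if compare(key, master_key)]


def exact_compare(a: str, b: str) -> bool:
    # Placeholder only: the real CompareKeys applies the matchcode's rules.
    return a == b


# Made-up keys for illustration only.
cluster = ["92688SMITH", "92688JONES"]
matches = find_matches(cluster, "92688SMITH", exact_compare)
print(matches)
```

In real use, the compare argument would wrap the call to CompareKeys on the initialized Hybrid instance.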