Matchcode Optimization:Blank Matching

From Melissa Data Wiki


Blank Matching

Specifics

Summary

Sometimes it is desirable to create a matchcode that prevents a match when both compared values are present and different, but allows the match when one of the values is blank or an initial abbreviation. The Short/Empty options determine how blank values behave.

Returns

A match if the two values are the same, or if one value is blank or an initial and the configured match conditions allow it.
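The return rule above can be sketched as a small comparison function. This is an illustrative sketch only, not MatchUp's API: the names `blank_match`, `is_initial_of`, `allow_one_blank`, and `allow_initial` are hypothetical stand-ins for the Short/Empty options, and the treatment of two blank values as equal is an assumption of the sketch.

```python
def is_initial_of(short: str, full: str) -> bool:
    """True if `short` is a one-letter initial (optionally with a period) of `full`."""
    short = short.rstrip(".")
    return len(short) == 1 and len(full) > 1 and full.startswith(short)


def blank_match(a: str, b: str, *, allow_one_blank: bool = True,
                allow_initial: bool = True) -> bool:
    """Hypothetical blank/initial comparison; not the MatchUp implementation."""
    a = a.strip().lower()
    b = b.strip().lower()
    if a == b:
        # Same value. Assumption: two blank values also count as the same.
        return True
    if not a or not b:
        # Exactly one value is blank.
        return allow_one_blank
    if allow_initial and (is_initial_of(a, b) or is_initial_of(b, a)):
        return True
    # Both values are present and different.
    return False
```

With both options enabled, `blank_match("John", "J")` and `blank_match("John", "")` return `True`, while `blank_match("John", "Mary")` returns `False`.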

Example Matchcode Usage 1

Example Data 1

FIRST    LAST     RESULT
John     Smith    Match Found
J        Smith    Match Found
(blank)  Smith    Match Found
Mary     Smith    Unique


Example Matchcode Usage 2

Example Data 2

LAST     ADDRESS              RESULT
Smith    12 Main St apt 3     Match Found
Smith    12 Main St           Match Found
Smith    12 Main St apt 2     Unique
Smith    12 Main St           Match Found


In both examples above, the record marked Unique could also have matched one of the other ‘one-blank’ records. However, that short record had already been processed and placed into another duplicate group.
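The effect of processing order can be illustrated with a simplified grouping pass over Example Data 1. This is a sketch under stated assumptions, not MatchUp's actual grouping logic: the helper names are hypothetical, and comparing each incoming record only against the first record of each existing group is an assumption chosen because it reproduces the results shown in the table.

```python
def first_names_match(a: str, b: str) -> bool:
    """Hypothetical rule: equal, one blank, or one is an initial of the other."""
    a, b = a.strip().lower(), b.strip().lower()
    if a == b or not a or not b:
        return True
    short, full = sorted((a, b), key=len)
    return len(short) == 1 and full.startswith(short)


def records_match(r1: tuple, r2: tuple) -> bool:
    """Records are (first, last); last names must match exactly."""
    return r1[1].lower() == r2[1].lower() and first_names_match(r1[0], r2[0])


def group_records(records: list) -> list:
    """Greedy pass: a record joins the first group whose first record it matches."""
    groups = []
    for rec in records:
        for group in groups:
            if records_match(group[0], rec):
                group.append(rec)
                break
        else:
            # No existing group matched, so this record starts a new group.
            groups.append([rec])
    return groups


example_1 = [("John", "Smith"), ("J", "Smith"), ("", "Smith"), ("Mary", "Smith")]
in_order = group_records(example_1)
reordered = group_records([example_1[3], example_1[2], example_1[0], example_1[1]])
```

In the original order, the blank record is grouped with John and Mary is left alone. When Mary is processed first, the blank record joins Mary's group instead, and John and J form a separate group.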


Performance
Scale from Slower to Faster.

Matches
Scale from More Matches to Greater Accuracy.


Recommended Usage

Hybrid deduper, where a single incoming record can quickly be evaluated independently against each record in an existing large master database.

Small batch runs where the actual number of blank values is minimal.
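The hybrid case can be sketched as a lookup in which the incoming record is compared independently against every master record, so no grouping decision has to be made for blank values. This is a minimal illustration, not MatchUp's hybrid deduper interface: `Record`, `unit_matches`, and `find_matches` are hypothetical names, and splitting the address from Example Data 2 into a street and a unit is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    last: str
    street: str
    unit: str = ""  # a blank unit is the value that blank matching tolerates


def unit_matches(a: str, b: str) -> bool:
    """Hypothetical rule: units match if equal or if either one is blank."""
    a, b = a.strip().lower(), b.strip().lower()
    return a == b or not a or not b


def find_matches(incoming: Record, master: list) -> list:
    """Return every master record that matches the incoming record on its own."""
    return [
        m for m in master
        if m.last.lower() == incoming.last.lower()
        and m.street.lower() == incoming.street.lower()
        and unit_matches(m.unit, incoming.unit)
    ]


master = [
    Record("Smith", "12 Main St", "apt 3"),
    Record("Smith", "12 Main St", "apt 2"),
]
blank_hits = find_matches(Record("Smith", "12 Main St"), master)
exact_hits = find_matches(Record("Smith", "12 Main St", "apt 2"), master)
```

Here the incoming record with a blank unit is reported as a match to both master records, while the record with `apt 2` matches only its exact counterpart.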

Not Recommended For

Batch processes in which a large number of records may be grouped together and may contain short or blank values. The examples above show that MatchUp must at some point decide which group a blank value really matches. By default, the order in which incoming records are processed determines whether a record is added to an existing group or starts a new dupe group.