SSIS:MatchUp:Matchcodes:Blank Field Mapping

From Melissa Data Wiki
Revision as of 00:19, 14 November 2015 by Admin (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

← SSIS:Data Quality Components

MatchUp Navigation
Overview
Editions
Tutorial
Advanced Configuration
On-Premise
MatchUp Tabs
Matchcode
Field Mapping
Options
Golden Record
Custom Expression Elements
Survivorship/Pass-Through
Lookup Pass-Through Columns
Output Filter
Matchcode Editor
Matchcode Evaluation
Matchcode List
Component List
Component Properties
Algorithms
Matchcodes Overview
Component Combinations
Blank Field Mapping
Mapping
Optimization
Swap Matching
Result Codes
Result Codes



Blank Field Matching

This needs a special discussion, as its importance is often overlooked. If this property is on, then the absence of data in both records would indicate a match. If this property is off, then two records with missing data, but matching in every other way, will not match. It would be reasonable to wonder if you would not want this behavior. However, it is undesirable in certain situations. Take the following matchcode (paying attention to the Blank column):

Component Size Blank 1 2
ZIP/PC 5 Yes X X
Last Name 5 Yes X X
Street # 4 Yes X
Street Name 4 Yes X
PO Box 10 Yes X

As described above, this produces the following combinations:

  • Condition #1
ZIP/PC, Last Name, Street Number, Street Name
  • Condition #2
ZIP/PC, Last Name, PO Box

For this example, take the following records:

Name: Joe Smith Suzi Smith
Address: 326 Main Street 405 Main St
City/State/PC: Pembroke MA 02066 Pembroke, MA 02066

The following matchcode keys would be generated:

Cond# ZIP/PC Last Name Street # Street Name PO Box
1 02066 SMITH 326 MAIN
2 02066 SMITH 405 MAIN

According to these matchcode keys, it is clear that these two records do not satisfy condition #1. But because blank field matching is selected, they do satisfy condition #2. The Zip/PC, Last Name, and PO Box are exactly the same. Therefore, the two records do match.


Obviously, this is not the correct result. Making one change to the matchcode:

Component Size Blank 1 2
ZIP/PC 5 Yes X X
Last Name 5 Yes X X
Street # 4 Yes X
Street Name 4 Yes X
PO Box 10 No X

The same comparison is done for combination #2, but the match is disallowed this time because the matchcode now indicates that missing (blank) information is not allowed to figure in the matching condition.


Looking at another example (using the same matchcode):

Name: Joe Smith Suzi Smith
Address: PO Box 123 PO Box 456
City/State/PC: Pembroke MA 02066 Pembroke, MA 02066

This pairing produces the following matchcode keys:

Cond# ZIP/PC Last Name Street # Street Name PO Box
1 02066 SMITH 123
2 02066 SMITH 456

This record has the same problem as before, but this time combination #1 is the cause.


An even better matchcode would be:

Component Size Blank 1 2
ZIP/PC 5 Yes X X
Last Name 5 Yes X X
Street # 4 No X
Street Name 4 No X
PO Box 10 No X

This is one matchcode that works well. There is one more possible tweak, however: Turn on Both Blank Fields for the Street # component. Occasionally, MatchUP Object may encounter records such as:

Name: Joe Notarangello Suzi Notarangello
Address: Oceanfront Estates Oceanfront Est.
City/State/PC: Pembroke, MA 02066 Pembroke, MA 02066

This reflects a trend in up-scale neighborhoods, where neither street address has a Street # component, though it is very likely these records should match.


So this new improved matchcode will account for these situations:

Component Size Blank 1 2
ZIP/PC 5 Yes X X
Last Name 5 Yes X X
Street # 4 Yes X
Street Name 4 No X
PO Box 10 No X