SSIS:MatchUp:Matchcodes:Blank Field Mapping: Difference between revisions
Created page with "← SSIS Reference {| class="mw-collapsible" cellspacing="2" style="background-color:#f9f9f9; border:1px solid #aaaaaa; font-size:9pt; color:#0645ad; pa..." |
No edit summary |
||
(One intermediate revision by the same user not shown) | |||
Line 1: | Line 1: | ||
{{SSISMatchUpNav | |||
|MatchcodesCollapse= | |||
{ | }} | ||
{ | |||
| | |||
Latest revision as of 00:19, 14 November 2015
← SSIS:Data Quality Components
Blank Field Matching
This needs a special discussion, as its importance is often overlooked. If this property is on, then the absence of data in both records would indicate a match. If this property is off, then two records with missing data, but matching in every other way, will not match. It would be reasonable to wonder if you would not want this behavior. However, it is undesirable in certain situations. Take the following matchcode (paying attention to the Blank column):
Component | Size | Blank | 1 | 2 |
---|---|---|---|---|
ZIP/PC | 5 | Yes | X | X |
Last Name | 5 | Yes | X | X |
Street # | 4 | Yes | X | |
Street Name | 4 | Yes | X | |
PO Box | 10 | Yes | X |
As described above, this produces the following combinations:
- Condition #1
- ZIP/PC, Last Name, Street Number, Street Name
- Condition #2
- ZIP/PC, Last Name, PO Box
For this example, take the following records:
Name: | Joe Smith | Suzi Smith |
---|---|---|
Address: | 326 Main Street | 405 Main St |
City/State/PC: | Pembroke MA 02066 | Pembroke, MA 02066 |
The following matchcode keys would be generated:
Cond# | ZIP/PC | Last Name | Street # | Street Name | PO Box |
---|---|---|---|---|---|
1 | 02066 | SMITH | 326 | MAIN | |
2 | 02066 | SMITH | 405 | MAIN |
According to these matchcode keys, it is clear that these two records do not satisfy condition #1. But because blank field matching is selected, they do satisfy condition #2. The Zip/PC, Last Name, and PO Box are exactly the same. Therefore, the two records do match.
Obviously, this is not the correct result. Making one change to the matchcode:
Component | Size | Blank | 1 | 2 |
---|---|---|---|---|
ZIP/PC | 5 | Yes | X | X |
Last Name | 5 | Yes | X | X |
Street # | 4 | Yes | X | |
Street Name | 4 | Yes | X | |
PO Box | 10 | No | X |
The same comparison is done for combination #2, but the match is disallowed this time because the matchcode now indicates that missing (blank) information is not allowed to figure in the matching condition.
Looking at another example (using the same matchcode):
Name: | Joe Smith | Suzi Smith |
---|---|---|
Address: | PO Box 123 | PO Box 456 |
City/State/PC: | Pembroke MA 02066 | Pembroke, MA 02066 |
This pairing produces the following matchcode keys:
Cond# | ZIP/PC | Last Name | Street # | Street Name | PO Box |
---|---|---|---|---|---|
1 | 02066 | SMITH | 123 | ||
2 | 02066 | SMITH | 456 |
This record has the same problem as before, but this time combination #1 is the cause.
An even better matchcode would be:
Component | Size | Blank | 1 | 2 |
---|---|---|---|---|
ZIP/PC | 5 | Yes | X | X |
Last Name | 5 | Yes | X | X |
Street # | 4 | No | X | |
Street Name | 4 | No | X | |
PO Box | 10 | No | X |
This is one matchcode that works well. There is one more possible tweak, however: Turn on Both Blank Fields for the Street # component. Occasionally, MatchUP Object may encounter records such as:
Name: | Joe Notarangello | Suzi Notarangello |
---|---|---|
Address: | Oceanfront Estates | Oceanfront Est. |
City/State/PC: | Pembroke, MA 02066 | Pembroke, MA 02066 |
This reflects a trend in up-scale neighborhoods, where neither street address has a Street # component, though it is very likely these records should match.
So this new improved matchcode will account for these situations:
Component | Size | Blank | 1 | 2 |
---|---|---|---|---|
ZIP/PC | 5 | Yes | X | X |
Last Name | 5 | Yes | X | X |
Street # | 4 | Yes | X | |
Street Name | 4 | No | X | |
PO Box | 10 | No | X |