SSIS:MatchUp:Matchcodes:Mapping
Matchcode Mapping
Matchcodes deal with the abstract. The components in a matchcode represent specific types of data, but they aren't directly linked to the fields in databases. Mapping creates the link between the data and the matchcode.
For example, take the following matchcode:
Component | Size | Fuzzy | 1 |
---|---|---|---|
ZIP5 | 5 | No | X |
Last Name | 5 | No | X |
First Name | 5 | No | X |
Company | 10 | No | X |
Add a database that contains the following fields:
NAME: | Contains full names ("Mr. John Smith"). |
---|---|
COMPANY: | Contains company names ("Melissa Data"). |
ADD1: | Contains first (primary) address line ("22382 Avenida Empresa"). |
ADD2: | Contains second (secondary) address line ("Suite 34"). |
CSZ: | Contains City/State/Zip ("Rancho Santa Margarita, CA 92688"). |
An application must create a link between a database's fields (Name, Company, Add1, Add2 and CSZ) and the matchcode components (Zip5, Last Name, First Name, Company).
With the example above, it may appear that the application will have to contain extensive splitting routines. This is not the case. All that is necessary is to tell MatchUp what type of data is in a specific field and the format of that data.
In the example above, an application would use the following matchcode mapping:
Component | Database Field | Matchcode Mapping |
---|---|---|
ZIP5 | CSZ | CityStZip |
Last Name | NAME | FullName |
First Name | NAME | FullName |
Company | COMPANY | Company |
This mapping tells MatchUP that the 5-digit ZIP Code information is in a field named "CSZ" which is described as a field containing city, state, and ZIP Code information. The Last Name can be found in a field called "NAME" and is described as a full name field (which is a full name sequenced: Pre, FN, MN, LN, Suf).
Five Mapping Rules
Matchcode mappings follow five rules:
- For every Matchcode Component, the application must specify a mapping. The only exception is described in rule 2.
- Actual Address components names (such as Street Number, Street Pre-Directional, Street Name, Street Suffix, Street Post-Directional, PO Box, and Street Secondary) are not listed for mapping purposes. Instead, the names Address Line 1, Address Line 2, and Address Line 3 are used. The example below used four address components in the matchcode (Street #, Street Name, Street Secondary, PO Box). However, it only used two address lines.
- If a matchcode uses any address components, Address Lines 1-3 will be listed after all other components regardless of where the address component appears in the matchcode. In the following example, the address components are listed before company in the matchcode, but Address Lines 1-2 are listed at the end (after company).
- If a matchcode uses address components, Address Lines 1-3 will require at least one line to be mapped, but not all. If a database only has one address field, an application will only need to map Address 1 to that field. All other components must be mapped.
- Address Lines should be mapped from the top down (Address Line1, then 2, then 3).
Enhancing the matchcode in the previous example:
Component | Size | Fuzzy | 1 | 2 |
---|---|---|---|---|
ZIP5 | 5 | No | X | |
Last Name | 5 | No | X | |
First Name | 5 | No | X | |
Street # | 5 | No | X | |
Street Name | 5 | No | X | |
Street Secondary | 12 | No | X | |
PO Box | 10 | No | X | |
Company | 10 | No | X | X |
Again, MatchUP doesn't use the individual address components. They are replaced with Address 1, Address 2, and Address 3. So, the application would use the following Matchcode Mapping:
Matchcode Component | Database Field | Matchcode Mapping |
---|---|---|
ZIP9 | CSZ | CityStZip |
Last Name | NAME | FullName |
First Name | NAME | FullName |
Company | COMPANY | Company |
Address Line 1 | ADD1 | Address |
Address Line 2 | ADD2 | Address |
Address Line 3 | (none) |
Note on Rule #1:
If a database does not contain a field for information called for by a component in a matchcode, such as company field in the above example, then that matchcode should not be used to dedupe that database.
Use a different matchcode or modify an existing matchcode, as outlined later in this chapter.
However, if a matchcode calls for last name, for example, and the database only has full name, then simply map the full name field to the last name and MatchUP Object will handle parsing the field.