MatchUp Object:Matchcode Mapping

From Melissa Data Wiki

Latest revision as of 18:06, 23 November 2015



Matchcodes deal with the abstract. The components in a matchcode represent specific types of data, but they aren't directly linked to the fields in databases. Mapping creates the link between the data and the matchcode.

For example, take the following matchcode:

Component   Size  Fuzzy  1
Zip5        5     No     X
Last Name   5     No     X
First Name  5     No     X
Company     10    No     X

Add a database which contains the following fields:

NAME     Contains full names (“Mr. John Smith”).
COMPANY  Contains company names (“Melissa Data”).
ADD1     Contains first (primary) address line (“22382 Avenida Empresa”).
ADD2     Contains second (secondary) address line (“Suite 34”).
CSZ      Contains City/State/Zip (“Rancho Santa Margarita, CA 92688”).

An application must create a link between a database’s fields (Name, Company, Add1, Add2 and CSZ) and the matchcode components (Zip5, Last Name, First Name, Company).

With the example above, it may appear that the application will have to contain extensive splitting routines. This is not the case. All that is necessary is to tell MatchUp what type of data is in a specific field and the format of that data.

In the example above, an application would use the following matchcode mapping:

Matchcode Component  Database Field  Matchcode Mapping
Zip5                 CSZ             CityStZip
Last Name            NAME            FullName
First Name           NAME            FullName
Company              COMPANY         Company

This mapping tells MatchUp that the 5-digit ZIP Code is in a field named “CSZ”, which is described as containing city, state, and ZIP Code information. The Last Name is in a field named “NAME”, which is described as a full name field (a full name in the sequence prefix, first name, middle name, last name, suffix).
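
To give a feel for the kind of splitting that the mapping lets an application delegate, here is a minimal sketch of pulling a 5-digit ZIP Code out of a city/state/ZIP string. The function `extractZip5` is hypothetical and is not part of MatchUp Object. MatchUp's own parsing is not described here and is likely far more thorough.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of MatchUp Object.
// Looks at the last whitespace-delimited token of a "City, ST ZIP" string
// and returns its first five characters if they are all digits.
// Returns an empty string otherwise.
std::string extractZip5(const std::string& csz) {
    // Ignore trailing spaces and tabs.
    std::size_t end = csz.find_last_not_of(" \t");
    if (end == std::string::npos) return "";

    // Find where the last token starts.
    std::size_t start = csz.find_last_of(" \t", end);
    start = (start == std::string::npos) ? 0 : start + 1;

    const std::string token = csz.substr(start, end - start + 1);
    if (token.size() < 5) return "";
    for (std::size_t i = 0; i < 5; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(token[i]))) return "";
    }
    return token.substr(0, 5);
}
```

With the mapping in place, an application passes the whole CSZ field and does not need a routine like this of its own.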

Matchcode Mapping Rules

Matchcode mappings follow five rules:

  1. For every Matchcode Component, the application must specify a mapping. The only exception is described in rule 2.
  2. The actual address component names (such as Street Number, Street Pre-Directional, Street Name, Street Suffix, Street Post-Directional, PO Box, Street Secondary, and the Global Components) are not listed for mapping purposes. Instead, the names Address Line 1 through Address Line 8 are used. The example below uses four address components in the matchcode (Street #, Street Name, Street Secondary, PO Box) but maps only two address lines.
  3. If a matchcode uses any address components, Address Lines 1-8 will be listed after all other components regardless of where the address component appears in the matchcode. In the following example, the address components are listed before company in the matchcode, but Address Lines 1-8 are listed at the end (after company).
  4. If a matchcode uses address components, at least one of Address Lines 1-8 must be mapped, but not necessarily all of them. If a database has only one address field, an application only needs to map Address Line 1 to that field. All other components must be mapped.
  5. Address Lines should be mapped from the top down (Address Line 1, then 2 through 8).
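
Rules 4 and 5 can be checked before any mapping is configured. The sketch below assumes a hypothetical representation in which an application keeps one database field name per address line and uses an empty string for an unmapped line. It also reads "from the top down" as "no gaps", which is an interpretation rather than something the rules state outright.

```cpp
#include <array>
#include <string>

// Hypothetical representation, not a MatchUp Object type:
// entry i holds the database field mapped to Address Line i+1,
// or an empty string when that line is left unmapped.
using AddressLineMap = std::array<std::string, 8>;

// Returns true when at least one address line is mapped (rule 4)
// and the mapped lines form an unbroken run starting at
// Address Line 1 (rule 5, read as "no gaps").
bool addressLinesValid(const AddressLineMap& lines) {
    if (lines[0].empty()) return false;  // Address Line 1 must come first
    bool seenGap = false;
    for (const std::string& field : lines) {
        if (field.empty()) {
            seenGap = true;
        } else if (seenGap) {
            return false;  // a mapped line follows an unmapped one
        }
    }
    return true;
}
```

For the example database, mapping ADD1 and ADD2 to the first two lines passes this check, while mapping only Address Line 2 would not.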

Enhancing the matchcode in the previous example:

Component Size Fuzzy 1 2
Zip5 5 No X X
Last Name 5 No X X
First Name 5 No X X
Street # 5 No X
Street Name 5 No X
Street Secondary 12 No X
PO Box 10 No X
Company 10 No X X

Again, MatchUp doesn’t use the individual address components for mapping. They are replaced with Address Line 1, Address Line 2, and Address Line 3. So, the application would use the following matchcode mapping:

Matchcode Component  Database Field  Matchcode Mapping
Zip5                 CSZ             CityStZip
Last Name            NAME            FullName
First Name           NAME            FullName
Company              COMPANY         Company
Address Line 1       ADD1            Address
Address Line 2       ADD2            Address
Address Line 3       (none)

Note on Rule #1

If a database does not contain a field for the information a matchcode component calls for, such as the Company field in the above example, then that matchcode should not be used to dedupe that database.

Use a different matchcode or modify an existing matchcode, as outlined later in this chapter.

However, if a matchcode calls for last name, for example, and the database only has full name, then simply map the full name field to the last name and MatchUp Object will handle parsing the field.

Matchcode Mapping Using the API

All three of the MatchUp Object deduping interfaces (Incremental, Read/Write and Hybrid) have an AddMapping function. This is used to create mappings for the current instance of whatever deduper an application is using. For the last example above, call the function in the following way:

mu->ClearMappings();
mu->AddMapping(mu->CityStZip);
mu->AddMapping(mu->FullName);
mu->AddMapping(mu->FullName);
mu->AddMapping(mu->Company);
mu->AddMapping(mu->Address);
mu->AddMapping(mu->Address);

The value being passed to the function is an enumerated value of the type MatchcodeMapping.

Note that this code does not tell MatchUp Object anything about the database containing the data to be deduped. The application handles the data access separately and then passes the necessary fields to the deduper using the AddField function.
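
The separation between mapping and data access can be sketched as follows. `MockDeduper` is a hypothetical stand-in that only records calls. It is not one of the MatchUp Object deduper classes, and its `AddField` signature is an assumption. The sketch also assumes that field values are passed with one `AddField` call per mapping, in the same order the mappings were added.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a MatchUp Object deduper. It only records calls.
// The enumerator names are the ones used in the example above.
class MockDeduper {
public:
    enum MatchcodeMapping { CityStZip, FullName, Company, Address };

    void ClearMappings() { mappings_.clear(); }
    void AddMapping(MatchcodeMapping mapping) { mappings_.push_back(mapping); }
    void AddField(const std::string& value) { fields_.push_back(value); }  // assumed signature

    std::size_t MappingCount() const { return mappings_.size(); }
    std::size_t FieldCount() const { return fields_.size(); }

private:
    std::vector<MatchcodeMapping> mappings_;
    std::vector<std::string> fields_;
};

// A record as the application might read it from its own data source.
struct Record {
    std::string name, company, add1, add2, csz;
};

// Configure the mappings once. Nothing here refers to the database.
void configureMappings(MockDeduper* mu) {
    mu->ClearMappings();
    mu->AddMapping(mu->CityStZip);  // Zip5
    mu->AddMapping(mu->FullName);   // Last Name
    mu->AddMapping(mu->FullName);   // First Name
    mu->AddMapping(mu->Company);    // Company
    mu->AddMapping(mu->Address);    // Address Line 1
    mu->AddMapping(mu->Address);    // Address Line 2
}

// Pass one record's values, one per mapping, in mapping order (assumed).
void addRecord(MockDeduper* mu, const Record& r) {
    mu->AddField(r.csz);
    mu->AddField(r.name);
    mu->AddField(r.name);
    mu->AddField(r.company);
    mu->AddField(r.add1);
    mu->AddField(r.add2);
}
```

The NAME value is passed twice because both Last Name and First Name are mapped to the same full name field.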

Changing Mappings

It is possible to change mappings in the middle of a session if, for example, an application has to handle two databases with different data structures. Continuing with the example from above, assume that the second database has the following structure:

Matchcode Component  Database Field  Matchcode Mapping
Zip5                 CSZ             CityStZip
Last Name            LAST            LastName
First Name           FIRST           FirstName
Company              COMPANY         Company
Address Line 1       ADD1            Address
Address Line 2       ADD2            Address
Address Line 3       (none)


To use this mapping, the application would first have to call the ClearMappings function to remove the existing mappings and then call the AddMapping function again to configure the new mapping.

mu->ClearMappings();
mu->AddMapping(mu->CityStZip);
mu->AddMapping(mu->LastName);
mu->AddMapping(mu->FirstName);
mu->AddMapping(mu->Company);
mu->AddMapping(mu->Address);
mu->AddMapping(mu->Address);
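
When an application switches between databases often, one way to avoid leaving stale mappings behind is to keep each mapping set as data and always clear before applying it. The sketch below is a hypothetical pattern, not part of MatchUp Object. The enumeration, `RecordingDeduper`, and `applyMappings` are stand-ins defined only for this illustration.

```cpp
#include <vector>

// Hypothetical stand-ins. The real enumeration and deduper classes
// belong to MatchUp Object and are not reproduced here.
enum MatchcodeMapping { CityStZip, FullName, LastName, FirstName, Company, Address };

struct RecordingDeduper {
    std::vector<MatchcodeMapping> active;
    void ClearMappings() { active.clear(); }
    void AddMapping(MatchcodeMapping mapping) { active.push_back(mapping); }
};

// Mapping sets for the two databases in the examples above.
const std::vector<MatchcodeMapping> kFirstDatabase = {
    CityStZip, FullName, FullName, Company, Address, Address};
const std::vector<MatchcodeMapping> kSecondDatabase = {
    CityStZip, LastName, FirstName, Company, Address, Address};

// Clears whatever is configured, then adds the new mappings in order.
// Works with any type that offers ClearMappings() and AddMapping().
template <typename Deduper>
void applyMappings(Deduper& mu, const std::vector<MatchcodeMapping>& mappings) {
    mu.ClearMappings();
    for (MatchcodeMapping mapping : mappings) {
        mu.AddMapping(mapping);
    }
}
```

Calling `applyMappings(mu, kFirstDatabase)` and later `applyMappings(mu, kSecondDatabase)` leaves only the six mappings of the second database active.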