Difference between revisions of "MatchUp Object:Matchcodes"

From Melissa Data Wiki
{{CustomTOC}}


==Overview==
Matchcodes are sets of rules that tell MatchUp Object which data fields to consider when determining whether two records are duplicates. MatchUp Object uses the matchcode to construct a “match key,” a simplified string of characters that represents just enough of the information in a record to determine whether that record is unique or a duplicate.
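
As a rough illustration of the idea only, the following Python sketch builds a simplified key from a hypothetical matchcode made of a 5-digit ZIP and a last name. The function name <code>build_match_key</code>, the field names, the fixed widths, and the normalization rules are all assumptions made for this example; the actual key format used by MatchUp Object is not described on this page.

```python
import re

def build_match_key(record, components):
    """Build a simplified key from the chosen fields (illustrative only)."""
    parts = []
    for field, width in components:
        # Keep letters and digits only, uppercase them,
        # then truncate or pad the value to a fixed width.
        value = re.sub(r"[^A-Za-z0-9]", "", record.get(field, "")).upper()
        parts.append(value[:width].ljust(width))
    return "".join(parts)

# Hypothetical matchcode: first 5 characters of ZIP, first 8 of last name.
MATCHCODE = [("zip", 5), ("last_name", 8)]

a = {"last_name": "O'Brien", "zip": "92688-2112"}
b = {"last_name": "OBRIEN ", "zip": "92688"}

key_a = build_match_key(a, MATCHCODE)
key_b = build_match_key(b, MATCHCODE)

# Records whose keys are equal are treated as duplicates in this sketch.
is_duplicate = key_a == key_b
```

Both sample records reduce to the same key even though their raw field values differ, which is the point of comparing keys rather than raw data.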
 
===Matchcode Components===
Each matchcode consists of one or more components. Each component is a specific data type, and you tell MatchUp Object which fields to use by programmatically mapping the fields in the actual database to these data types.
 
The matchcode component should match the data type that MatchUp Object needs to build the match key, not necessarily the format found in the database. In other words, if
the database contains full names (first and last) but only last names are needed for
deduping, the matchcode would use the last name component. At the programming stage,
where fields are mapped to specific components, the full name field would be mapped to
the last name component. MatchUp Object is smart enough to parse the name and use
only the information it needs.
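
The mapping described above can be pictured with a small Python sketch. It is not MatchUp Object's parser: the helper <code>last_name_from_full_name</code>, the field name <code>FullName</code>, and the suffix list are hypothetical, and a real name parser handles many more cases (compound surnames, reversed order, and so on).

```python
PREFIXES = {"MR", "MRS", "MS", "DR"}          # prefixes listed for the Prefix component
SUFFIXES = {"JR", "SR", "II", "III", "IV"}    # assumed examples, not an official list

def last_name_from_full_name(full_name):
    """Very naive last-name extraction from a full-name field."""
    tokens = [t.strip(".,").upper() for t in full_name.split()]
    tokens = [t for t in tokens if t]
    # Drop a leading prefix and a trailing suffix if present.
    if tokens and tokens[0] in PREFIXES:
        tokens = tokens[1:]
    if tokens and tokens[-1] in SUFFIXES:
        tokens = tokens[:-1]
    return tokens[-1] if tokens else ""

# The database holds a full name, but only the last name feeds the match key.
row = {"FullName": "Dr. John Q. Smith Jr."}
last_name = last_name_from_full_name(row["FullName"])
```

The database column stays a full name; only the piece that the last name component needs is extracted from it.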
 
The following table lists all of the available matchcode components (Data Types) in MatchUp Object.
 
{|class="alternate01" cellspacing="0"
!Component
!Description
|-
|Prefix
|Prefix of a personal name (Mr, Mrs, Ms, Dr).
|-
|First Name
|A first name.
|-
|Middle Name
|A middle name.
|-
|Last Name
|A last name.
|-
|Suffix
|A suffix from a personal name.
|-
|Gender
|Male/Female/Neutral.
|-
|First/Nickname
|A representative nickname for a first name.
|-
|Middle/Nickname
|A representative nickname for a middle name.
|-
|Department/Title
|A title and/or department name.
 
Frequently these components don't match exactly because of ‘noise words’ such as “the,” “and,” “agency,” and so on. MatchUp strips these words from these components.
|-
|Company
|A company name.
 
Frequently these components don't match exactly because of ‘noise words’ such as “the,” “and,” “agency,” and so on. MatchUp strips these words from these components.
|-
|Company Acronym
|A company's acronym.
 
MatchUp Object converts any multi-word company name into an acronym (for example, “International Business Machines” is squeezed into “IBM”). Single-word company names are left as they are. This conversion is done after noise words are removed.
|-
|Street Number
|The street number from an address line.
 
The seven street address components (Street Number, Street Pre-Directional, Street Name, Street Suffix, Street Post-Directional, PO Box, Street Secondary) are obtained by splitting up to three address lines. Note that PO Box and/or Street Secondary do not have to appear on their own line, or in a particular field. MatchUp's proprietary “street smart” splitter does all of the work.
|-
|Street Pre-Directional
|“South” in “3 South Main St”.
|-
|Street Name
|The street name from an address line.
|-
|Street Suffix
|An address suffix (St, Ave, Blvd).
|-
|Street Post-Directional
|“North” in “3 Main St North”.
|-
|PO Box
|PO Boxes also include Farm Routes, Rural Routes, etc.
|-
|Street Secondary
|Apartments, floors, rooms, etc.
|-
|Address
|A single unparsed address line.
 
When using the Full Address component, you are at the mercy of every little deviation in data entry. Because MatchUp Object’s street splitter is so powerful, it is preferable to use street address components instead of the Full Address in nearly all cases. The only exception may be when processing foreign addresses that don’t conform very well to US, Canadian or UK addressing formats.
|-
|City
|A city name. ZIP or Postal code is usually more accurate.
|-
|State/Province
|A state or province name.
|-
|Zip9
|A full ZIP + 4® code (9 digits).
 
MatchUp Object removes dashes and spaces from ZIP codes. When processing a mix of Canadian Postal Codes and US ZIP codes, use the Zip9 component.
|-
|Zip5
|The ZIP Code (5 digits).
|-
|Zip4
|The +4 extension of a ZIP + 4 code (4 digits).
|-
|Postal Code (Canada)
|A Canadian Postal Code.
|-
|City (UK)
|A city in the United Kingdom.
|-
|County (UK)
|A county in the United Kingdom.
|-
|Postcode (UK)
|A United Kingdom Postcode.
|-
|Country
|A country.
|-
|Phone/Fax
|A phone number. MatchUp Object removes non-numeric characters from phone numbers. A leading ‘1-’ and trailing extensions are stripped if present. Numbers lacking an area code are right-justified so that the local dialing code and number align with those of numbers that have area codes. If a data table often has missing or inaccurate area codes (for example, after a recent area code split), start at the 4th position of the phone number component.
 
Do not use the rightmost 7 positions instead, as badly formatted extensions can sometimes cause the phone number to be coded improperly.
|-
|E-Mail Address
|An e-mail address. MatchUp Object removes illegal characters from e-mail addresses. Incomplete, changed, and commonly misspelled domain names are corrected using the Email Address data table.
|-
|Credit Card Number
|A credit card number.
|-
|Date
|A date. You can specify a maximum number of days; two records can match only if their dates are no more than that many days apart.
|-
|Numeric
|A number. You can specify an integer tolerance; two records can match only if the difference between their values falls within that tolerance.
|-
|Proximity
|The distance between two records. You can specify a maximum distance in miles; two records can match only if they lie within that distance of each other.
 
The Proximity component requires you to map in latitude and longitude coordinates. MatchUp does not determine these coordinates; they can be obtained from a product such as GeoCoder or Contact Verify.
|-
|General
|Any general information, such as an ID, birthday, or SSN.
|}
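
The noise-word and acronym handling described for the Company and Company Acronym components can be sketched in Python as follows. The noise-word set contains only the three examples named in the table (the full list is not given here), and the function names, the uppercasing, and the treatment of punctuation are assumptions for illustration.

```python
# Only the examples from the table; the real noise-word list is not shown here.
NOISE_WORDS = {"THE", "AND", "AGENCY"}

def strip_noise_words(name):
    """Return the uppercased words of a name with noise words removed."""
    words = [w.strip(".,").upper() for w in name.split()]
    return [w for w in words if w and w not in NOISE_WORDS]

def company_acronym(name):
    """Acronym for multi-word names; single-word names are left as they are."""
    words = strip_noise_words(name)  # noise words are removed before the conversion
    if len(words) <= 1:
        return words[0] if words else ""
    return "".join(w[0] for w in words)

ibm = company_acronym("International Business Machines")
acme = company_acronym("Acme")
```

Because noise words are removed first, names such as “The Smith and Jones Agency” and “Smith Jones” reduce to the same words, and therefore to the same acronym, in this sketch.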

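The phone number handling described for the Phone/Fax component can be approximated in Python as shown below. This is a sketch under stated assumptions, not the product's implementation: the extension markers (<code>x</code>, <code>ext</code>), the 10-character width, and both function names are hypothetical.

```python
import re

# Assumed convention: an extension is introduced by "x" or "ext" at the end.
EXTENSION = r"\s*(?:ext\.?|x)\s*\d+\s*$"

def normalize_phone(raw):
    """Reduce a phone number to digits, right-justified in 10 positions."""
    main = re.split(EXTENSION, raw, flags=re.IGNORECASE)[0]  # strip trailing extension
    digits = re.sub(r"\D", "", main)                         # remove non-numeric characters
    if len(digits) == 11 and digits.startswith("1"):         # strip a leading 1
        digits = digits[1:]
    # Right-justify so 7-digit local numbers align with 10-digit numbers.
    return digits.rjust(10)

def local_part(phone_key):
    """Start at the 4th position, which skips the area code."""
    return phone_key[3:]

with_area = normalize_phone("1-949-555-1234 x12")
without_area = normalize_phone("555-1234")
same_local = local_part(with_area) == local_part(without_area)
```

Starting at the 4th position of the right-justified value compares the same seven digits whether or not the area code was present.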

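The Date, Numeric, and Proximity components all allow a match when two values fall within a set tolerance. The Python sketch below shows one way such checks could look. The function names are hypothetical, and the haversine great-circle formula is used here only as a common way to compute distance from latitude and longitude; this page does not say which distance calculation MatchUp Object uses.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def dates_within(d1, d2, max_days):
    """True if the two dates are at most max_days apart."""
    return abs((d1 - d2).days) <= max_days

def numbers_within(n1, n2, max_diff):
    """True if the two numbers differ by at most max_diff."""
    return abs(n1 - n2) <= max_diff

def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))

def within_proximity(p1, p2, max_miles):
    """True if two (latitude, longitude) pairs are at most max_miles apart."""
    return miles_between(*p1, *p2) <= max_miles

close_dates = dates_within(date(2015, 7, 22), date(2015, 7, 25), 3)
one_degree = miles_between(0.0, 0.0, 1.0, 0.0)
```

The coordinates passed to <code>within_proximity</code> stand in for the latitude and longitude values that must be mapped in from another product.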


[[Category:MatchUp Object]]

Revision as of 20:59, 22 July 2015
