API:FAQ:Capabilities:Name Standardization

From Melissa Data Wiki
Revision as of 21:40, 12 June 2014


Input Consolidation

Explain in detail how your tool consolidates several different input files with different file formats into single output file format.

The Melissa Data APIs are essentially blind to the data source: you control the file handling, which gives you the greatest flexibility to access and process data in any format or structure. With native plugins for ETL tools such as SSIS or Pentaho Data Integration, the Data Quality Components can cleanse data from most file formats and write it out in the format of your choice.
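
Because file handling stays on your side, consolidation usually amounts to mapping each source onto one common record layout before the records are cleansed. The following is a minimal sketch of that idea in plain Python. It does not call any Melissa Data API, and the field names, column maps, and sample data are all hypothetical.

```python
import csv
import io

# Hypothetical common layout that every source is mapped onto.
COMMON_FIELDS = ["full_name", "company"]


def read_delimited(text, delimiter, column_map):
    """Yield records in the common layout from delimited text.

    column_map maps each source column name to a common field name.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        record = dict.fromkeys(COMMON_FIELDS, "")
        for source_column, common_field in column_map.items():
            record[common_field] = (row.get(source_column) or "").strip()
        yield record


# Two made-up sources with different delimiters and column names.
csv_source = "Name,Firm\nMr John Smith,Acme Inc\n"
tab_source = "contact\torg\nSmith, John\tAcme Incorporated\n"

consolidated = list(
    read_delimited(csv_source, ",", {"Name": "full_name", "Firm": "company"})
)
consolidated += list(
    read_delimited(tab_source, "\t", {"contact": "full_name", "org": "company"})
)
```

After this step every record has the same keys, so a single cleansing routine can process all of them regardless of where they came from.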


Input Parsing

Explain in detail how your tool parses input names into pre name, first name, middle name, last name and post name and sends the data to individual fields.

When a full name or multiple names are passed into Name Object, it first determines the gender of the first names from a lookup table, then uses proprietary logic to parse the name into its individual components, each returned as a separate output property. Incoming data can arrive in varied formats and patterns (Mr John Smith; Smith, John; Mr J L Smith Sr; etc.), as well as dual names (Mr and Mrs John Smith). Name Object contains more than 190,000 first and last names from the US Census.
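
Name Object's parsing logic is proprietary, so the following is only a toy illustration of what splitting a single name into components involves. The prefix and suffix lists, the function name, and the output keys are hypothetical, and the sketch ignores the lookup tables and many patterns the real object handles (for example, a comma before a suffix).

```python
# Illustrative lists only; not the tables used by Name Object.
PREFIXES = {"mr", "mrs", "ms", "miss", "dr"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def parse_name(full_name):
    """Split one name into prefix, first, middle, last, and suffix."""
    # Reorder "Last, First Middle" into "First Middle Last".
    if "," in full_name:
        last_part, rest = full_name.split(",", 1)
        full_name = f"{rest.strip()} {last_part.strip()}"

    tokens = full_name.split()
    parsed = {"prefix": "", "first": "", "middle": "", "last": "", "suffix": ""}

    if tokens and tokens[0].rstrip(".").lower() in PREFIXES:
        parsed["prefix"] = tokens.pop(0)
    if tokens and tokens[-1].rstrip(".").lower() in SUFFIXES:
        parsed["suffix"] = tokens.pop()
    if tokens:
        parsed["last"] = tokens.pop()
    if tokens:
        parsed["first"] = tokens.pop(0)
    # Whatever remains between first and last is treated as the middle name.
    parsed["middle"] = " ".join(tokens)
    return parsed
```

Called with the sample formats above, `parse_name("Mr J L Smith Sr")` and `parse_name("Smith, John")` both return one dictionary with the five components filled in where present.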


Name Record Splitting

Explain in detail how your tool splits combined names from one record into two separate records.

Name Object has a built-in list of words that connect names (“and,” “or,” “&,” etc.). The presence of one of these connectors tells the object that a dual name may exist in the input full name string.
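
As a rough illustration of connector detection, the sketch below checks whether any whole token of the input is a connector. The connector set and the function name are hypothetical; the actual built-in list is not reproduced here.

```python
import re

# Illustrative connector list; the real built-in list may differ.
CONNECTORS = {"and", "or", "&"}


def has_connector(full_name):
    """Return True if any whole token in the name is a connector word."""
    # Tokenize into words and standalone ampersands so that a name such as
    # "Gordon" is not mistaken for containing the connector "or".
    tokens = re.findall(r"[A-Za-z']+|&", full_name.lower())
    return any(token in CONNECTORS for token in tokens)
```

Matching on whole tokens rather than substrings is the important design choice: it avoids false positives on names that merely contain the letters of a connector.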


Name Record Splitting and Parsing

If a file contains a name field with a “Mr. and Mrs. Tom Jones,” explain how your tool splits these into two separate records (e.g. One record for Mr. Tom Jones and a second record for Mrs. Tom Jones).

The respective parsed names would be split into these properties, one set per name.

Results for name ‘Mr. and Mrs. Tom Jones’:

Prefix: Mr.          Prefix2: Mrs.
First Name: Tom      First Name2: Tom
Middle Name:         Middle Name2:
Last Name: Jones     Last Name2: Jones
Suffix:              Suffix2:
Gender: M            Gender2: F
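
To turn the two property sets into two separate records, the calling code can emit one record per set. The sketch below reproduces the result above for the simple pattern "Prefix connector Prefix First [Middle] Last" only. It is not Name Object's logic: the record keys are hypothetical, and gender is derived from the prefix here purely as a simplifying assumption.

```python
# Illustrative tables only; gender-from-prefix is an assumption of this sketch.
PREFIX_GENDER = {"mr": "M", "mrs": "F", "ms": "F", "miss": "F"}
CONNECTORS = {"and", "or", "&"}


def split_dual_name(full_name):
    """Split 'Prefix1 <connector> Prefix2 First [Middle] Last' into two records."""
    tokens = full_name.replace(",", " ").split()
    if len(tokens) < 5 or tokens[1].lower() not in CONNECTORS:
        raise ValueError("not in the simple dual-name pattern handled here")

    first, last = tokens[3], tokens[-1]
    middle = " ".join(tokens[4:-1])

    records = []
    for prefix in (tokens[0], tokens[2]):
        key = prefix.rstrip(".").lower()
        records.append({
            "Prefix": prefix,
            "FirstName": first,
            "MiddleName": middle,
            "LastName": last,
            "Suffix": "",
            "Gender": PREFIX_GENDER.get(key, ""),
        })
    return records


records = split_dual_name("Mr. and Mrs. Tom Jones")
```

Each element of `records` can then be written out as its own row, giving one record for Mr. Tom Jones and one for Mrs. Tom Jones.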


Search and Replace Functionality

Explain in detail how your tool removes extraneous information from the input field data using search and replace functionality.

Name Object flags vulgarities detected in the name, as well as names that contain words found on its list of nuisance names (such as “Mickey Mouse”). Name Object has no search-and-replace functionality.
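
The distinction between flagging and replacing can be sketched as follows. The lists are placeholders, and the function name and result keys are hypothetical; the point is only that the input is reported on, never modified.

```python
# Placeholder lists; the real vulgarity and nuisance lists are not shown here.
NUISANCE_NAMES = {"mickey mouse"}
VULGAR_WORDS = {"badword"}


def flag_name(full_name):
    """Return flags for a name without altering the name itself."""
    normalized = " ".join(full_name.lower().split())
    words = set(normalized.split())
    return {
        "nuisance": normalized in NUISANCE_NAMES,
        "vulgar": bool(words & VULGAR_WORDS),
    }
```

What to do with a flagged record (reject it, route it for review, or clean it with a separate tool) is left to the calling application.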


Inconsistent Name Standardization

Explain in detail how your tool standardizes inconsistent firm names in the input record.

The MatchUp Object can compare company acronyms with full company names and strip out inconsistencies in the firm names before comparisons are done. Name Object will recognize company names inside a name field and standardize them.


Free Formatted Data Fields

If a file contains a name and address in a “free formatted” data field, explain how your solution is able to identify and parse the name and address into separate data fields (e.g. Name and Address).

The RightFielder Object identifies and parses input text into the main components of a contact record: address, name, phone, and email. You can use RightFielder Object to parse customer contact data entered as a block of text in any order and output it to specified properties. This allows extraction of distinct details from user input, such as Name, Address, City, State, Zip, Phone, Email, Department, etc. RightFielder Object can also be used to intelligently identify misfielded data.
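
To give a feel for order-independent fielding, the sketch below classifies each line of a free-form block by pattern rather than by position. It is a deliberately small stand-in, not RightFielder Object's method: the regular expressions, the function name, and the output keys are all hypothetical, and it recognizes only email addresses and US-style phone numbers, leaving everything else unclassified.

```python
import re

# Simplified patterns for illustration; real-world data needs far more cases.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


def field_block(text):
    """Assign each non-empty line of a text block to a field by pattern."""
    fields = {"email": "", "phone": "", "other": []}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        email = EMAIL_RE.search(line)
        phone = PHONE_RE.search(line)
        if email:
            fields["email"] = email.group()
        elif phone:
            fields["phone"] = phone.group()
        else:
            # Name and address lines would need dedicated logic.
            fields["other"].append(line)
    return fields


block = "john.smith@example.com\nJohn Smith\n(555) 010-0199\n123 Main St"
fielded = field_block(block)
```

Because each line is classified by its content, the same result is produced whichever order the lines arrive in.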


Matching Standards and Alternative Spellings/Nicknames

Explain how your tool is able to provide match standards for the first and middle names, allowing us to overcome two types of matching problems such as alternative spellings and nicknames.

The MatchUp Object uses fuzzy matching techniques and a nickname table to recognize phonemes such as “ph” and “sh,” nicknames (Liz, Beth, Betty, Elizabeth), and alternate spellings of names (Gene, Jean, Jeanne). MatchUp Object can also handle nearly exact strings of characters, such as “Lewis” vs. “Ewis” and “Palacino” vs. “Al Pacino,” as well as initials, such as “John Smith” vs. “J Smith.” MatchUp Object allows you to use nickname matching (Charles = Chuck) and ‘nearness’ algorithms (John = Jonh) when matching records. You can also choose whether to disregard middle names, so that ‘John L Smith = John Smith,’ or to include them, so that you can catch ‘J. Lee Smith = John L Smith.’
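
The combination of a nickname table, initial matching, and a nearness measure can be sketched as below. This is not MatchUp Object's algorithm: the nickname table is a tiny made-up sample, the similarity measure is Python's standard `difflib.SequenceMatcher` swapped in for illustration, the 0.7 threshold is arbitrary, and phonetic matching is not attempted.

```python
from difflib import SequenceMatcher

# Tiny illustrative nickname table mapping nicknames to a canonical form.
NICKNAMES = {
    "liz": "elizabeth",
    "beth": "elizabeth",
    "betty": "elizabeth",
    "chuck": "charles",
}


def canonical(first_name):
    """Lowercase the name, drop a trailing period, and resolve nicknames."""
    name = first_name.strip().strip(".").lower()
    return NICKNAMES.get(name, name)


def first_names_match(a, b, threshold=0.7):
    """Compare two first names by nickname, initial, then string similarity."""
    a, b = canonical(a), canonical(b)
    if not a or not b:
        return False
    if a == b:
        return True
    # Treat a single letter as an initial that matches the same first letter.
    if len(a) == 1 or len(b) == 1:
        return a[0] == b[0]
    # Nearness: tolerate small typos such as transposed letters.
    return SequenceMatcher(None, a, b).ratio() >= threshold
```

With these assumptions, `first_names_match("Chuck", "Charles")` succeeds through the nickname table, `first_names_match("J.", "John")` through the initial rule, and `first_names_match("John", "Jonh")` through the similarity score.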