Pentaho:Contact Verify:Phone/Email

From Melissa Data Wiki

Input Phone

This is where you map the input columns containing the original phone numbers.

Phone Number
The Phone input column requires a 10-digit phone number in a standard format.

Output Phone Components

Use these fields to map the output columns for the geographical and parsed phone number data. Because of number portability, the geographic information may not reflect the actual location of the phone number’s owner for wireless or VoIP numbers.

Phone Number
The name of the output phone column.
Format
Select the format to be used for phone numbers in your data.
Area Code
This column returns the Area Code portion of the parsed phone number.
Prefix
This column returns the three-digit prefix portion of the parsed phone number.
Suffix
This column returns the four-digit suffix portion of the parsed phone number.
Extension
If the input phone number contains an extension, this column returns it.
Additional Output Columns
Click the Additional Output Columns... button to map columns for information beyond basic phone number parsing.
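
To make the parsed components above concrete, the following is a minimal Python sketch of how a 10-digit number splits into area code, prefix, suffix, and extension. It is only an illustration: `parse_phone` and its regular expression are hypothetical and are not part of the Component, whose actual parsing rules are not described here.

```python
import re

# Hypothetical pattern: three groups of digits (3-3-4) with any punctuation
# between them, optionally followed by non-digits and an extension.
PHONE_RE = re.compile(r"^\D*(\d{3})\D*(\d{3})\D*(\d{4})(?:\D+(\d+))?\D*$")

def parse_phone(raw):
    """Return the parsed parts of a 10-digit phone number, or None."""
    match = PHONE_RE.match(raw)
    if match is None:
        return None
    area_code, prefix, suffix, extension = match.groups()
    return {
        "area_code": area_code,
        "prefix": prefix,
        "suffix": suffix,
        "extension": extension or "",  # empty when no extension was given
    }

parts = parse_phone("(800) 555-0199 ext. 42")
```

Here `parts["area_code"]` corresponds to the Area Code column, `parts["prefix"]` to Prefix, and so on. A value with fewer than ten digits makes the sketch return `None`.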

Input Email

Email Address
This string value must, at the minimum, contain the basic components of an email address: two strings of text separated by a “@” character.
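
As a rough illustration of that minimum requirement, the sketch below checks only that a value has non-empty text on both sides of an “@” character. The function name `has_basic_email_shape` is hypothetical, and the check is far looser than the verification the Component performs.

```python
def has_basic_email_shape(value):
    """True if the value is two non-empty strings separated by '@'."""
    local, separator, domain = value.strip().partition("@")
    # partition() returns an empty separator when no "@" is present.
    return separator == "@" and bool(local) and bool(domain)

ok = has_basic_email_shape("ray@melissadata.com")
missing_at = has_basic_email_shape("ray.melissadata.com")
```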

Output Email Components

Email Address
This column returns the complete email address, standardized and corrected according to the options selected in the Email Standardize Options.
Standardization Options & Additional Output Columns
Click the Standardization Options & Additional Output Columns... button to control how the Contact Verify Component (CVC) corrects and standardizes the email address, and to map the parsing and information columns.

Output Phone Columns

City
This column returns the city associated with the phone number's area code and prefix.
State/Province
This column returns the two-character state abbreviation associated with the phone number's area code and prefix.
County Name
This column returns the county name for the location associated with the phone number's area code and prefix.
County FIPS
This column returns the five-digit county FIPS code associated with the phone number's area code and prefix.
Country Code
This column returns the country code associated with the input phone number. This is the two-character abbreviation for the United States or Canada and not the numeric international dialing code.
Time Zone
This column returns the name of the time zone where the input area code and prefix are located.
Time Zone Code
This column returns a one- or two-digit numeric code for the time zone where the area code and prefix are located. The number also indicates how many hours the time zone is behind UTC/GMT. For example, Eastern Standard Time has a time zone code of 5, indicating that the Eastern time zone is five hours behind UTC/GMT.
This number does not reflect differences due to daylight saving time.
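
Because the Time Zone Code is the number of hours behind UTC/GMT, it can be turned into a fixed standard-time offset. The following Python sketch shows one way to do that downstream of the Component; `standard_utc_offset` is a hypothetical helper, and like the code itself it ignores daylight saving time.

```python
from datetime import datetime, timedelta, timezone

def standard_utc_offset(time_zone_code):
    """Convert a Time Zone Code (hours behind UTC) to a fixed tzinfo."""
    # A code of 5 means five hours behind UTC, so the offset is negative.
    return timezone(timedelta(hours=-int(time_zone_code)))

eastern_standard = standard_utc_offset(5)
noon_utc = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
local_time = noon_utc.astimezone(eastern_standard)
```

With a code of 5, noon UTC maps to 07:00 local standard time.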

Email Standardize Options

Standardization Options

Standardize Casing
If this box is checked, the Component converts the input email address to all lowercase letters. For example, “JSmith@MelissaData.com” would become “jsmith@melissadata.com”.
Correct Email Syntax
If this box is checked, the Component will do the following:
  • Remove any illegal characters from the address. This would include excess “@” characters.
  • Correct misspelled domain names. For example, “yaho.com” would be replaced by “yahoo.com”.
  • Correct misspelled top-level domain names. For example, “.con” would be replaced with “.com”.
Perform Web Service Lookup
If this box is checked, the Component will attempt to validate the input email address by looking up the domain in a compiled and continuously updated list of valid domains. This is slower than a database lookup but potentially more accurate if the domain name is obscure, new, or no longer valid.
Perform Database Lookup
If this box is checked, the domain name is checked against the Email Object’s local database of known valid and invalid domain names. This is faster but may not include recently registered domains.
Perform Fuzzy Lookup
If this box is checked, the Component will attempt to validate the input email address by applying fuzzy matching algorithms to the input domain. This is slower than a database lookup but potentially more accurate if the domain name contains a common typo or transposed characters.
Update Domains
If this box is checked, the Component will attempt to update the domain name of the email address. One domain name can replace another in cases such as a change in corporate ownership. For example, the domain of subscribers to the @Home cable Internet service was switched from “home.com” to “cox.net”.
Perform DNS Lookup
If this box is checked, the Component will attempt to validate the input email address by locating an MX (Mail Exchange) record or an A (Address) record for the domain on a DNS server. This is slower than a database lookup but potentially more accurate if the domain name is obscure or new.
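
The Standardize Casing and Correct Email Syntax options above can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the function `standardize_email`, its parameters, and the two tiny correction tables are hypothetical, and the Component's real correction data and rules are not documented here.

```python
# Illustrative correction tables only; they reuse the examples given above.
DOMAIN_FIXES = {"yaho.com": "yahoo.com"}
TLD_FIXES = {"con": "com"}

def standardize_email(address, lowercase=True, correct_syntax=True):
    """Apply casing and simple syntax corrections to an email address."""
    address = address.strip()
    if lowercase:
        address = address.lower()
    if correct_syntax and "@" in address:
        # Treat the last "@" as the separator and drop any excess ones.
        mailbox, _, domain = address.rpartition("@")
        mailbox = mailbox.replace("@", "")
        # Fix a misspelled top-level domain first, then the whole domain.
        name, dot, tld = domain.rpartition(".")
        if dot and tld.lower() in TLD_FIXES:
            domain = name + dot + TLD_FIXES[tld.lower()]
        domain = DOMAIN_FIXES.get(domain.lower(), domain)
        address = mailbox + "@" + domain
    return address

cleaned = standardize_email("JSmith@@Yaho.con")
```

The lookup options (web service, database, fuzzy, DNS) are not modeled here, since they depend on data sources and network services outside the scope of a short example.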

Output Columns

Mailbox Name
This column returns the portion of the email address that precedes the “@” character. For “ray@melissadata.com”, this column would return “ray”.
Domain Name
This column returns the domain name from the parsed email address, minus the top level domain. For “ray@melissadata.com,” this column would return “melissadata” (without the “.com”).
Top Level Domain
This column returns the top level domain (TLD) indicator from the input email address. For “ray@melissadata.com,” this column would return the “dot com” portion.
Top Level Domain Description
This column returns the official text description associated with the top level domain. Not all TLDs have a description.
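
The split into Mailbox Name, Domain Name, and Top Level Domain can be sketched in Python as follows. `parse_email` is a hypothetical helper, not the Component's implementation. It treats everything after the last dot as the TLD and returns it without the leading dot, which is an assumption about formatting rather than documented behavior.

```python
def parse_email(address):
    """Split an email address into mailbox, domain, and top-level domain."""
    mailbox, _, host = address.rpartition("@")
    if "." in host:
        # Everything after the last dot is treated as the TLD.
        domain, _, tld = host.rpartition(".")
    else:
        domain, tld = host, ""
    return {"mailbox": mailbox, "domain": domain, "tld": tld}

parts = parse_email("ray@melissadata.com")
```

For “ray@melissadata.com”, the sketch yields the mailbox “ray”, the domain “melissadata”, and the TLD “com”.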