Pentaho:Contact Verify:Name
The Name tab configures the fields that will be used for the name parsing functionality of the Contact Verification Component. Existing field names can be selected using the drop-down boxes. New field names can be created by typing the name into the box.
Input Name
The Name tab requires data from a single field, Full Name, in order to populate its name output fields.
- Full Name
- Select or enter the field name that will contain the name information to be parsed. This field can contain one or two full names, such as “Mr. John Q. Smith, Jr.” or “John Q. and Mary S. Smith.” If you do not set a field for Full Name, the Component will not parse name information.
- Company Name
- Select the field name that contains the company name.
Output Components
The output section consists of two sets of return fields. The first name detected will be returned by the first set of fields. If a second name is detected, it will be returned by the second set of fields.
- Prefix
- This field returns any part of the name that precedes the given name, such as “Mr.,” “Ms.,” or “Dr.,” for each full name detected.
- First Name
- This field returns the given name for each full name detected. Contact Verify can attempt to correct misspelled first names. To enable this feature, see the Name Parse Options screen.
- Middle Name
- This field returns the middle names or initials for each full name detected.
- Last Name
- This field returns the family name for each full name detected.
- Suffix
- This field returns any part of the name that follows the family name, such as degrees (“MD” or “PhD”) and generational indicators (“IV” or “Jr.”), for each full name detected.
- Gender
- This field returns a gender indicator for each full name detected. Gender is based on the first name. To adjust how the Component assigns gender to a name, see the Name Parse Options screen.
- Salutation
- This field returns a salutation constructed from the first full name detected. To control how this salutation is formatted, see the Name Parse Options screen.
- Standardized Company Name
- This field returns a standardized company name.
Name Parse Options
To access Name Parse Options, click the Name Parse Options button on the Name tab.
- Correct Misspellings in First Name
- The Component uses a database of common given names to correct obvious misspellings. To enable this feature, check the box.
- Name Order Hint
- The Name Order Hint tells the Component in what order the name components will be found in the input full name: normal name order, last name first, or a mixture. The default is “Varying.” The options are:
Option | Description |
---|---|
DefinitelyFull | Name will always be treated as normal name order, regardless of formatting or punctuation. |
VeryLikelyFull | Name will be treated as normal name order unless inverse order is clearly indicated by formatting or punctuation. |
ProbablyFull | If necessary, statistical logic will be employed to determine name order, with a bias toward normal name order. |
Varying | If necessary, statistical logic will be employed to determine name order, with no bias toward either name order. |
ProbablyInverse | If necessary, statistical logic will be employed to determine name order, with a bias toward inverse name order. |
VeryLikelyInverse | Name will be treated as inverse name order unless normal order is clearly indicated by formatting or punctuation. |
DefinitelyInverse | Name will always be treated as inverse name order, regardless of formatting or punctuation. |
MixedFirstName | Name field is expected to only contain prefixes, first, and middle names. |
MixedLastName | Name field is expected to only contain last names and suffixes. |
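As a rough illustration of how the hints in this table differ, the following Python sketch lists them as an enumeration and resolves name order for the cases the table fully specifies. It is not the Component's implementation. The names `NameOrderHint`, `looks_inverse`, and `decide_order` are hypothetical, and treating a comma after a single leading word (as in “Smith, John”) as the punctuation cue for inverse order is an assumption. Hints that rely on statistical logic, or on cues this page does not describe, return `None`.

```python
from enum import Enum


class NameOrderHint(Enum):
    """The nine hint values listed in the table above."""
    DEFINITELY_FULL = "DefinitelyFull"
    VERY_LIKELY_FULL = "VeryLikelyFull"
    PROBABLY_FULL = "ProbablyFull"
    VARYING = "Varying"
    PROBABLY_INVERSE = "ProbablyInverse"
    VERY_LIKELY_INVERSE = "VeryLikelyInverse"
    DEFINITELY_INVERSE = "DefinitelyInverse"
    MIXED_FIRST_NAME = "MixedFirstName"
    MIXED_LAST_NAME = "MixedLastName"


def looks_inverse(full_name):
    """Assumed cue: exactly one word before the first comma, e.g. 'Smith, John'."""
    head, comma, tail = full_name.partition(",")
    return bool(comma) and len(head.split()) == 1 and bool(tail.strip())


def decide_order(full_name, hint):
    """Return 'full', 'inverse', or None when this sketch cannot decide."""
    if hint is NameOrderHint.DEFINITELY_FULL:
        return "full"      # formatting and punctuation are ignored
    if hint is NameOrderHint.DEFINITELY_INVERSE:
        return "inverse"   # formatting and punctuation are ignored
    if hint is NameOrderHint.VERY_LIKELY_FULL:
        return "inverse" if looks_inverse(full_name) else "full"
    # Remaining hints need statistical logic or cues not described here.
    return None
```

For example, `decide_order("Smith, John", NameOrderHint.VERY_LIKELY_FULL)` yields `"inverse"` under the comma assumption, while the same input with `DEFINITELY_FULL` stays `"full"`.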
- Gender Aggression
- This option controls how the Component assigns gender to a name, based on the first name. First names are rated on a 1 to 7 scale on the likelihood that they are a male or female name, with 7 being “always male” and 1 being “always female.” The Gender Aggression setting (Conservative, Neutral, or Aggressive) controls how the Component treats names that fall between those two extremes.
- Gender Population
- This option controls the gender assumed for the input data: predominantly female, predominantly male, or an even split. The effect of the Gender Aggression and Gender Population settings is shown on this chart.
Aggression | Population | Always male [7] | Often male [6] | Normally male [5] | Neutral [4] | Normally female [3] | Often female [2] | Always female [1] |
---|---|---|---|---|---|---|---|---|
Conservative | Even | M | N | N | N | N | N | F |
Conservative | Male | M | M | N | N | N | N | F |
Conservative | Female | M | N | N | N | N | F | F |
Neutral | Even | M | M | N | N | N | F | F |
Neutral | Male | M | M | M | N | N | F | F |
Neutral | Female | M | M | N | N | F | F | F |
Aggressive | Even | M | M | M | N | F | F | F |
Aggressive | Male | M | M | M | M | N | F | F |
Aggressive | Female | M | M | N | F | F | F | F |
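The chart can be read as a simple lookup. The Python sketch below transcribes it as data to make the interaction between the two settings easier to test; it is an illustration, not the Component's code. The names `GENDER_CHART` and `assign_gender` are hypothetical, and the sketch labels the three population rows "Even", "Male", and "Female".

```python
# Each string holds one chart row, read left to right from
# rating 7 ("always male") down to rating 1 ("always female").
GENDER_CHART = {
    ("Conservative", "Even"):   "MNNNNNF",
    ("Conservative", "Male"):   "MMNNNNF",
    ("Conservative", "Female"): "MNNNNFF",
    ("Neutral", "Even"):        "MMNNNFF",
    ("Neutral", "Male"):        "MMMNNFF",
    ("Neutral", "Female"):      "MMNNFFF",
    ("Aggressive", "Even"):     "MMMNFFF",
    ("Aggressive", "Male"):     "MMMMNFF",
    ("Aggressive", "Female"):   "MMNFFFF",
}


def assign_gender(rating, aggression, population):
    """Look up the chart cell for a first-name rating between 1 and 7."""
    if not 1 <= rating <= 7:
        raise ValueError("rating must be between 1 and 7")
    row = GENDER_CHART[(aggression, population)]
    return row[7 - rating]  # rating 7 is the leftmost column
```

With a name rated 6 (“often male”), `assign_gender(6, "Conservative", "Even")` gives `"N"`, while `assign_gender(6, "Neutral", "Even")` gives `"M"`. The population setting shifts the cut-off the same way: a rating of 4 is `"M"` under Aggressive/Male and `"F"` under Aggressive/Female.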
- Salutation Prefix
- The Component will begin every salutation with the text entered in this box. The default setting is “Dear.”
- Salutation Suffix
- The Component will end every salutation with the text entered in this box. The default setting is a semicolon.
- Salutation Slug
- The Component will use this text for a salutation if the input data did not contain enough information to construct a salutation from the parsed data. The default setting is “Dear Customer;.”
- Middle Name Logic
- The Component will parse the Middle Name depending on the method selected. The default setting is Parse Logic.
Option | Description |
---|---|
Parse Logic | In the absence of a hyphen, recognizable last names in the middle of a full name are treated as part of a hyphenated last name. |
Hyphenated Last | The middle word is assumed to be part of the last name. |
Middle Name | The middle word is assumed to be a middle name. |
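The three Middle Name Logic settings can be compared with a small Python sketch. It is a simplified model under stated assumptions, not the Component's parser: `KNOWN_LAST_NAMES` is a hypothetical stand-in for whatever list of recognizable last names the Component uses, `classify_middle_word` is a hypothetical helper, and the sketch handles only a single middle word.

```python
# Hypothetical stand-in for the Component's list of recognizable last names.
KNOWN_LAST_NAMES = {"garcia", "smith", "jones"}


def classify_middle_word(middle_word, last_name, logic="Parse Logic"):
    """Return (middle_name, last_name_parts) for one middle word.

    last_name_parts is a list so that the sketch makes no claim about
    how the Component joins a multi-part last name in its output.
    """
    if logic == "Hyphenated Last":
        # The middle word is always treated as part of the last name.
        return "", [middle_word, last_name]
    if logic == "Middle Name":
        # The middle word is always treated as a middle name.
        return middle_word, [last_name]
    if logic == "Parse Logic":
        # Only a recognizable last name is pulled into the last name.
        if middle_word.lower() in KNOWN_LAST_NAMES:
            return "", [middle_word, last_name]
        return middle_word, [last_name]
    raise ValueError(f"unknown Middle Name Logic setting: {logic}")
```

Under these assumptions, “Maria Garcia Lopez” keeps “Garcia” in the last name with Parse Logic, while “Maria Ann Lopez” keeps “Ann” as a middle name. The other two settings apply their rule regardless of the word.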
- Salutation Preference
- Use this box to change the order of preference for salutation formats. The format highest on the list will be used if possible, followed by the second, and so on until all possibilities are exhausted. If you do not wish a format to be used, place it below the selection for “Blank.” To change the order of preference, select the items on the list and click the arrow buttons to move the selection up and down the list.
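The salutation options can be pictured together with a short Python sketch. It is only an approximation of the behavior described above. The format names "Formal" and "Informal", the helper names, and the dictionary keys are hypothetical, since this page does not list the available formats. The sketch also assumes that “Blank” acts as a cut-off in the preference list and that the slug is returned whenever no format above it can be built.

```python
def _formal(parsed):
    """Hypothetical format, e.g. 'Mr. Smith'; needs a prefix and a last name."""
    if parsed.get("prefix") and parsed.get("last"):
        return f"{parsed['prefix']} {parsed['last']}"
    return None


def _informal(parsed):
    """Hypothetical format, e.g. 'John'; needs a first name."""
    return parsed.get("first") or None


FORMATS = {"Formal": _formal, "Informal": _informal}


def build_salutation(parsed, preference,
                     prefix="Dear", suffix=";", slug="Dear Customer;"):
    """Try each preferred format in order; fall back to the slug."""
    for name in preference:
        if name == "Blank":
            break  # formats listed below "Blank" are never used
        body = FORMATS[name](parsed)
        if body:
            return f"{prefix} {body}{suffix}"
    return slug  # not enough parsed data to build a salutation
```

With the defaults, a record parsed from “Mr. John Q. Smith, Jr.” and the preference `["Formal", "Informal", "Blank"]` produces `Dear Mr. Smith;`. If the prefix is missing, the sketch falls through to `Dear John;`, and if neither format can be built it returns the slug.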