Name Object:Config Example
Revision as of 16:41, 19 February 2014
; mdName.cfg – the NameObject configuration file
- If you’ve ever wanted to change how the NameObject parses, genderizes,
- or creates salutations for certain names, you’ll need to understand how to edit the
- mdName.cfg file. This file is used to add, change, or remove entries from the API’s
- stock name tables compiled into the distributed mdName.dat file.
- For detailed definitions and usage, see the actual config file or the documentation.
- The content of this file can be used as the actual mdName.cfg file. Just save the
- unformatted text and rename it to mdName.cfg. Alternatively, you can cut and paste
- the examples below into the actual file. Any line beginning with a semicolon is a
- comment and has no effect on processing. The uncommented lines are actual
- examples of the respective name type.
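The layout described above (semicolon comments, bracketed section headers, comma-separated entries) can be illustrated with a minimal Python sketch. It is not the NameObject's own loader: the function name `parse_name_cfg` is hypothetical, and treating a leading minus sign (as in `-HARDY` or `-scat` further down) as a removal request is an assumption based on the examples in this file.

```python
def parse_name_cfg(text):
    """Group the non-comment lines of an mdName.cfg-style file by section.

    Hypothetical helper for illustration only; the real API reads the file itself.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank lines and semicolon comments have no effect
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {"add": [], "remove": []})
            continue
        if current is None:
            continue  # ignore entries that appear before any section header
        if line.startswith("-"):
            # Assumption: a leading minus removes a stock entry.
            sections[current]["remove"].append(line[1:].strip())
        else:
            fields = [field.strip() for field in line.split(",")]
            sections[current]["add"].append(fields)
    return sections


sample = """\
; comment lines are ignored
[FirstName]
Timotee,7,x,,Timotee
-HARDY
[Suspect]
frakkin,V
-scat
"""

cfg = parse_name_cfg(sample)
```

Empty fields (two commas in a row) are kept as empty strings, so each entry keeps the field positions given in the section's "Format is" line.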
- [Prefix] - List of name prefixes.
- Format is <Prefix>, <Sex>, <Dual Expansion>, <Case>
- Proprietary prefixes can cause names to be split incorrectly and name patterns to be
- misidentified.
- example – change ‘zm phil jackson’ to ‘Zen Master Phil Jackson’ (even though he isn’t)
[Prefix]
zm,M,,Zen Master
Mr and Mrs,,Mr ans Mrs,Mr ans Mrs
- [FirstName] - List of first names (used for name splitting, genderizing).
- Format is <First Name>, <Sex>, <Misspelling>, <Rank>, <Case>
- Adding entries in this section helps split and/or case uncommon or international names
- that are new to existing census or database lists.
[FirstName]
Timotee,7,x,,Timotee
Deshawn,7,x,,DeShawn
-HARDY
- [FirstNameFix] - List of misspelled first names and their corrections.
- Format is <Misspelling>, <Correction>
- Why not just make a spelling correction above, in the [FirstName] <Case> parameter?
- Because sometimes we want the FirstName additions to help in splitting, but aren’t sure of a
- name correction (maybe ‘Mr. Timotee Smith’ is his correctly spelled name).
- Setting the FirstNameSpellingCorrection property tells the NameObject to also use these
- entries to correct misspelled names.
[FirstNameFix]
Timotee,Timothy
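The split between [FirstName] and [FirstNameFix] can be pictured with a small Python sketch: the correction table is only consulted when correction is switched on. This is an illustration, not the API's code. The function `correct_first_name`, its `spelling_correction` flag (standing in for the FirstNameSpellingCorrection property), and the case-insensitive lookup are all assumptions.

```python
def correct_first_name(name, fixes, spelling_correction=False):
    """Return the corrected first name only when correction is enabled.

    `fixes` maps lowercased misspellings to corrections, mirroring the
    <Misspelling>, <Correction> pairs of [FirstNameFix]. Hypothetical helper.
    """
    if not spelling_correction:
        return name  # the name is still usable for splitting, just not changed
    return fixes.get(name.lower(), name)


fixes = {"timotee": "Timothy"}

unchanged = correct_first_name("Timotee", fixes)
corrected = correct_first_name("Timotee", fixes, spelling_correction=True)
```

With correction off, ‘Timotee’ passes through untouched; with it on, the same entry yields ‘Timothy’.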
- [LNPrefix] - List of last name prefixes.
- Format is <Last Name Prefix>, <Case>
- This example will help identify the ‘Ze’ in ‘Frank Ze Bond’ as part of the last name,
- not a middle name
[LNPrefix]
ze,Ze
- [LastName] - List of last names.
- Format is <Last Name>, <Rank>, <O-Name>, <Case>
- Adding entries here is useful for special casing last names. It can also be used to identify
- a solitary “O” as an indicator of an Irish last name. In an example like “joe o jeep”, it is
- assumed that you want the name parsed as “Joe O’Jeep”, while “Joe Ojeep” should not be parsed
- as an Irish name. If you want to add an Irish last name so that both the solitary “O” form
- and the concatenated form, such as “Joe O Spence” and “Joe Ospence”, are returned as
- “Joe O’Spence”, add it as below.
[LastName]
Legrandless,,,LeGrandless
ojeep,,X,Ojeep
ospence,,X,O’Spence
- [Suffix] - List of name suffixes.
- Format is <Suffix>, <Prefix>, <Salutation Remove>, <Dual Name Remove>, <Case>
- Chances are that, with mostly full name formats, unrecognized suffixes will get split
- into the Last Name component. By adding an entry here, we will now correctly split
- a record like ‘John Smith, Grand Poohbah’.
[Suffix]
grand poohbah,GrP,,,Grand PoohbaH
- [DualIndicator] - List of dual name connectors.
- Format is <Dual Name Connector>, <Delete>
- The practical example ‘Trustee for’ is already in the distributed data file, so here is a
- less probable example: ‘john smith married susan jones’
[DualIndicator]
married
- [Suspect] - List of suspicious words & phrases.
- Format is <Word/Phrase>, <Indicator>
- These words still get parsed, but the error code will identify them as vulgar, a company identifier,
- or suspect. There may even be a pre-existing entry which you later determine to be
- a real name. Example: my new boss is ‘Fred Scat’. Ouch.
- NOTE: there is no <Case> parameter for this table.
[Suspect]
frakkin,V
shoes,C
joe the plumber,S
-scat
ABC,C
ZZX,C
DUZ,C
- When an input name is flagged with a [Suspect] company indicator, you may choose
- to pass that input into the StandardizeCompany method. The following two table
- overrides allow you to apply special casing to the returned company.
- [Acronym] - These entries (4 letters or fewer) are NOT acronyms and will be proper cased
- when passed through the StandardizeCompany method.
- Format is <Lookup>
- <Lookup> = A short word that you do not want uppercased like an acronym
- Example: the following may actually be part of a company name, not an acronym,
- as in ‘Duz Brothers Inc’, so we don’t want it all capitalized.
[Acronym]
DUZ
- [Company] - Words and phrases from company names that do not follow common casing rules.
- Format is <Company>, <Case>
- <Company> = The lookup word which requires special casing
- <Case> = The way this lookup word should be cased
- Example: these entries should be identified as companies in the [Suspect] section (see above).
- When the StandardizeCompany method is called, the following substitutions should be made
- when the identified company is actually ‘ABC ZZx’, not ‘Abc Zzx’.
[Company]
ABC,ABC
ZZX,ZZx
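How the [Acronym] and [Company] overrides might interact can be sketched in Python. This is a toy model and not the StandardizeCompany implementation: the function `case_company` is hypothetical, and the default rule used here (a word of four letters or fewer typed in all caps is kept as an acronym) is an assumption made only so the overrides have something to override.

```python
def case_company(name, not_acronyms, company_case):
    """Toy casing pass over a company name.

    not_acronyms: uppercased short words that should be proper cased ([Acronym]).
    company_case: uppercased lookup word -> exact casing to return ([Company]).
    """
    words = []
    for word in name.split():
        key = word.upper()
        if key in company_case:
            words.append(company_case[key])  # explicit casing override wins
        elif len(word) <= 4 and word.isupper() and key not in not_acronyms:
            words.append(word)  # assumed default: short all-caps word stays an acronym
        else:
            words.append(word.capitalize())  # ordinary proper casing
    return " ".join(words)


not_acronyms = {"DUZ"}
company_case = {"ABC": "ABC", "ZZX": "ZZx"}

duz = case_company("DUZ brothers inc", not_acronyms, company_case)
abc = case_company("abc zzx", not_acronyms, company_case)
```

Under these assumptions, ‘DUZ brothers inc’ comes back as ‘Duz Brothers Inc’ because DUZ is listed as not an acronym, and ‘abc zzx’ comes back as ‘ABC ZZx’ because both words have explicit casing entries.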
- [DualPattern] - List of dual name patterns.
- This one is much more advanced than the others, and should not be edited without
- contacting support. While editing the entries above affects only that particular word,
- editing here could negatively affect your entire process.
- Format is <Pattern>, <Counts>, <Name Types>, <Split Type>
- P?&P?,> >,1,1 already exists and helps split ‘Mr. Smithhh and Mrs. Smithhh’
- or ‘Mr. Johnnn Smithhh & Dr. Maryy Lynne Smithhhh’
- ?F&?PF,,6,2 already exists and helps split ‘Smithhhh, John and Dr. Mary’
- Although you may find the examples here impractical, test them out on sample data
- to see how this alternate config file changes NameObject results. And if you ever come up with
- common edits we have overlooked, please let us know. We are always trying to make the
- API even more accurate.