RightFielder Object:Config Example: Difference between revisions

From Melissa Data Wiki
Tim (talk | contribs)
<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns:m="http://schemas.microsoft.com/office/2004/12/omml"
xmlns="http://www.w3.org/TR/REC-html40">
 
<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 12">
<meta name=Originator content="Microsoft Word 12">
<link rel=File-List
href="mdRightFielder.cfg%20Case%20Studies_files/filelist.xml">
<!--[if gte mso 9]><xml>
<o:DocumentProperties>
  <o:Author>Marc Bernier</o:Author>
  <o:LastAuthor>Tim</o:LastAuthor>
  <o:Revision>2</o:Revision>
  <o:TotalTime>405</o:TotalTime>
  <o:Created>2014-02-24T18:01:00Z</o:Created>
  <o:LastSaved>2014-02-24T18:01:00Z</o:LastSaved>
  <o:Pages>5</o:Pages>
  <o:Words>1569</o:Words>
  <o:Characters>8948</o:Characters>
  <o:Company>Microsoft</o:Company>
  <o:Lines>74</o:Lines>
  <o:Paragraphs>20</o:Paragraphs>
  <o:CharactersWithSpaces>10497</o:CharactersWithSpaces>
  <o:Version>12.00</o:Version>
</o:DocumentProperties>
</xml><![endif]-->
<link rel=themeData
href="mdRightFielder.cfg%20Case%20Studies_files/themedata.thmx">
<link rel=colorSchemeMapping
href="mdRightFielder.cfg%20Case%20Studies_files/colorschememapping.xml">
<style>
<!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;
mso-font-charset:2;
mso-generic-font-family:auto;
mso-font-pitch:variable;
mso-font-signature:0 268435456 0 0 -2147483648 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;
mso-font-charset:1;
mso-generic-font-family:roman;
mso-font-format:other;
mso-font-pitch:variable;
mso-font-signature:0 0 0 0 0 0;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;
mso-font-charset:0;
mso-generic-font-family:swiss;
mso-font-pitch:variable;
mso-font-signature:-536870145 1073786111 1 0 415 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{mso-style-unhide:no;
mso-style-qformat:yes;
mso-style-parent:"";
margin-top:0in;
margin-right:0in;
margin-bottom:10.0pt;
margin-left:0in;
line-height:115%;
mso-pagination:widow-orphan;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:Calibri;
mso-fareast-theme-font:minor-latin;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;
mso-bidi-font-family:"Times New Roman";
mso-bidi-theme-font:minor-bidi;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
mso-themecolor:hyperlink;
text-decoration:underline;
text-underline:single;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-noshow:yes;
mso-style-priority:99;
color:purple;
mso-themecolor:followedhyperlink;
text-decoration:underline;
text-underline:single;}
p.MsoNoSpacing, li.MsoNoSpacing, div.MsoNoSpacing
{mso-style-priority:1;
mso-style-unhide:no;
mso-style-qformat:yes;
mso-style-parent:"";
margin:0in;
margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:Calibri;
mso-fareast-theme-font:minor-latin;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;
mso-bidi-font-family:"Times New Roman";
mso-bidi-theme-font:minor-bidi;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
mso-style-unhide:no;
mso-style-qformat:yes;
margin-top:0in;
margin-right:0in;
margin-bottom:10.0pt;
margin-left:.5in;
mso-add-space:auto;
line-height:115%;
mso-pagination:widow-orphan;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:Calibri;
mso-fareast-theme-font:minor-latin;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;
mso-bidi-font-family:"Times New Roman";
mso-bidi-theme-font:minor-bidi;}
p.MsoListParagraphCxSpFirst, li.MsoListParagraphCxSpFirst, div.MsoListParagraphCxSpFirst
{mso-style-priority:34;
mso-style-unhide:no;
mso-style-qformat:yes;
mso-style-type:export-only;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
mso-add-space:auto;
line-height:115%;
mso-pagination:widow-orphan;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:Calibri;
mso-fareast-theme-font:minor-latin;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;
mso-bidi-font-family:"Times New Roman";
mso-bidi-theme-font:minor-bidi;}
p.MsoListParagraphCxSpMiddle, li.MsoListParagraphCxSpMiddle, div.MsoListParagraphCxSpMiddle
{mso-style-priority:34;
mso-style-unhide:no;
mso-style-qformat:yes;
mso-style-type:export-only;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
mso-add-space:auto;
line-height:115%;
mso-pagination:widow-orphan;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:Calibri;
mso-fareast-theme-font:minor-latin;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;
mso-bidi-font-family:"Times New Roman";
mso-bidi-theme-font:minor-bidi;}
p.MsoListParagraphCxSpLast, li.MsoListParagraphCxSpLast, div.MsoListParagraphCxSpLast
{mso-style-priority:34;
mso-style-unhide:no;
mso-style-qformat:yes;
mso-style-type:export-only;
margin-top:0in;
margin-right:0in;
margin-bottom:10.0pt;
margin-left:.5in;
mso-add-space:auto;
line-height:115%;
mso-pagination:widow-orphan;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:Calibri;
mso-fareast-theme-font:minor-latin;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;
mso-bidi-font-family:"Times New Roman";
mso-bidi-theme-font:minor-bidi;}
.MsoChpDefault
{mso-style-type:export-only;
mso-default-props:yes;
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:Calibri;
mso-fareast-theme-font:minor-latin;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;
mso-bidi-font-family:"Times New Roman";
mso-bidi-theme-font:minor-bidi;}
.MsoPapDefault
{mso-style-type:export-only;
margin-bottom:10.0pt;
line-height:115%;}
@page WordSection1
{size:8.5in 11.0in;
margin:.5in .5in .5in .5in;
mso-header-margin:.5in;
mso-footer-margin:.5in;
mso-paper-source:0;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:2086217681;
mso-list-type:hybrid;
mso-list-template-ids:1400944592 67698689 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Symbol;}
@list l1
{mso-list-id:2092189663;
mso-list-type:hybrid;
mso-list-template-ids:-96460662 67698689 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l1:level1
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Symbol;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
-->
</style>
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-qformat:yes;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin-top:0in;
mso-para-margin-right:0in;
mso-para-margin-bottom:10.0pt;
mso-para-margin-left:0in;
line-height:115%;
mso-pagination:widow-orphan;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;}
table.MsoTableGrid
{mso-style-name:"Table Grid";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-priority:59;
mso-style-unhide:no;
border:solid windowtext 1.0pt;
mso-border-alt:solid windowtext .5pt;
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-border-insideh:.5pt solid windowtext;
mso-border-insidev:.5pt solid windowtext;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;}
</style>
<![endif]--><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="7170"/>
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
  <o:idmap v:ext="edit" data="1"/>
</o:shapelayout></xml><![endif]-->
</head>
 
<body lang=EN-US link=blue vlink=purple style='tab-interval:.5in'>
 
<div class=WordSection1>
 
<p class=MsoNormal align=center style='text-align:center'><b style='mso-bidi-font-weight:
normal'>mdRightFielder.cfg: Case Studies<o:p></o:p></b></p>
 
<p class=MsoNormal>The mdRightFielder.cfg file is a plain text file that lets
users tailor Right Fielder Object’s behavior to their specific needs.
It is generally used when input data has a quirk or characteristic
that Right Fielder cannot handle properly on its own.</p>
 
<p class=MsoNormal>This file overrides the default entries in the stock mdRightFielder
lookup tables contained in the mdRightFielder.dat data file. By default, both
mdRightFielder.dat and mdRightFielder.cfg are installed in…</p>
 
<p class=MsoNormal>C:\Program Files\Melissa DATA\DQT\Data or the corresponding
Melissa Data data directory on UNIX-type installations.</p>
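
<p class=MsoNormal>As a quick sanity check before editing, a short script can
confirm that both files are present in the data directory. This is only a
sketch: the function name <b>missing_files</b> is hypothetical, and the default
path is the Windows location quoted above, so pass your own directory on other
systems.</p>

```python
from pathlib import Path

# Default Windows location quoted in the text; supply another directory elsewhere.
DEFAULT_DATA_DIR = Path(r"C:\Program Files\Melissa DATA\DQT\Data")

# The two files the text says are installed side by side.
REQUIRED_FILES = ("mdRightFielder.dat", "mdRightFielder.cfg")


def missing_files(data_dir=DEFAULT_DATA_DIR):
    """Return the names of the Right Fielder files not found in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (data_dir / name).is_file()]
```

<p class=MsoNormal>An empty list means both files were found. Anything else
names the file to look for before making changes.</p>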
 
<p class=MsoNormal>For complete details on the tables and types
that can be overridden, as well as syntax and examples, open
mdRightFielder.cfg in a text editor and follow the instructions it contains.</p>
 
<p class=MsoNormal>There are three types of modifications that can be made in
mdRightFielder.cfg:</p>
 
<p class=MsoListParagraphCxSpFirst style='text-indent:-.25in;mso-list:l1 level1 lfo1'><![if !supportLists]><span
style='font-family:Symbol;mso-fareast-font-family:Symbol;mso-bidi-font-family:
Symbol'><span style='mso-list:Ignore'>·<span style='font:7.0pt "Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span></span><![endif]>Lookup Table – The addition or removal of words
and phrases in the Object’s dictionaries. This essentially expands or
limits Right Fielder’s vocabulary.</p>
 
<p class=MsoListParagraphCxSpMiddle style='text-indent:-.25in;mso-list:l1 level1 lfo1'><![if !supportLists]><span
style='font-family:Symbol;mso-fareast-font-family:Symbol;mso-bidi-font-family:
Symbol'><span style='mso-list:Ignore'>·<span style='font:7.0pt "Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span></span><![endif]>Regular Expression – The addition of regular
expressions used to recognize specific character patterns (for
example, phone numbers, e-mail addresses, etc.).</p>
 
<p class=MsoListParagraphCxSpLast style='text-indent:-.25in;mso-list:l1 level1 lfo1'><![if !supportLists]><span
style='font-family:Symbol;mso-fareast-font-family:Symbol;mso-bidi-font-family:
Symbol'><span style='mso-list:Ignore'>·<span style='font:7.0pt "Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span></span><![endif]>Pattern Table – The addition or removal of patterns
of words and phrases. Words and phrases are first identified via Lookup Tables
and assigned tokens (specified in the Lookup Table itself). Sequences of tokens
(patterns) are matched to entries in this table and transformed into output
data.</p>
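
<p class=MsoNormal>The interplay of the three modification types can be
sketched in a few lines of Python. This is an illustration of the idea only,
not Right Fielder’s actual implementation: every table entry, token code,
regular expression, and function name below is a made-up assumption, as is the
order in which lookups and regular expressions are tried.</p>

```python
import re

# Hypothetical lookup table: maps a word to a token code (codes are invented).
LOOKUP = {"FORD": "C", "MOTORS": "C", "MAIN": "S", "ST": "T"}

# Hypothetical regular expressions: each assigns a token to a character pattern.
REGEXES = [
    (re.compile(r"\d{3}-\d{3}-\d{4}"), "P"),  # phone-like string
    (re.compile(r"\d+"), "N"),                # plain number
]

# Hypothetical pattern table: maps a token sequence to an output field.
PATTERNS = {
    ("C", "C"): "Company",
    ("N", "S", "T"): "Address",
    ("P",): "Phone",
}


def tokenize(text):
    """Assign one token per word: lookup table first, then regexes, else '?'."""
    tokens = []
    for word in text.upper().split():
        if word in LOOKUP:
            tokens.append(LOOKUP[word])
            continue
        for regex, token in REGEXES:
            if regex.fullmatch(word):
                tokens.append(token)
                break
        else:
            tokens.append("?")  # unknown word
    return tuple(tokens)


def classify(text):
    """Return the output field whose pattern equals the token sequence, or None."""
    return PATTERNS.get(tokenize(text))
```

<p class=MsoNormal>In this sketch, adding a word to the lookup table changes
which tokens are produced, and adding a pattern changes which token sequences
are recognized. That is the division of labor described in the list above.</p>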
 
<p class=MsoNormal><b style='mso-bidi-font-weight:normal'><span
style='font-size:14.0pt;line-height:115%'>Case Study 1: Lookup Tables<o:p></o:p></span></b></p>
 
<p class=MsoNormal>There are three lookup tables in Right Fielder Object:</p>
 
<p class=MsoListParagraphCxSpFirst style='text-indent:-.25in;mso-list:l0 level1 lfo2'><![if !supportLists]><span
style='font-family:Symbol;mso-fareast-font-family:Symbol;mso-bidi-font-family:
Symbol'><span style='mso-list:Ignore'>·<span style='font:7.0pt "Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span></span><![endif]>LeftToken – used to recognize words and phrases
that usually appear at the start of data (names, companies, titles)</p>
 
<p class=MsoListParagraphCxSpMiddle style='text-indent:-.25in;mso-list:l0 level1 lfo2'><![if !supportLists]><span
style='font-family:Symbol;mso-fareast-font-family:Symbol;mso-bidi-font-family:
Symbol'><span style='mso-list:Ignore'>·<span style='font:7.0pt "Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span></span><![endif]>MiddleToken – used to recognize words and
phrases that usually appear in the middle of data (addresses, apartments,
PO Boxes, etc.)</p>
 
<p class=MsoListParagraphCxSpLast style='text-indent:-.25in;mso-list:l0 level1 lfo2'><![if !supportLists]><span
style='font-family:Symbol;mso-fareast-font-family:Symbol;mso-bidi-font-family:
Symbol'><span style='mso-list:Ignore'>·<span style='font:7.0pt "Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span></span><![endif]>RightToken – used to recognize words and phrases
that usually appear at the end of data
(city, state, country)</p>
 
<p class=MsoNormal>Say, for example, you are processing a list of car
dealerships and the company recognition isn’t working as well as you would
like. In evaluating the results, it appears that if Right Fielder knew a bit
more about car manufacturers, processing might be a lot more accurate. In this
case, you would modify the LeftToken table in this way:</p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;font-family:"Courier New"'>[LeftToken]<o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;font-family:"Courier New"'>FORD,C<o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;font-family:"Courier New"'>TOYOTA,C<o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;font-family:"Courier New"'>CHEVY,C<o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;font-family:"Courier New"'>NISSAN,C<o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;font-family:"Courier New"'>KIA,C<o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;font-family:"Courier New"'>HONDA,C<o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;font-family:"Courier New"'>LINCOLN,C<o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;font-family:"Courier New"'>ALFA ROMEO,C<o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;font-family:"Courier New"'>MOTORS,C<o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><o:p>&nbsp;</o:p></p>
 
<p class=MsoNormal>Note that it is not necessary for the entries to be sorted.
Also, these entries will override any existing dictionary entries (for example,
‘LINCOLN’ is by default a First Name indicator). The second field, containing
the ‘C’, indicates what kind of word is being described (its ‘token’). Only one
token can be used per entry. Each table has different tokens that can be used
in it; see the mdRightFielder.cfg for details.</p>
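 
<p class=MsoNormal>The override behavior described above can be pictured with a
small Python sketch. It is only an illustration, not Right Fielder’s actual
implementation: the function name, the dictionary standing in for
mdRightFielder.dat, and the ‘F’ token for the stock LINCOLN entry are all
assumptions made for this example.</p>

```python
def parse_lookup_section(lines):
    """Parse 'WORD,TOKEN' lines, skipping blanks, ';' comments and '[Section]' headers."""
    table = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith((";", "[")):
            continue
        word, token = line.rsplit(",", 1)  # one token per entry
        table[word.strip().upper()] = token.strip()
    return table


# Hypothetical stand-in for a stock entry; the 'F' token here is an assumption.
stock = {"LINCOLN": "F"}

cfg_lines = ["[LeftToken]", "FORD,C", "LINCOLN,C", "ALFA ROMEO,C"]
overrides = parse_lookup_section(cfg_lines)

# Entries from the cfg file win over stock entries with the same word.
effective = {**stock, **overrides}
print(effective["LINCOLN"])
```

<p class=MsoNormal>In this sketch the cfg entry for LINCOLN replaces the stock
one, and the order of the cfg lines does not matter.</p>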
 
<p class=MsoNormal><b style='mso-bidi-font-weight:normal'><span
style='font-size:14.0pt;line-height:115%'>Case Study 2: Regular Expression for
a defined Data Type<o:p></o:p></span></b></p>
 
<p class=MsoNormal>Regular expressions are used to recognize postal codes,
e-mail addresses, phone numbers and URLs. These are types of data that can’t
easily be recognized using a dictionary, but whose actual pattern of numbers,
letters and punctuation is very useful in identifying the data. For example,
Canadian Postal Codes always follow the pattern
alpha-digit-alpha-digit-alpha-digit. They are recognized with the regular
expression:</p>
 
<p class=MsoNormal><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;line-height:115%;font-family:"Courier New";color:blue'>(?&lt;=^|
)</span></b><b style='mso-bidi-font-weight:normal'><span style='font-size:12.0pt;
line-height:115%;font-family:"Courier New"'>[a-z][0-9][a-z]<span
style='color:#00B050'>[- ]?</span>[0-9][a-z][0-9]<span style='color:blue'>(?=
|$)</span><o:p></o:p></span></b></p>
 
<p class=MsoNormal>Admittedly, regular expressions are a dark art and are not
very easy to understand. Our best advice is to start with simple expressions
and gradually add complexity, testing each addition before moving on. There are
a few web sites and tools that can greatly ease this trial-and-error process. We
use Rad Software’s Regular Expression Designer (http://www.radsoftware.com.au/regexdesigner/),
but there are many others as well.</p>
 
<p class=MsoNormal>Say, for example, you have data that was run through OCR
software. Unfortunately, in many cases 0’s and 1’s were mistakenly recognized
as O’s and l’s. We can enhance Right Fielder’s recognition of abominations such
as the ZIP Code “Ol234” with this expression:</p>
 
<p class=MsoNormal><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;line-height:115%;font-family:"Courier New";color:blue'>(?&lt;=^|
)</span></b><b style='mso-bidi-font-weight:normal'><span style='font-size:12.0pt;
line-height:115%;font-family:"Courier New"'>[0-9Ol]{5}<span style='color:#00B050'>[-
]?</span>([0-9Ol]{4})?<span style='color:blue'>(?= |$)</span><o:p></o:p></span></b></p>
 
<p class=MsoNormal>This regular expression can be broken up into 5 parts:</p>
 
<table class=MsoTableGrid border=0 cellspacing=0 cellpadding=0
style='border-collapse:collapse;border:none;mso-yfti-tbllook:1184;mso-padding-alt:
0in 5.4pt 0in 5.4pt;mso-border-insideh:none;mso-border-insidev:none'>
<tr style='mso-yfti-irow:0;mso-yfti-firstrow:yes'>
  <td width=133 valign=top style='width:99.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='font-family:"Courier New";color:blue'>(?&lt;=^| )<o:p></o:p></span></p>
  </td>
  <td width=505 valign=top style='width:378.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'>This is a common preamble to most of our regular expressions. It is
  a lookbehind requiring that the start of the string or a space precede the
  ZIP Code, so that we don’t accidentally recognize something that is actually
  the tail of a longer string.</p>
  </td>
</tr>
<tr style='mso-yfti-irow:1'>
  <td width=133 valign=top style='width:99.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='font-family:"Courier New"'>[0-9Ol]{5}<o:p></o:p></span></p>
  </td>
  <td width=505 valign=top style='width:378.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'>This indicates that we need to see any digit, an uppercase O or a
  lowercase l, five times in a row.</p>
  </td>
</tr>
<tr style='mso-yfti-irow:2'>
  <td width=133 valign=top style='width:99.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='font-family:"Courier New";color:#00B050'>[- ]?<o:p></o:p></span></p>
  </td>
  <td width=505 valign=top style='width:378.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'>This indicates that we might see a dash or a space (or neither).
  The ? quantifier means that the preceding character may appear zero times
  or exactly once.</p>
  </td>
</tr>
<tr style='mso-yfti-irow:3'>
  <td width=133 valign=top style='width:99.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='font-family:"Courier New"'>([0-9Ol]{4})?<o:p></o:p></span></p>
  </td>
  <td width=505 valign=top style='width:378.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'>This indicates that we need to see any digit, an uppercase O or a
  lowercase l, four times in a row. However, there’s a catch here: sometimes
  people omit the Plus 4, so the sub-expression is surrounded by parentheses
  and followed by the ? quantifier to make it optional.</p>
  </td>
</tr>
<tr style='mso-yfti-irow:4;mso-yfti-lastrow:yes'>
  <td width=133 valign=top style='width:99.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='font-family:"Courier New";color:blue'>(?= |$)<o:p></o:p></span></p>
  </td>
  <td width=505 valign=top style='width:378.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'>Like the preamble, this indicates that a break or delimiter of some
  sort must follow the ZIP Code.</p>
  </td>
</tr>
</table>
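 
<p class=MsoNormal>The expression can be tried outside Right Fielder before it
goes into the cfg file. The Python sketch below is one way to do that, under two
assumptions: Python’s re module rejects the variable-width lookbehind
(?&lt;=^| ), so the preamble is rewritten in an equivalent form, and the match is
left case-sensitive to follow the descriptions in the table. The function name
is hypothetical.</p>

```python
import re

# Adapted preamble: Python's re module rejects the variable-width lookbehind
# (?<=^| ), so it is rewritten here as (?:^|(?<= )).
OCR_ZIP = re.compile(r"(?:^|(?<= ))[0-9Ol]{5}[- ]?([0-9Ol]{4})?(?= |$)")


def find_ocr_zip(text):
    """Return the first ZIP-like match in text, or None if nothing matches."""
    match = OCR_ZIP.search(text)
    return match.group(0) if match else None


print(find_ocr_zip("Boston MA Ol234"))
print(find_ocr_zip("Boston MA O2l84-l234"))
print(find_ocr_zip("Order X12345"))
```

<p class=MsoNormal>The last input is a useful negative test: the digits follow
an X rather than a space, so the preamble keeps them from being recognized.</p>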
 
<p class=MsoNoSpacing><o:p>&nbsp;</o:p></p>
 
 
<p class=MsoNormal>Now that we have designed our regular expression, we need to add
it to the PostalCodeRegEx table:</p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;font-family:"Courier New"'>[PostalCodeRegEx]<o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:12.0pt;font-family:"Courier New"'>5,<span style='color:blue'>(?&lt;=^|
)</span>[0-9Ol]{5}<span style='color:#00B050'>[- ]?</span>([0-9Ol]{4})?<span
style='color:blue'>(?= |$)</span><o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><o:p>&nbsp;</o:p></p>
 
<p class=MsoNormal>When a regular expression finds a match, the match is
removed from the input data, so a later expression (which may be more fitting)
will not find it. Thus, processing order is important. However, we can’t
simply sort regular expressions like we do with lookup tables. Instead, you
must provide a number (the 5 in our example) that indicates its place in
the regular expression processing order. The lower the number, the sooner the
expression is processed.</p>
 
<p class=MsoNormal>Generally, you want expressions that capture larger amounts
of data to precede expressions that capture smaller amounts. Our ‘canned’
expressions start at 10 and increment by 10, and there are usually not more
than 5 or so expressions per table. Using 5 in this example ensures that this
expression is the first to be evaluated. However, in this case, it is not likely
that order would have made a difference, as the expression does not conflict
with any of the existing expressions.</p>
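 
<p class=MsoNormal>The effect of processing order can be seen in a small Python
sketch. It is a simplified model based only on the description above
(lowest number first, each match removed from the input), not Right Fielder’s
actual code; the function name and the two sample expressions are hypothetical.</p>

```python
import re


def apply_in_order(entries, text):
    """Run (order, pattern) entries lowest order first, removing each match from text."""
    found = []
    for order, pattern in sorted(entries):
        match = re.search(pattern, text)
        if match:
            found.append((order, match.group(0)))
            # Remove the match so later expressions cannot see it.
            text = text[:match.start()] + text[match.end():]
    return found, text


zip5 = r"\d{5}"
zip9 = r"\d{5}-\d{4}"

# Shorter expression first: the Plus 4 is stranded in the leftover text.
short_first = apply_in_order([(5, zip5), (10, zip9)], "MA 02184-1234")
# Longer expression first: the whole ZIP + 4 is captured.
long_first = apply_in_order([(5, zip9), (10, zip5)], "MA 02184-1234")
print(short_first)
print(long_first)
```

<p class=MsoNormal>This is why the expression that captures more data should
carry the lower number.</p>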
 
<p class=MsoNormal>Experienced regex users may be concerned about the preamble
and postamble, as it would appear that we’ve forgotten to include many
delimiters (e.g., tabs, pipes, carriage returns). For the purposes of regular
expression processing, all delimiters are temporarily transformed into spaces,
so the only things that your regular expression really needs to look for
are spaces and the ‘start of string’ and ‘end of string’ indicators. There is
one exception to this rule: the PreProcessRegEx table. Regular
expressions in this table are processed on the raw data, before any transforms
are done.</p>
 
<p class=MsoNormal><b style='mso-bidi-font-weight:normal'><span
style='font-size:14.0pt;line-height:115%'>Case Study 3: Regular Expression for
general data patterns<o:p></o:p></span></b></p>
 
<p class=MsoNormal>We briefly mentioned the PreProcessRegEx table in the
previous example. This can be one of the most powerful tables at your disposal,
as it allows you to address many character-based anomalies that you may see in
your data. In addition, it is the only table that allows you to perform regular
expression-based replacements.</p>
 
<p class=MsoNormal>Say your data sometimes contains a dash between the state
and zip code (for example, “Braintree, MA-02184”). This data anomaly will befuddle
Right Fielder’s state and zip code recognition abilities. However, we can
easily fix this with an entry in the PreProcessRegEx table:</p>
 
<p class=MsoNoSpacing><span style='font-size:9.0pt;font-family:"Courier New"'>[PreProcessRegEx]<o:p></o:p></span></p>
 
<p class=MsoNoSpacing><span style='font-size:9.0pt;font-family:"Courier New"'>5,(?&lt;=^|
|,|[|]|\t|\r|\n)([a-z]{2,3})(?:[-]?)(\d{5}(?:-\d{4})?)(?=$| |,|[|]|\t|\r|\n),$1
$2<o:p></o:p></span></p>
 
<p class=MsoNoSpacing><o:p>&nbsp;</o:p></p>
 
<p class=MsoNormal>The regular expression can be broken up into these parts:</p>
 
<table class=MsoTableGrid border=0 cellspacing=0 cellpadding=0
style='border-collapse:collapse;border:none;mso-yfti-tbllook:1184;mso-padding-alt:
0in 5.4pt 0in 5.4pt;mso-border-insideh:none;mso-border-insidev:none'>
<tr style='mso-yfti-irow:0;mso-yfti-firstrow:yes'>
  <td width=157 valign=top style='width:117.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'>(?&lt;=^|
  |,|[|]|\t|\r|\n)<o:p></o:p></span></p>
  </td>
  <td width=481 valign=top style='width:360.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'>The
  regular expression’s preamble. Unlike the other tables, we must search for
  all sorts of delimiters and breaks, as pre-process regular expressions are
  matched to the raw input data.<o:p></o:p></span></p>
  </td>
</tr>
<tr style='mso-yfti-irow:1'>
  <td width=157 valign=top style='width:117.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'>([a-z]{2,3})<o:p></o:p></span></p>
  </td>
  <td width=481 valign=top style='width:360.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'>A
  two- or three-letter sequence (i.e., the state). Note that regular expressions are
  processed case-insensitively (though you can override this with the ?-i: option).<o:p></o:p></span></p>
  </td>
</tr>
<tr style='mso-yfti-irow:2'>
  <td width=157 valign=top style='width:117.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'>(?:[-]?)<o:p></o:p></span></p>
  </td>
  <td width=481 valign=top style='width:360.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'>The
  offending dash, made optional by the ? quantifier. We’ve used the non-capturing
  group construct (?: because we’ll be throwing this group away.<o:p></o:p></span></p>
  </td>
</tr>
<tr style='mso-yfti-irow:3'>
  <td width=157 valign=top style='width:117.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'>(\d{5}(?:-\d{4})?)<o:p></o:p></span></p>
  </td>
  <td width=481 valign=top style='width:360.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'>The
  ZIP Code (and optional Plus 4).<o:p></o:p></span></p>
  </td>
</tr>
<tr style='mso-yfti-irow:4;mso-yfti-lastrow:yes'>
  <td width=157 valign=top style='width:117.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'>(?=$|
  |,|[|]|\t|\r|\n)<o:p></o:p></span></p>
  </td>
  <td width=481 valign=top style='width:360.9pt;padding:0in 5.4pt 0in 5.4pt'>
  <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
  normal'><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'>The
  postamble.<o:p></o:p></span></p>
  </td>
</tr>
</table>
 
<p class=MsoNoSpacing><o:p>&nbsp;</o:p></p>
 
<p class=MsoNormal><o:p>&nbsp;</o:p></p>
 
<p class=MsoNormal>When a regular expression matches, we can use the “$1 $2”
to perform the replacement. Each numbered $ entry refers to a capture group.
In our example, $1 is whatever is captured by the first capture group
([a-z]{2,3}), and $2 is what is captured by the second capture group <span
style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'>(\d{5}(?:-\d{4})?).
Non-capturing groups (the ones starting with (?:) are not counted, so the dash
is dropped and the space between $1 and $2 takes its place.<o:p></o:p></span></p>
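 
<p class=MsoNormal>The replacement can be rehearsed in Python before the entry
is added to the cfg file. This is a sketch under stated assumptions: Python’s re
module needs a fixed-width lookbehind, so the preamble is rewritten in an
equivalent form, and Python spells the replacement with backslashes instead of
the $1 $2 used by the cfg file. The function name is hypothetical.</p>

```python
import re

# Adapted from the PreProcessRegEx entry: the lookbehind is rewritten as
# (?:^|(?<=[...])) because Python's re needs fixed-width lookbehinds.
STATE_DASH_ZIP = re.compile(
    r"(?:^|(?<=[ ,|\t\r\n]))([a-z]{2,3})(?:[-]?)(\d{5}(?:-\d{4})?)(?=$| |,|[|]|\t|\r|\n)",
    re.IGNORECASE,  # the text above says expressions are case-insensitive
)


def split_state_zip(raw):
    """Rewrite 'MA-02184' as 'MA 02184' (group 1, a space, then group 2)."""
    return STATE_DASH_ZIP.sub(r"\1 \2", raw)


print(split_state_zip("Braintree, MA-02184"))
print(split_state_zip("Boston|MA-02110-1234"))
```

<p class=MsoNormal>Input that already has a space between the state and the ZIP
Code passes through unchanged, because the expression requires the digits to
follow the letters directly or after a dash.</p>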
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:14.0pt'>Case Study 4: Example mdRightFielder.cfg overrides<o:p></o:p></span></b></p>
 
<p class=MsoNoSpacing><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'><o:p>&nbsp;</o:p></span></p>
 
<p class=MsoNoSpacing><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'>[PreProcessRegEx]<o:p></o:p></span></p>
 
<p class=MsoNoSpacing><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin;color:blue'>4,(?&lt;=^|
|[|]|\t|\r|\n)(.*)(a/o|A/O|c/o|C/O)(.*)(?=$| |[|]|\t|\r|\n),$1 | $3<o:p></o:p></span></p>
 
<p class=MsoNoSpacing><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'><o:p>&nbsp;</o:p></span></p>
 
<p class=MsoNoSpacing><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'>This expression allows Right Fielder to parse ‘</span><span
style='font-family:"Courier New"'>XYZ Corporation a/o Billy McMailreceiver</span><span
style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:minor-latin'>’ as:<o:p></o:p></span></p>
 
<p class=MsoNoSpacing><span style='font-family:"Courier New"'><o:p>&nbsp;</o:p></span></p>
 
<p class=MsoNoSpacing><span style='font-family:"Courier New";color:blue'><span
style='mso-spacerun:yes'>        </span>Name1: Billy McMailreceiver<o:p></o:p></span></p>
 
<p class=MsoNoSpacing><span style='font-family:"Courier New";color:blue'><span
style='mso-spacerun:yes'>     </span>Company1: XYZ Corporation<o:p></o:p></span></p>
 
<p class=MsoNoSpacing><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'><o:p>&nbsp;</o:p></span></p>
 
<p class=MsoNoSpacing><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'>Without this expression the output will be…<o:p></o:p></span></p>
 
<p class=MsoNoSpacing><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'><o:p>&nbsp;</o:p></span></p>
 
<p class=MsoNoSpacing><span style='font-family:"Courier New";color:blue'><span
style='mso-spacerun:yes'>     </span>Company1: XYZ Corporation a/o Billy
McMailreceiver<o:p></o:p></span></p>
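 
<p class=MsoNormal>What the PreProcessRegEx entry above does to the raw input
can be sketched in Python. The assumptions here are that the inserted pipe then
acts as a field delimiter for Right Fielder, and that the preamble may be
rewritten in a form Python’s re module accepts; the function name is
hypothetical.</p>

```python
import re

# Adapted preamble: (?:^|(?<=[...])) replaces the variable-width lookbehind.
CARE_OF = re.compile(
    r"(?:^|(?<=[ |\t\r\n]))(.*)(a/o|A/O|c/o|C/O)(.*)(?=$| |[|]|\t|\r|\n)"
)


def split_care_of(raw):
    """Replace the a/o or c/o marker with a pipe so each side becomes its own field."""
    return CARE_OF.sub(r"\1 | \3", raw)


rewritten = split_care_of("XYZ Corporation a/o Billy McMailreceiver")
fields = [part.strip() for part in rewritten.split("|")]
print(fields)
```

<p class=MsoNormal>The marker itself is the second capture group, which the
replacement leaves out, so only the company and the name survive.</p>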
 
<p class=MsoNormal><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'><o:p>&nbsp;</o:p></span></p>
 
<p class=MsoNoSpacing>[LeftToken]</p>
 
<p class=MsoNoSpacing><span style='color:#00B050'>; <span
style='mso-spacerun:yes'> </span>these entries help uncommon names get
recognized as names<o:p></o:p></span></p>
 
<p class=MsoNoSpacing>SMOKY THE BEAR,F</p>
 
<p class=MsoNoSpacing>DONTRELL,F</p>
 
<p class=MsoNoSpacing>BOGDAN,F</p>
 
<p class=MsoNoSpacing><span style='color:#00B050'><o:p>&nbsp;</o:p></span></p>
 
<p class=MsoNoSpacing><span style='color:#00B050'>; many tokens that appear to identify
departments are actually entered as Company identifiers by default.<o:p></o:p></span></p>
 
<p class=MsoNoSpacing><span style='color:#00B050'>; the following entries create
distinct department identifiers, overriding tokens sometimes present in
companies<o:p></o:p></span></p>
 
<p class=MsoNoSpacing>IT DEPARTMENT,T</p>
 
<p class=MsoNoSpacing>SALES DEPARTMENT,T</p>
 
<p class=MsoNoSpacing>QA,T</p>
 
<p class=MsoNormal><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'><o:p>&nbsp;</o:p></span></p>
 
<p class=MsoNoSpacing>[RightToken]</p>
 
<p class=MsoNoSpacing><span style='color:#00B050'>; this expands identification
of unheard of, changed, or vanity city names<o:p></o:p></span></p>
 
<p class=MsoNoSpacing>ANYTOWN,T,,100</p>
 
<p class=MsoNoSpacing><span style='color:#00B050'>; alternate spelling or
fictional country<o:p></o:p></span></p>
 
<p class=MsoNoSpacing>SHIRE,I,,</p>
 
<p class=MsoNoSpacing><o:p>&nbsp;</o:p></p>
 
<p class=MsoNoSpacing>[PhoneTypeToken]</p>
 
<p class=MsoNoSpacing><span style='color:#00B050'>; yesterday’s or tomorrow’s
phone identifiers<o:p></o:p></span></p>
 
<p class=MsoNoSpacing>MOBILE</p>
 
<p class=MsoNoSpacing>PAGER</p>
 
<p class=MsoNoSpacing>BLACKBERRY</p>
 
<p class=MsoNormal><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'><o:p>&nbsp;</o:p></span></p>
 
<p class=MsoNormal><b style='mso-bidi-font-weight:normal'><span
style='font-size:14.0pt;line-height:115%'>Case Study 5: NOTES<o:p></o:p></span></b></p>
 
<p class=MsoNormal>When writing a regular expression entry in the cfg
file, the only commas allowed are those that delimit the &lt;id&gt;,
the &lt;regEx&gt; and the &lt;replace&gt;, because Right Fielder uses these
commas to parse the entry.</p>
 
<p class=MsoNormal>Example of a valid entry:</p>
 
<p class=MsoNoSpacing><span style='font-size:9.0pt;font-family:"Courier New"'>[PreProcessRegEx]<o:p></o:p></span></p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:9.0pt;font-family:"Courier New";color:#00B050'>5</span></b><span
style='font-size:9.0pt;font-family:"Courier New"'>,(?&lt;=^|
|[|]|\t|\r|\n)([a-z]{2,3})(?:[-]?)(\d{5}(?:-\d{4})?)(?=$| |[|]|\t|\r|\n),$1
$2<o:p></o:p></span></p>
 
<p class=MsoNormal><o:p>&nbsp;</o:p></p>
 
<p class=MsoNormal>Example of an invalid entry (the expression will be ignored because of the commas highlighted in red):</p>
 
<p class=MsoNoSpacing><span style='font-size:9.0pt;font-family:"Courier New"'>[PreProcessRegEx]<o:p></o:p></span></p>
 
<p class=MsoNoSpacing><b style='mso-bidi-font-weight:normal'><span
style='font-size:9.0pt;font-family:"Courier New";color:#00B050'>5</span></b><span
style='font-size:9.0pt;font-family:"Courier New"'>,(?&lt;=^| <b
style='mso-bidi-font-weight:normal'><span style='color:red'>|,|</span></b>[|]|\t|\r|\n)([a-z]{2,3})(?:[-]?)(\d{5}(?:-\d{4})?)(?=$|
<b style='mso-bidi-font-weight:normal'><span style='color:red'>|,|</span></b>[|]|\t|\r|\n),$1
$2<o:p></o:p></span></p>
 
<p class=MsoNormal><o:p>&nbsp;</o:p></p>
 
<p class=MsoNormal><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'>The &lt;id&gt; in the above example must be unique. If you create
a new expression with the same &lt;id&gt; as another cfg edit or an existing
mdRightFielder.dat entry, the new expression will be ignored.<o:p></o:p></span></p>
 
<p class=MsoNormal><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'>Existing pattern &lt;id&gt;s in the mdRightFielder.dat file start with id=10
and increment by 10, so to override an existing pattern, use a single-digit
&lt;id&gt;; to add a lower-priority pattern, use a higher number that is not
already in use.<o:p></o:p></span></p>
 
<p class=MsoNormal><span style='mso-bidi-font-family:Calibri;mso-bidi-theme-font:
minor-latin'><o:p>&nbsp;</o:p></span></p>
 
</div>
 
</body>
 
</html>

Revision as of 18:04, 24 February 2014

mdRightFielder.cfg: Case Studies

The mdRightFielder.cfg file is a plain text file that users can use to tailor Right Fielder Object’s behavior to meet their specific needs. Generally, this is used when a user has input data with a specific quirk or characteristic that Right Fielder on its own can’t handle properly.

This file is used to override the default entries from the stock mdRightFielder lookup tables contained in the mdRightFielder.dat data file. By default, both mdRightFielder.dat and mdRightFielder.cfg are installed in… C:\Program Files\Melissa DATA\DQT\Data or the respective Melissa Data data directory in UNIX type OS installations.

For complete instructions of available tables and types which can be overridden, as well as syntax and examples, open the mdRightFielder.cfg in a text editor and follow the instructions.

There are 3 types of modifications that can be made in mdRightFielder.cfg:

• Lookup Table – The addition or removal of words (and phrases) to the Object’s dictionaries. This essentially expands (or limits) Right Fielder’s vocabulary.

• Regular Expression – The addition of regular expressions that are used to recognize specific character patterns (for example, phone numbers, e-mails, etc).

• Pattern Table – The addition or removal of patterns of words and phrases. Words and phrases are first identified via Lookup Tables and assigned tokens (specified in the Lookup Table itself). Sequences of tokens (patterns) are matched to entries in this table and transformed into output data.