RightFielder API:Best Practices
Revision as of 14:47, 21 February 2014
Order Input Data in a Logical Order
BP_RFOB_001
If your database structure has columns in a non-intuitive order, reorder the respective columns
to represent a more common order (think of the order for a mailing label). This will help RightFielder
process more accurately.
Example Table layout: ADDRESS FULLNAME CSZ
Recommended Order: FULLNAME ADDRESS CSZ
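The reordering can be done in your own code before the data reaches RightFielder. The following is a minimal sketch, not part of the RightFielder API; the function name and the sample record are hypothetical, and the column names are taken from the example layout above.

```python
# Hypothetical column names, taken from the example layout above.
MAILING_LABEL_ORDER = ["FULLNAME", "ADDRESS", "CSZ"]


def reorder_record(record, order=MAILING_LABEL_ORDER):
    """Return the record's values in mailing-label order."""
    return [record[name] for name in order]


# A row as it might come out of a table laid out as ADDRESS FULLNAME CSZ.
row = {
    "ADDRESS": "12 Main St",
    "FULLNAME": "Sven Miller",
    "CSZ": "Anytown, MA 01234",
}
ordered = reorder_record(row)
```

Here `ordered` holds the name first, then the street address, then the city/state/ZIP value.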
Maintain Hard Delimiters (and Placement)
BP_RFOB_002
If the table to be processed originally had hard delimiters (tab, pipe, CRLF), maintain or restore
these delimiters before processing with RightFielder. This will help RightFielder process more
accurately.
Example Table layout: FULLNAME FULLNAME2 ADDRESS CSZ
Recommended structure: FULLNAME | FULLNAME2 | ADDRESS | CSZ
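One way to restore delimiters and keep their placement is to join the column values yourself, emitting an empty slot for a missing value rather than dropping it. This is a sketch under that assumption; `join_with_delimiter` is a hypothetical helper, not a RightFielder function.

```python
def join_with_delimiter(fields, delimiter="|"):
    """Join field values with a hard delimiter, keeping empty slots in place."""
    # A missing value becomes an empty string so later fields do not shift left.
    cleaned = ["" if value is None else str(value).strip() for value in fields]
    return f" {delimiter} ".join(cleaned)


# FULLNAME2 is missing here, but its delimiter position is preserved.
line = join_with_delimiter(
    ["Sven Miller", None, "12 Main St", "Anytown, MA 01234"]
)
```

The resulting line still contains three pipe characters, so the address stays in the third position even though the second name is empty.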
Only Process Required Fields
BP_RFOB_003
If the table contains unneeded fields or proprietary data types, do NOT add them to the input data
to be processed (unless you have created a SetUserPattern to handle them).
Example Table layout: FULLNAME | BANKACCOUNT | ADDRESS | NEIGHBORHOOD_DESCRIPTION | CSZ
Recommended input: FULLNAME | ADDRESS | CSZ
Note: if the BANKACCOUNT numbers are always formatted in a consistent manner, create a
SetUserPattern for accurate processing.
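Dropping the unneeded columns can be done while the input line is being built. The sketch below assumes records are available as dictionaries keyed by column name; `select_required` and the sample values are hypothetical.

```python
# Columns worth sending to RightFielder, per the example above.
REQUIRED_FIELDS = ["FULLNAME", "ADDRESS", "CSZ"]


def select_required(record, required=REQUIRED_FIELDS):
    """Build a pipe-delimited input line from the required columns only."""
    return " | ".join(record.get(name, "") for name in required)


full_row = {
    "FULLNAME": "Sven Miller",
    "BANKACCOUNT": "000-111-222",
    "ADDRESS": "12 Main St",
    "NEIGHBORHOOD_DESCRIPTION": "Quiet street near the park",
    "CSZ": "Anytown, MA 01234",
}
input_line = select_required(full_row)
```

Only the name, address, and city/state/ZIP values end up in `input_line`; the account number and free-text description are left out.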
Maintain a Consistent Design of Source Data
BP_RFOB_004
Design the input with RightFielder in mind by considering how the data to be processed is collected.
If you are developing a user interface with edit controls for input, design for multi-row input that
your application can pad with hard delimiters.
Customize Handling for Proprietary Data Types
BP_RFOB_005
If you have proprietary data embedded throughout your source data, create a
valid regular expression for the SetUserPattern method. This helps RightFielder
identify an otherwise unknown data format (a part number or an account number, for example) and
place it in its own category. See the RightFielder documentation and the mdRightFielder.cfg example for more details.
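Before registering a pattern, it can help to test the regular expression against sample records on its own. The sketch below uses Python's standard `re` module only; it does not call RightFielder, and the account-number format ("AC-" followed by eight digits) is invented for illustration. The regular expression dialect that SetUserPattern accepts is not stated here, so check the RightFielder documentation before reusing a pattern verbatim.

```python
import re

# Hypothetical account-number format: "AC-" followed by exactly eight digits.
ACCOUNT_PATTERN = re.compile(r"\bAC-\d{8}\b")


def find_accounts(text):
    """Return every substring of text that matches the account pattern."""
    return ACCOUNT_PATTERN.findall(text)


sample = "Sven Miller AC-00123456 12 Main St Anytown, MA 01234"
matches = find_accounts(sample)
```

The ZIP code and house number in `sample` are not matched, which is the behavior you want from a pattern meant to isolate one proprietary field.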
Override the Distributed Data Files for Keywords with Alternate Meanings
BP_RFOB_006
RightFielder allows you to add to or override the behavior of known keywords contained
in the default data file, mdRightFielder.dat. Open the mdRightFielder.cfg file in a text editor for instructions,
syntax, and examples on how to do this.
Example: Xyz Smith. Depending on the rest of the input tokens, 'Xyz' may not be recognized as a name, but
adding 'Xyz' as a name in the cfg file will help process records with previously undefined tokens.
Supply Data Beyond Minimum Input
BP_RFOB_007
One of the previous Best Practices recommends against processing too much input. Isn't this suggestion contradictory?
No. Configuring RightFielder means striking a balance between too little input and too much input.
Since tokens can have multiple meanings, input data will be fielded more accurately if a minimum amount of input
is supplied. While no strict standard can be applied, an input that represents a true contact record should be
expected to process more accurately than an input record with a single input token.
Example, less than minimum input: sven miller
Example, more data: sven miller | anytown, ma 01234
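An application can flag records that are likely too sparse before sending them for processing. The check below is a rough sketch only: since no strict standard applies, the threshold of two non-empty delimited segments is an arbitrary assumption, and `has_minimum_input` is a hypothetical helper rather than anything RightFielder provides.

```python
def has_minimum_input(record, delimiter="|", min_segments=2):
    """Return True if the record has at least min_segments non-empty segments.

    The default threshold is an illustrative assumption, not a RightFielder rule.
    """
    segments = [segment.strip() for segment in record.split(delimiter)]
    return sum(1 for segment in segments if segment) >= min_segments


sparse = has_minimum_input("sven miller")
fuller = has_minimum_input("sven miller | anytown, ma 01234")
```

With the default threshold, the name-only record is flagged as sparse and the record with a city, state, and ZIP code passes.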