Matchcode Optimization:Containment
Latest revision as of 23:00, 26 September 2018
Containment
Specifics
- Matches when one record's component is contained in another record. For example, “Smith” is contained in “Smithfield.”
Summary
- This algorithm looks at the record’s component and determines whether that component is contained in the record it is attempting to match.
Returns
- Returns true if one record’s component is contained in another record.
Example Matchcode Component
Example Data
STRING1 | STRING2 | RESULT
---|---|---
Johnson | Jhnsn | Unique
Mild Hatter | Mild Hatter Wks | Match
Smith | Smithfield | Match
Melissa | Eli | Match
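The example data can be reproduced with a plain substring test. The following is a minimal sketch, not MatchUp's actual implementation: the function name `containment_match` is hypothetical, and it assumes the comparison is case-insensitive and works in either direction (both inferred from the "Melissa" / "Eli" row), and that empty values never match.

```python
def containment_match(a: str, b: str) -> bool:
    """Return True if either value is contained in the other.

    Assumptions (not confirmed by the MatchUp documentation):
    comparison ignores case and surrounding whitespace, and an
    empty value never matches anything.
    """
    a_norm = a.strip().lower()
    b_norm = b.strip().lower()
    if not a_norm or not b_norm:
        return False
    # Check both directions: a inside b, or b inside a.
    return a_norm in b_norm or b_norm in a_norm


# The rows from the example data table.
examples = [
    ("Johnson", "Jhnsn"),
    ("Mild Hatter", "Mild Hatter Wks"),
    ("Smith", "Smithfield"),
    ("Melissa", "Eli"),
]

for string1, string2 in examples:
    result = "Match" if containment_match(string1, string2) else "Unique"
    print(f"{string1!r} vs {string2!r}: {result}")
```

Under these assumptions, "Johnson" and "Jhnsn" come out as Unique because neither spelling appears whole inside the other, while the remaining three pairs match.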
Rating scale | Range
---|---
Performance | Slower to Faster
Matches | More Matches to Greater Accuracy
Recommended Usage
- Hybrid Deduper, where a single incoming record can quickly be evaluated independently against each record in an existing large master database.
- Batch or Enterprise runs where the first component allows efficient clustering.
- Databases where unrecognized keyword variations appear in some of the records.
- Data where the entire value or string of one record's component is contained in the other record's component.
Not Recommended For
- Short name string comparison.
- Gather/scatter, survivorship, or record consolidation of sensitive data.
- Quantifiable data or records with proprietary keywords not associated in our knowledgebase tables.
Do Not Use With
- UTF-8 data. This algorithm was ported to MatchUp with the assumption that a character equals one byte, and therefore results may not be accurate if the data contains multi-byte characters.
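
To see why the one-character-equals-one-byte assumption breaks down, compare the character count of a value with its UTF-8 byte count. The sketch below is only an illustration of that mismatch and of one possible way to screen data beforehand; the helper name `has_multibyte_chars` is hypothetical and is not part of MatchUp.

```python
def has_multibyte_chars(value: str) -> bool:
    """Return True if any character needs more than one byte in UTF-8."""
    # For pure single-byte (ASCII) text the two lengths are equal.
    return len(value.encode("utf-8")) != len(value)


for name in ["Smith", "José", "Müller"]:
    chars = len(name)
    utf8_bytes = len(name.encode("utf-8"))
    flag = "multi-byte" if has_multibyte_chars(name) else "single-byte only"
    print(f"{name}: {chars} characters, {utf8_bytes} bytes ({flag})")
```

Values flagged as multi-byte are the ones for which a byte-oriented comparison may give inaccurate results.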