Matchcode Optimization:Exact
Latest revision as of 23:12, 26 September 2018
Exact
Specifics
- Determines whether two values are identical.
Summary
- Two values are compared against each other and determined to be a match if they are exactly the same.
Returns
- Returns a match if two values are exactly the same.
Example Matchcode Component
Example Data
STRING1 | STRING2 | RESULT
---|---|---
Johnson | Jhnsn | Unique
Smith | Smith | Match
Beaumarchais | Bumarchay | Unique
Deanardo | Dinardio | Unique
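The comparison behind the example data can be sketched in a few lines of Python. This is only an illustration, not MatchUp's implementation: the function name `exact_match` and the `EXAMPLE_DATA` list are hypothetical, and the sketch assumes the values are compared as-is, with no case folding or whitespace trimming, since this page does not say whether any normalization is applied first.

```python
def exact_match(string1: str, string2: str) -> str:
    """Return "Match" when both values are identical, otherwise "Unique".

    Assumption: values are compared exactly as given (no normalization).
    """
    return "Match" if string1 == string2 else "Unique"


# The STRING1/STRING2 pairs from the Example Data table.
EXAMPLE_DATA = [
    ("Johnson", "Jhnsn"),
    ("Smith", "Smith"),
    ("Beaumarchais", "Bumarchay"),
    ("Deanardo", "Dinardio"),
]

results = [exact_match(first, second) for first, second in EXAMPLE_DATA]

for (first, second), result in zip(EXAMPLE_DATA, results):
    print(f"{first} | {second} | {result}")
```

Only the Smith/Smith pair is reported as a match; any difference at all, however small, leaves the records unique.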
Performance: scale from Slower to Faster
Matches: scale from More Matches to Greater Accuracy
Recommended Usage
- Hybrid deduper, where a single incoming record can quickly be evaluated independently against each record in an existing large master database.
- Batch processes where NGRAM is set on a single non-first matchcode component.
- Databases created with abbreviations or similar word substitutions.
- Multi-word field data where a trailing word does not appear in every record in the expected group, or where the data contains acceptable variations of one of the keywords.
Not Recommended For
- Databases where the number of errors relative to the string length results in a small number of common substrings.
- Gather/scatter, survivorship, or record consolidation of sensitive data.
- Quantifiable data or records with proprietary keywords not associated in our knowledgebase tables.
Do Not Use With
- UTF-8 data. This algorithm was ported to MatchUp with the assumption that a character equals one byte, and therefore results may not be accurate if the data contains multi-byte characters.
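To see why the one-byte-per-character assumption matters, the following Python sketch compares character counts with UTF-8 byte counts. It does not reproduce MatchUp's internals; the helper name `byte_and_char_lengths`, the sample values, and the 10-byte cut are all illustrative assumptions.

```python
def byte_and_char_lengths(value: str) -> tuple:
    """Return (UTF-8 byte count, character count) for a value."""
    return len(value.encode("utf-8")), len(value)


# Plain ASCII: one byte per character, so the two counts agree.
ascii_lengths = byte_and_char_lengths("Smith")

# "\u00e9" (e with acute accent) takes two bytes in UTF-8,
# so the byte count exceeds the character count.
accented = "Beaumarch\u00e9"
accented_lengths = byte_and_char_lengths(accented)

# Hypothetical fixed-width handling: cutting at 10 bytes splits the
# two-byte character, leaving bytes that are no longer valid UTF-8.
truncated = accented.encode("utf-8")[:10]
try:
    truncated.decode("utf-8")
    truncation_is_valid = True
except UnicodeDecodeError:
    truncation_is_valid = False

print(ascii_lengths, accented_lengths, truncation_is_valid)
```

Any logic that measures or slices values by bytes while treating each byte as a character will misjudge lengths and boundaries for data like the second value.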