Matchcode Optimization:Consonants
Latest revision as of 22:58, 26 September 2018
Consonants
Specifics
- Only consonants are compared; vowels are removed first. This algorithm treats ‘Y’ as a vowel.
Summary
- This algorithm removes the vowels from each string and compares the two strings by their remaining consonants.
Returns
- Returns a match if the two strings’ consonants match exactly (same consonants in the same order).
Example Matchcode Component
- [Figure: MCO_Algorithm_Consonants.png]
Example Data
STRING1 | STRING2 | RESULT
---|---|---
Ron Doe | Ron Doe67 | Match
Lynda | Dylan | Unique
Tim | Tom | Match
Brian | Ian | Unique
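The comparison described above can be sketched in a few lines of Python. This is only an illustration, not MatchUp's implementation: the function names `consonant_key` and `consonants_match` are hypothetical, and the sketch assumes that case is ignored and that anything other than an ASCII consonant (digits, spaces, punctuation) is dropped, which is consistent with "Ron Doe" and "Ron Doe67" being reported as a match.

```python
import string

# Assumption: vowels are A, E, I, O, U plus Y (the page states that Y is a vowel).
VOWELS = set("AEIOUY")
CONSONANTS = set(string.ascii_uppercase) - VOWELS


def consonant_key(text: str) -> str:
    """Return the ASCII consonants of text, uppercased, in their original order."""
    # Assumption: digits, spaces and punctuation are ignored, and case is folded.
    return "".join(c for c in (ch.upper() for ch in text) if c in CONSONANTS)


def consonants_match(a: str, b: str) -> str:
    """Return "Match" when both strings reduce to the same consonant key."""
    return "Match" if consonant_key(a) == consonant_key(b) else "Unique"


if __name__ == "__main__":
    rows = [("Ron Doe", "Ron Doe67"), ("Lynda", "Dylan"), ("Tim", "Tom"), ("Brian", "Ian")]
    for s1, s2 in rows:
        print(s1, "|", s2, "|", consonants_match(s1, s2))
```

Under these assumptions "Lynda" reduces to `LND` and "Dylan" to `DLN`, so the pair is Unique even though both contain the same consonants: order matters.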
Rating | From | To
---|---|---
Performance | Slower | Faster
Matches | More Matches | Greater Accuracy
Recommended Usage
- Hybrid Deduper, where a single incoming record can quickly be evaluated independently against each record in an existing large master database.
- Databases created via real-time data entry where audio likeness errors are introduced.
- Databases of US and English language origin.
Not Recommended For
- Databases where disemvoweling was used to reduce storage space (https://en.wikipedia.org/wiki/Disemvoweling).
Do Not Use With
- UTF-8 data. This algorithm was ported to MatchUp with the assumption that a character equals one byte, and therefore results may not be accurate if the data contains multi-byte characters.
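To see why the one-byte-per-character assumption matters, the short Python sketch below compares a string's character count with its UTF-8 byte count. It says nothing about MatchUp's internals; the helper name `char_and_byte_lengths` is hypothetical and only demonstrates that the two counts diverge once a multi-byte character appears.

```python
def char_and_byte_lengths(text: str) -> tuple:
    """Return (number of characters, number of UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))


if __name__ == "__main__":
    # Plain ASCII: one byte per character, so both counts agree.
    print(char_and_byte_lengths("Tom"))
    # "\u00e9" (e with acute accent) takes two bytes in UTF-8,
    # so the byte count exceeds the character count.
    print(char_and_byte_lengths("Jos\u00e9"))
```

Code that walks such data one byte at a time would treat each half of the two-byte sequence as a separate character, which is the kind of inaccuracy the warning above describes.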