Matchcode Optimization:Frequency
Frequency
Specifics
- The Frequency algorithm matches the characters of one string against the characters of another without regard to their order.
Summary
- Frequency can be used when two strings are expected to contain the same characters and to be the same length. For example, "abcdef" would be considered a 100% match to "badcfe". It should not be used to match strings with differing numbers of characters: "wxyz" would match neither "wzy" nor "wzzy".
Returns
- A Boolean "match" when the compared data contains the same values.
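The behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not MatchUp's actual implementation: the function name `frequency_match` is hypothetical, and the case folding is an assumption inferred from the example data, where "Lynda" and "Dylan" are reported as a match.

```python
from collections import Counter


def frequency_match(a, b):
    """Return True when both strings hold the same characters the same
    number of times, regardless of order (hypothetical helper)."""
    # Assumption: comparison ignores case, as the "Lynda"/"Dylan"
    # example suggests. Spaces are counted like any other character.
    return Counter(a.casefold()) == Counter(b.casefold())


print(frequency_match("abcdef", "badcfe"))  # True
print(frequency_match("wxyz", "wzy"))       # False
print(frequency_match("wxyz", "wzzy"))      # False
```

Comparing character counts rather than sorted strings keeps the check linear in the length of the input, which suits short fields such as names.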
Example Matchcode Component
Example Data
STRING1 | STRING2 | RESULT
---|---|---
Johnson | Jhnsn | Unique
Johnson | Johnosn | Match Found
Lynda | Dylan | Match Found
A B D H T | A T B H D | Match Found
Scale | From | To
---|---|---
Performance | Slower | Faster
Matches | More Matches | Greater Accuracy
Recommended Usage
- Batch processing. This is a fast algorithm that identifies a greater percentage of duplicates because it counts both exact matches and minor character transpositions as matches.
- This algorithm is also recommended when the data consists of single-character dictionary values such as "A B C".
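For batch work, one way a frequency-style comparison could scale is to reduce each value to an order-independent key and group records that share it. The sketch below is an assumption for illustration only and does not describe how MatchUp processes batches; `frequency_key` and `group_candidates` are hypothetical names.

```python
from collections import defaultdict


def frequency_key(value):
    # Sorting the case-folded characters gives every reordering of the
    # same characters an identical key (case folding is an assumption).
    return "".join(sorted(value.casefold()))


def group_candidates(records):
    groups = defaultdict(list)
    for record in records:
        groups[frequency_key(record)].append(record)
    # Keep only the keys shared by more than one record.
    return [members for members in groups.values() if len(members) > 1]


print(group_candidates(["Johnson", "Jhnsn", "Johnosn", "Lynda", "Dylan"]))
# [['Johnson', 'Johnosn'], ['Lynda', 'Dylan']]
```

Grouping by key avoids comparing every record against every other record, at the cost of sorting each value once.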
Not Recommended For
- Short name data types, where a simple rearrangement of characters would represent a different value (for example, "Lynda" and "Dylan" in the example data above). This algorithm is also not recommended for identifying differences in long strings.
Do Not Use With
- UTF-8 data. This algorithm was ported to MatchUp with the assumption that a character equals one byte, and therefore results may not be accurate if the data contains multi-byte characters.
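The following Python sketch illustrates why a one-byte-per-character assumption can mislead a frequency comparison on UTF-8 data. It is a general demonstration of the encoding issue, not a reproduction of MatchUp's internals: two strings made of entirely different characters can still contain the same bytes.

```python
from collections import Counter

a = "\u00e9\u0101"  # "é" + "ā", UTF-8 bytes: C3 A9 C4 81
b = "\u00c1\u0129"  # "Á" + "ĩ", UTF-8 bytes: C3 81 C4 A9

# Counting characters: the two strings share no characters at all.
chars_equal = Counter(a) == Counter(b)

# Counting bytes: both strings contain exactly C3, C4, 81 and A9.
bytes_equal = Counter(a.encode("utf-8")) == Counter(b.encode("utf-8"))

print(chars_equal)  # False
print(bytes_equal)  # True
```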