Matchcode Optimization:Frequency Near
Frequency Near
Specifics
The Frequency Near algorithm matches the characters of one string to the characters of another without regard to their sequence, while allowing a set number of differences.
Summary
Frequency Near can be used when two strings are expected to contain the same characters but might have transposed characters, an insertion, or a deletion. For example, "abcdef" would be considered a 100% match to "badcfe" or "badcfx".
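As a rough illustration only, the idea can be sketched in Python by comparing character counts rather than character positions. The function name `frequency_near`, the `max_differences` parameter and its default of 2, the case-insensitive comparison, and the way differences are counted are all assumptions made for this sketch; they are not taken from MatchUp, whose actual implementation and threshold may differ.

```python
from collections import Counter


def frequency_near(s1: str, s2: str, max_differences: int = 2) -> bool:
    """Hypothetical sketch: order-insensitive character comparison.

    Characters are counted per string, and the strings are treated as a
    match when the number of unmatched characters on either side does not
    exceed max_differences (an assumed threshold, not MatchUp's).
    """
    # Assumption: comparison ignores case, so "Lynda" and "Dylan" line up.
    c1 = Counter(s1.casefold())
    c2 = Counter(s2.casefold())

    # Counter subtraction keeps only positive counts, i.e. the characters
    # that one string has and the other cannot pair up with.
    unmatched_in_s1 = sum((c1 - c2).values())
    unmatched_in_s2 = sum((c2 - c1).values())

    return max(unmatched_in_s1, unmatched_in_s2) <= max_differences


if __name__ == "__main__":
    for left, right in [("abcdef", "badcfe"), ("abcdef", "badcfx"),
                        ("Johnson", "Jhnsn"), ("abcdef", "uvwxyz")]:
        print(left, right, frequency_near(left, right))
```

Counting characters instead of aligning them is what makes the comparison indifferent to sequence. The choice of threshold controls how many insertions, deletions, or substitutions are tolerated.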
Returns
Boolean ‘match’ if the compared strings contain the same characters, within the allowed number of differences.
Example Matchcode Component
Example Data
STRING1 | STRING2 | RESULT
---|---|---
Johnson | Jhnsn | Match
Lynda | Dylan | Match
A B D H T | A T H D X | Match
A B D H T | A T H D B | Match
Performance | Slower | Faster
---|---|---
Matches | More Matches | Greater Accuracy
Recommended Usage
Batch processing. This is a fast algorithm that identifies a greater percentage of duplicates because it counts both exact matches and minor character transpositions as matches.
This algorithm is also recommended when the data consists of single-character dictionary values such as ‘A B C’.
Not Recommended For
Short name data types, where a simple change of characters, such as a transposition, would represent a different value. This algorithm is also not recommended when trying to identify differences in long strings.
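The risk with short names can be seen in a small Python sketch. The helper `same_letters` is hypothetical and only checks whether two values contain the same letters regardless of order. The pairs other than "Lynda" and "Dylan" are illustrative values that do not come from the example data.

```python
from collections import Counter


def same_letters(a: str, b: str) -> bool:
    """Hypothetical helper: True when both values contain exactly the same
    letters, ignoring order and case."""
    return Counter(a.casefold()) == Counter(b.casefold())


# Distinct short names that an order-insensitive comparison cannot tell apart.
pairs = [("Lynda", "Dylan"), ("Amy", "May"), ("Brian", "Brain")]

for left, right in pairs:
    print(left, right, same_letters(left, right))
```

Each pair holds two different values, yet all of them contain the same letters, so a comparison that ignores sequence would report them as matches.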
Do Not Use With
UTF-8 data. This algorithm was ported to MatchUp with the assumption that a character equals one byte, and therefore results may not be accurate if the data contains multi-byte characters.
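The general problem with byte-based counting can be illustrated in Python. This sketch does not reproduce MatchUp's internals. The helper names `char_counts` and `byte_counts` are hypothetical, and the sample values are chosen only to show how one character can span several bytes in UTF-8.

```python
from collections import Counter


def char_counts(text: str) -> Counter:
    """Count whole characters (Unicode code points)."""
    return Counter(text)


def byte_counts(text: str) -> Counter:
    """Count the individual bytes of the UTF-8 encoding, which is what a
    one-byte-per-character routine would effectively see."""
    return Counter(text.encode("utf-8"))


name = "Jos\u00e9"        # "José": 4 characters
e_acute = "\u00e9"        # "é" encodes as bytes C3 A9
e_grave = "\u00e8"        # "è" encodes as bytes C3 A8

# The byte length exceeds the character length once a multi-byte
# character is present.
print(len(name), len(name.encode("utf-8")))

# As characters, "é" and "è" share nothing; as bytes, they share the
# lead byte 0xC3, so a byte-wise count sees a partial overlap.
shared_chars = sum((char_counts(e_acute) & char_counts(e_grave)).values())
shared_bytes = sum((byte_counts(e_acute) & byte_counts(e_grave)).values())
print(shared_chars, shared_bytes)
```

Because two different characters can share bytes, and one character can contribute more than one unit to the count, a byte-based frequency comparison can over- or under-count differences on multi-byte data.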