Difference between revisions of "Issues:MatchUp Object"

From Melissa Data Wiki



Revision as of 15:29, 2 July 2014


Large KeyFile Size effect on Memory resources

By default, MatchUp Object allocates 1024 bytes for the SetUserInfo value, the unique identifier attached to each built match key. With a large key file, this allocation can consume significant memory. See MatchUp Object Best Practices for override instructions.
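
To get a feel for the scale involved, the arithmetic can be sketched as below. This is only a rough estimate, assuming every built key carries the full allocation; the function names and the 32-byte override are hypothetical and chosen for illustration, not taken from the MatchUp Object API.

```python
# Rough estimate only: assumes each built match key reserves the full
# user-info allocation. Names and the 32-byte override are hypothetical.
DEFAULT_USER_INFO_BYTES = 1024  # default allocation stated in this article


def user_info_overhead_bytes(record_count, user_info_bytes=DEFAULT_USER_INFO_BYTES):
    """Bytes reserved for user info across all keys (match key bytes excluded)."""
    if record_count < 0 or user_info_bytes < 0:
        raise ValueError("counts and sizes must be non-negative")
    return record_count * user_info_bytes


def to_gib(byte_count):
    """Convert a byte count to gibibytes."""
    return byte_count / 1024 ** 3


if __name__ == "__main__":
    records = 10_000_000
    default_total = user_info_overhead_bytes(records)
    trimmed_total = user_info_overhead_bytes(records, user_info_bytes=32)
    print(f"default: {to_gib(default_total):.2f} GiB")
    print(f"32-byte override: {to_gib(trimmed_total):.2f} GiB")
```

Under these assumptions, sizing the user info to the actual identifier length reduces the overhead in direct proportion to the bytes saved per record.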


Fuzzy: Legacy Matchcodes

Legacy matchcodes imported from previous versions allowed a Fuzzy: Near setting of '0'. This setting is incompatible with the current version and can cause an error. To resolve the problem, edit the matchcode in the interface and change the Distance to 1.


Fuzzy: First component with set distance missing dupes

Setting a distance for the first component forces that component to use the Intersecting deduper. As a result, records that fall within the set distance of each other may be placed in different clusters and therefore never get compared.
Workaround: Use an exact algorithm in the first component and, if a distance component is required, place it further down the component list. This prevents missed dupes and gives better speed benchmarks.
Resolution: This may require an advanced change to the deduper. Development is aware of the issue and is exploring options.
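
The general effect can be illustrated with a toy example. The sketch below is not the Intersecting deduper or any MatchUp Object code: it assumes a deliberately simple clustering rule (records are only compared within the same cluster) and hypothetical record fields, purely to show how two records one edit apart can land in different clusters, and how clustering on an exact component first avoids that.

```python
# Toy illustration only. The clustering rules, field names, and data are
# hypothetical and do not reflect MatchUp Object internals.
from collections import defaultdict
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def find_dupes(records, cluster_key, max_distance=1):
    """Compare records only within the same cluster, as a clustering deduper would."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[cluster_key(rec)].append(rec)
    pairs = []
    for members in clusters.values():
        for a, b in combinations(members, 2):
            same_zip = a["zip"] == b["zip"]
            close_name = edit_distance(a["name"], b["name"]) <= max_distance
            if same_zip and close_name:
                pairs.append((a["id"], b["id"]))
    return pairs


records = [
    {"id": 1, "name": "KATHY", "zip": "92688"},
    {"id": 2, "name": "CATHY", "zip": "92688"},
    {"id": 3, "name": "CATHY", "zip": "10001"},
]

# Clustering on the fuzzy component (here: first letter of the name) separates
# KATHY and CATHY, so the pair is never compared.
fuzzy_first = find_dupes(records, cluster_key=lambda r: r["name"][0])

# Clustering on an exact component (zip) keeps them together, and the distance
# check further down still catches the near match.
exact_first = find_dupes(records, cluster_key=lambda r: r["zip"])

print(fuzzy_first)
print(exact_first)
```

In this toy setup, records 1 and 2 are found only when the exact component drives the clustering, which mirrors the workaround described above.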



Back to MatchUp Object Main Page