Build History:MatchUp Object

From Melissa Data Wiki
Revision as of 23:00, 25 July 2013 by Admin (talk | contribs)


Build 2154

Released April 2013

Functionality Changes
  • Fixed an issue that occurred when temporary sort files grew beyond 4 GB (seen in very large processes).
Interface Changes
  • SetReserved("UserInfoSize","") lets users override the default userinfo size of 1024 bytes per record. Smaller values such as "32" greatly reduce disk space requirements and processing times.
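
As a rough illustration of why a smaller UserInfoSize matters, the sketch below estimates only the space taken by the userinfo portion of each record. The function name and the 10-million-record job are hypothetical, and it assumes one fixed-size userinfo block per record; the actual layout of MatchUp's temporary files is not described here.

```python
def userinfo_bytes(record_count: int, userinfo_size: int = 1024) -> int:
    """Bytes used by userinfo alone, assuming one fixed-size block per record."""
    if record_count < 0 or userinfo_size < 0:
        raise ValueError("record_count and userinfo_size must be non-negative")
    return record_count * userinfo_size


if __name__ == "__main__":
    records = 10_000_000  # hypothetical job size
    for size in (1024, 32):
        gib = userinfo_bytes(records, size) / 2**30
        print(f"UserInfoSize={size}: about {gib:.2f} GiB of userinfo")
```

Under these assumptions the userinfo footprint scales linearly with the size setting, so dropping from 1024 to 32 bytes cuts that portion by a factor of 32.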


Build 2009

Released July 2012

Important
Environment
MatchUp now uses the new Melissa Data license format and ENV variable name.
Please call a sales representative at 1-800-MELISSA ext. 3 (1-800-635-4772 x3) for a valid license string.
MatchUp will also check for a valid license in the MD_LICENSE environment variable. This allows you to modify the license without recompiling the project.
The default installation path for the MatchUp data files is now "c:\ProgramData\Melissa DATA\MatchUP\" for newer operating systems. On older operating systems, check your 'Common App Data' directory; for example, it may be "c:\Documents and Settings\All Users\Application Data\Melissa DATA\MatchUP".
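
The following is a minimal sketch of how an application might look up the license string from the MD_LICENSE environment variable before handing it to the library. The function name `resolve_license` and the rule that an explicitly supplied string wins over the environment are assumptions for illustration; MatchUp performs its own check of the variable.

```python
import os
from typing import Optional

ENV_VAR = "MD_LICENSE"  # variable name taken from the release note above


def resolve_license(explicit: Optional[str] = None) -> str:
    """Return the explicit license if given, otherwise the MD_LICENSE value.

    The precedence shown here is an assumption made for this sketch.
    """
    if explicit:
        return explicit
    value = os.environ.get(ENV_VAR, "").strip()
    if not value:
        raise RuntimeError(f"No license string supplied and {ENV_VAR} is not set")
    return value
```

Keeping the license in the environment means it can be rotated by changing the variable and restarting the process, with no rebuild.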
Matchcode Editor
The Matchcode Editor has been redesigned. Existing matchcodes can still be read, edited, and used for backwards compatibility. The new Matchcode Editor allows us to provide a common interface with the MatchUp SSIS component.
Functionality Changes
Certain matchcode restrictions that were required in the previous version have been removed. This allows intersecting matching logic: you can create matchcodes without a common first component, which makes them more flexible.
SQL Server Interface: There is now a set of CLR-based procedures. A separate installer script is provided as an alternative to the _xp and UDF methods of interfacing with the MatchUp library.
Interface Changes
Naming conventions for Interface Wrappers have been updated to match those used by other Melissa Data Objects.
Added Matchcode Component types
  • Date (days)
  • Numeric (integer unit)
  • Proximity (miles)
These new types let you specify a range as a component property: a match is possible only if the records being compared fall within the set range. There are also new Enumerations, Mapping Targets and Matchcode Mappings for these new component types.
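
To make the idea of range-based components concrete, here is a minimal sketch of the three kinds of comparison (days, integer units, miles). It is not MatchUp's implementation: the function names are hypothetical, the inclusive boundary is an assumption, and the proximity check uses the standard haversine great-circle formula on latitude/longitude pairs.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def dates_within(a: date, b: date, max_days: int) -> bool:
    """True if the two dates are at most max_days apart (inclusive, assumed)."""
    return abs((a - b).days) <= max_days


def numbers_within(a: int, b: int, max_units: int) -> bool:
    """True if the two integers differ by at most max_units (inclusive, assumed)."""
    return abs(a - b) <= max_units


def miles_between(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles using the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def points_within(p: tuple, q: tuple, max_miles: float) -> bool:
    """True if two (lat, lon) points are at most max_miles apart."""
    return miles_between(*p, *q) <= max_miles
```

For instance, `dates_within(date(2012, 7, 1), date(2012, 7, 4), 3)` is true under the inclusive-boundary assumption, while a range of 2 days would reject the same pair.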
Added additional Fuzzy Algorithms
  • N-Gram
  • Jaro Distance
  • Jaro-Winkler Distance
  • Longest Common Substring
  • Jaccard Index
  • Dice's Coefficient
  • Overlap Coefficient
  • Needleman-Wunsch
  • Smith-Waterman-Gotoh
  • MD Keyboard
  • Double Metaphone
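
Three of the algorithms listed above (Jaccard Index, Dice's Coefficient and Overlap Coefficient) are closely related set-similarity measures. The sketch below shows their textbook definitions over character n-grams. The function names, the lower-casing step and the default of bigrams are assumptions for illustration; MatchUp's own tokenization and scoring may differ.

```python
def ngrams(text: str, n: int = 2) -> set:
    """Set of character n-grams of the lower-cased text."""
    s = text.lower()
    if len(s) < n:
        # Strings shorter than n are treated as a single token.
        return {s} if s else set()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def jaccard(a: str, b: str, n: int = 2) -> float:
    """|X & Y| / |X | Y|"""
    x, y = ngrams(a, n), ngrams(b, n)
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)


def dice(a: str, b: str, n: int = 2) -> float:
    """2 * |X & Y| / (|X| + |Y|)"""
    x, y = ngrams(a, n), ngrams(b, n)
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


def overlap(a: str, b: str, n: int = 2) -> float:
    """|X & Y| / min(|X|, |Y|)"""
    x, y = ngrams(a, n), ngrams(b, n)
    if not x or not y:
        return 1.0 if not x and not y else 0.0
    return len(x & y) / min(len(x), len(y))
```

All three share the same numerator and differ only in how they normalize it, which is why the overlap coefficient is the most forgiving when one string is much shorter than the other.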
New Interface
  • mdMUMatchcodeList - allows the user to list available matchcodes
  • SetPathToMatchUpFiles()
  • InitializeDataFiles()
  • GetInitializeErrorString()
  • GetMatchcodeCount() - retrieves number of matchcodes.
  • GetMatchcodeName() - retrieves name of matchcode at specified position.
New Methods
  • GetResults() - retrieves matching result codes.
  • GetNearDbl() - retrieves Near setting (supersedes GetNear()).
  • SetNearDbl() - sets Near setting (supersedes SetNear()).
  • GetDescription() - retrieves a matchcode's user-specified description.
  • SetDescription() - sets a matchcode's user-specified description.
  • GetNGram() - retrieves a matchcode's n-gram setting.
  • SetNGram() - sets a matchcode's n-gram setting.
  • RenameMatchcode() - changes a matchcode's name.
  • DeleteMatchcode() - deletes a matchcode.
Deprecated
  • GetStatusCode and GetCombinations have been deprecated as GetResults gives the user a single output property to retrieve this information.
Sample Code
Previous examples have been changed, with GetStatusCode and GetCombinations replaced by GetResults to reflect its use in retrieving output properties.
Hybrid examples have been added to most samples and interfaces.


Build 1459

Released August 2009

  • The MatchUp Object disc now includes all supported platforms: Linux, Windows, Solaris, and AIX.
  • The SetUserInfo and GetUserInfo functions now work with 'const char *' data types rather than a 'void *' for all platforms/languages.
  • Removed Set/GetSizeUserInfo functions, as there was no need for them anymore. The maximum size for the UserInfo string is 1024 bytes.
  • Moved the enums into their own file mdMatchupEnums.h to make the Object more compatible with our other objects. In doing this, some enumerations were changed (see below).
  • The MatchcodeComponentType enums now have 'Comp' appended at the end of each label.
  • The MatchcodeMappingType enumeration was renamed to MatchcodeMappingTarget. MatchcodeMappingTarget enums all have 'Type' appended at the end of each label.
  • The ProgramStatus enum NoError has been renamed to ErrorNone.


Build 1451

Released July 2009

Initial public release
Multi-platform, new version, new interface; incompatible with the legacy MatchUp API. Please refer to the product page for details.
Deprecated products
  • MatchUp API, DoubleTake API


MatchUp API Build History

Release 3.13

Released December 2006

  • Fixed a crash that sometimes occurred with some programming languages when a series of processes was launched one after another.
  • Examples: Added Visual C# example projects.
  • Matchcode Editor: Added Frequency Near matching.
  • Matchcode Editor: Added a check to ensure that a user couldn't exit unless all matchcodes had at least one component and one combination.
  • ".5 Main Street" wasn't being split properly.
  • Some 'fractions' should not be treated as such. For example, "123 State Route 28/18".
  • Processing: Phone numbers such as 999-999-9999 and 000-000-0000 are now interpreted as blank.
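
The placeholder phone number handling in this release can be sketched as follows. The rule used here (a number whose digits are all 0 or all 9 counts as blank) is an assumption generalized from the two examples given; the function name is hypothetical and the actual test MatchUp applies is not documented on this page.

```python
import re


def normalize_phone(raw: str) -> str:
    """Return digits only, or "" when the number looks like a placeholder.

    Assumed rule: every digit is the same and that digit is 0 or 9.
    """
    digits = re.sub(r"\D", "", raw or "")
    if digits and len(set(digits)) == 1 and digits[0] in "09":
        return ""
    return digits
```

Treating such values as blank keeps two unrelated records from matching merely because both carry the same dummy phone number.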


Release 3.12

Released July 2006

  • Processing: Matchcodes having a large key size (over 250 characters) could cause a crash. NOTE: Matchcodes having a large key size are usually a sign of a poorly designed matchcode.
  • Processing: Canadian addresses such as "101-38 Main St" should match "38 Main St Apt 101". In the US, the same address would match "101 Main St Apt 38".
  • Processing: Inferred matching (A=B, B=C, so A=C) has been improved so that: (a) Inferences are detected between differing file types (suppression, regular, etc), (b) Inferences are detected regardless of record ranking, and (c) The correct output record is selected.
  • Processing: Alphabetic PO Boxes like "PO Box AVG" are more likely to be split properly now.
  • Processing: "12 Avenue 29" is now split properly.
  • Processing: The phonetex representation of "P" (just the single letter) was incorrect.
  • Matchcode Editor: Fixed a refresh anomaly when working with a matchcode having a single component.
  • Processing: "85 State Road" wasn't being split properly.
  • Street Splitter: "816 W North Loop Rd", "3750 I 55 N", "876 FM 365 Rd", "15 FM 18", "56 E Loop 281", and "540 S Interstate 36 E" are now split properly.
  • Merge/Purge: Validate Zip matchcodes now better handle addresses having both a street address and PO Box.
  • Added "Henry", "Hank", "Chuck", "Chucky" and "Chuckie" to Nickname table.
  • Processing: Leading and trailing spaces in an e-mail's user name are now stripped.
  • Street Splitter: Certain street addresses with slashes are now handled better (as in "300 Oak St/PO Box 12").
  • Street Splitter: Addresses like "268 Hwy 202/31" are now split properly.
  • Street Splitter: Splitting of "Farm to Market" (FM) addresses is now handled better.
  • Street Splitter: Addresses with decimals were losing their decimal point.
  • Name Splitter: Ampersands (&) are now handled better.
  • Street Splitter: Added several MLK variations, BL (Boulevard or Building) and TE (Terrace).
  • Street Splitter: Improved the handling of bad addresses such as "15 Main St-18".
  • Street Splitter: Secondaries such as "Apt 15-58" are now converted into "1558" for the purposes of deduping.
  • Street Splitter: Sloppy addresses such as "15-1/2 Main St" now get a correct street number of "15.5".
  • Processing: Fixed a random crash that had been very difficult to isolate.
  • Matchcodes: Containment matching sometimes didn't consider the very last character of the shorter string in the comparison (for example, "Kim" would be contained in "Kid").
  • Matchcodes: Tweaked behavior of Fast Near matching, especially with short strings.
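
The inferred matching mentioned in this release (A=B, B=C, so A=C) amounts to grouping records by the transitive closure of their pairwise matches. A common way to do that is a disjoint-set (union-find) structure, sketched below. This is a generic illustration, not MatchUp's code: the class and function names are hypothetical, and record ranking, file types and output-record selection are left out.

```python
from collections import defaultdict


class DisjointSet:
    """Minimal union-find with path compression."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        self.parent.setdefault(item, item)
        root = item
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node straight at the root.
        while self.parent[item] != root:
            self.parent[item], item = root, self.parent[item]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def group_matches(pairs):
    """Turn pairwise matches into sorted groups of mutually inferred duplicates."""
    ds = DisjointSet()
    for a, b in pairs:
        ds.union(a, b)
    groups = defaultdict(set)
    for item in list(ds.parent):
        groups[ds.find(item)].add(item)
    return sorted(sorted(g) for g in groups.values())
```

Because grouping depends only on which pairs matched, the result is the same regardless of the order in which the pairs are processed.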


Release 3.11

Released September 2005

  • Street Splitting: Addresses with "St." in the street name aren't automatically converted to "Street", as in "100 St. Mary's St."
  • Street Splitting: Secondaries such as "Apt-18", "Apt 18GF", "Apt 18 GF" and "Apt GF 18" are now handled better.
  • Matchcode Editor: Components that will be used in determining clusters are displayed with a light green background. This knowledge can be used to build more optimal matchcodes. See concepts for more information about clustering.
  • Matchcode Editor: The Swap Match dialog will now use component labels if they are known.
  • Full Name Splitter: "Charles O. Leary" was winding up with a last name of "Oleary" instead of a Middle Initial of "O" and a Last Name of "Leary". The problem was specific to O' names that could appear with or without the O' prefix.
  • Government Name Splitter: Names with suffixes like "Public, John Q. MD" are now split correctly.


Release 3.10

Released April 2005

  • Matchcodes: State processing has been enhanced to convert full spellings to the 2-letter abbreviation, as well as fix some common Canadian problems ("QC" and "PQ").
  • Matchcodes: The "Stop at Word" option sometimes stopped one word too soon.
  • Swap Matching has been improved in the Half Swap configuration.
  • Matchcodes: Full names such as "Mr.Johnson" (with a period but no space between the "Mr" and "Johnson") weren't being split properly.
  • Matchcodes: "Start at Position" is now performed before SoundEx, Phonetex, Consonants Only, Numerics Only, Vowels Only, and Alphas Only. For example, if "Start at Position" was 2 with "Consonants Only", "Ableson" would use "lsn", rather than "blsn".
  • The API's code has been synchronized with the recently released Double Take 3 (GUI). The previous version of the API was built in between Double Take 2 and Double Take 3. Although it had many of Double Take 3's features, its processing core was built on Double Take 2.
  • A Java interface has been added.
  • The address splitter interface has been exposed so that users can now parse street addresses.
  • A CASS interface has been added. This optional add-on allows you to CASS validate addresses.
  • CASS-validated Zip, Plus 4 and Street Address components are now available as Matchcode Components (with CASS add-on).
  • United Kingdom City, County and Post Code are now available as Matchcode Components.
  • The Custom component type has been added. You can use this component to perform table-based substitutions while building matchcode keys.
  • You now have more control over Swap Matching.
  • One Blank Field and Initial Only matching has been improved.
  • A debugging interface has been added to help developers troubleshoot problems.
  • Double Take can now automatically optimize your matchcode with the Optimize button in the Matchcode Editor.
  • You can import Double Take 2 matchcodes with the Import button in the Matchcode Editor.
  • The SQL Server 'bridge' DLL is no longer used; synchronizing this DLL with the API's DLL was often a problem.
  • The SQL Server Extended Stored Procedures have been re-tooled to better report errors.


Release 3.01

Released December 2002

  • Added function support for SQL Server 2000.
  • Corrected typo in help file regarding MP_FIRST confusion.
  • Corrected incorrect EM_COMBOn defines.


Release 3.00

Released November 2002

  • First Release