Build History:MatchUp Object

From Melissa Data Wiki

Revision as of 21:14, 13 January 2022


Build 5118

Released January 2022

  • Fixed: Sorting issues when processing large files on Linux
  • Updated Global datafiles. (Global MatchUp)


Build 5112

Released October 2021

  • Maintenance Build
  • Updated Global datafiles. (Global MatchUp)


Build 5109

Released August 2021

  • Updated datafiles (Domestic) address patterns for TRLR and SLIP addresses.
  • Updated Global datafiles. (Global)
  • Countries added to Full Support Status: LU, PT. (Global MatchUp)
  • Linux .sh scripts may still require dos2unix conversion.
  • DataFile change: mdGlobalAddr.sac added
  • DataFile change: mdGlobalAddr.3db removed
  • DataFile change: icudt52l.dat removed
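
The dos2unix note above concerns Windows-style CRLF line endings, which can prevent a shell from running a script. Where the dos2unix tool is not available, the same conversion can be sketched in Python. The function names below are hypothetical and are not part of MatchUp:

```python
from pathlib import Path


def crlf_to_lf(data: bytes) -> bytes:
    """Replace Windows CRLF line endings with Unix LF."""
    return data.replace(b"\r\n", b"\n")


def convert_script(path: str) -> bool:
    """Rewrite the file in place; return True if anything changed."""
    script = Path(path)
    original = script.read_bytes()
    converted = crlf_to_lf(original)
    if converted != original:
        script.write_bytes(converted)
        return True
    return False
```

Calling `convert_script` on each shipped `.sh` file before first use is one way to apply it; files that already use LF endings are left untouched.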


Build 5083

Released April 2020

  • Fixed Keybuilding issue for Numerics only Fuzzy Algorithm
  • Updated Global datafiles. (Global MatchUp)


Build 5070

Released January 2020

  • Deprecated linux/gcc34 library distribution and support
  • Fixed Critical Global Keybuilding issue for Vietnam (unsupported) and France
  • Updated Global datafiles. (Global MatchUp)


Build 5060

Released August 2019

  • Maintenance Build


Build 5058

Released May 2019

  • Maintenance Build


Build 5054

Released January 2019

  • Maintenance Build


Build 5050

Released October 2018

  • Matchcode Editor Optimization - Interactive evaluation of user created matchcodes with links to MatchUp documentation
  • Countries added to Full Support Status: IT, VA, BE, SM, GH, LS, ME, FR-Territories. (Global MatchUp)


Build 5046

Released August 2018

  • Improved CityStateZip splitting for non-standardized Canadian records.
  • Improved support for viewing Matchcode Keys with extended characters.
  • Countries added to Full Support Status: CZ, LI, ZA. (Global MatchUp)
  • Added/Edited recommendations in documentation regarding advanced matchcodes.


Build 5028

Released April 2018

  • Maintenance Build


Build 5023

Released February 2018

  • Global Address Matching: LT, FI, PL, FR, SG added to full support status


Build 5015

Released November 2017

  • Maintenance Build
  • Improved (Rural) Route processing
  • Speed Optimizations
  • Improved Territory support for UK Crown dependencies (Guernsey, Isle of Man, Jersey)


Build 5009

Released September 2017

  • Maintenance Build
  • Countries added to Full Support Status: HR, IS, NO (Global)
  • Country support for US Territories: AS, FM, GU, PW, MH, MP, PR, VI (Domestic and Global)
  • No new methods or properties.


Build 2929

Released March 2017

  • SetEncoding() and SetMaximumCharacterSize() added to SQL-CLR installed functions
  • MatchUp Object and Global MatchUp Object have been separated into two distinct products
  • New Matchcode Component: Post Box - differences in functionality made a distinct Global component necessary.
  • Legacy Global Matchcodes with PO Box will now appear as Post Box
  • Domestic matchcodes have been removed from the Global matchcode list
  • Global Address Matching: MX, DK, NZ, IE, SE added to full support status


Build 2768

Released April 2016

  • Global Address Matching: CL, NL added to full support status
  • Multi-Thread bug fixed
  • Minor bug fixes


Build 2739

Released March 2016

  • Added Numeric & Date as members of the datatype list when creating global matchcode.
  • Global Address Matching: AT, AU, UK, CA, CH added to full support status
  • Minor bug fixes, e.g. Company matchcode component start at Word(#)
  • Changed 'Global Address' matchcode: premise number now set to 'Both blank'


Build 2628

Released August 2015

Object Compatibility Summary
Melissa Data has upgraded our Windows compiler from Visual Studio 2008 to Visual Studio 2012. This means that the C++ 2012 Redistributable is required to run the objects. This redistributable is installed by the setup. However, if you are manually copying over the libraries, you must first install the redistributable. You can find it in the extras/redist directory on the disc.
  • We have removed the following platforms: AIX, HP-UX and Solaris.
  • We have removed all 32-bit libraries and respective source code.
  • We have removed the 32-bit and 64-bit COM Objects.
Installing this version will not remove these legacy folders and files.
MatchUp is Going Global!
This version marks the beginning of integration of Global matching capabilities.
If you currently process UK data, or use a matchcode with UK components, do NOT update to this version! Global UK processing will be added next release.
Supported Global Countries: Germany
Matchcode Editor
When creating a new matchcode, the Matchcode Editor will now prompt you to specify the type of matchcode you want to create: a domestic or a global matchcode. MatchUp then dynamically populates the list of available matchcode component types, ensuring that you do not create a matchcode with global components that does not call our global address parsing engine.
When constructing a global matchcode, you must include a Country component so that the address parser knows how to parse the address, because international address patterns vary widely from country to country.
Interface Changes
New Methods
  • SetEncoding() - allows the user to define the input file encoded format and what the resultant format of the built keys will be.
  • SetMaximumCharacterSize() - accommodates UTF-8 input data. Since a single UTF-8 character can be up to 4 bytes long, the storage of the matchcode keys may need to be altered to accommodate this maximum size.
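
The storage concern behind SetMaximumCharacterSize() can be illustrated with a short Python sketch that measures how many bytes individual characters need in UTF-8. The helper names and the sizing rule (characters multiplied by maximum character size) are assumptions made for illustration, not MatchUp's documented behavior:

```python
def max_utf8_bytes_per_char(text: str) -> int:
    """Largest number of bytes any single character of text needs in UTF-8."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def key_buffer_bytes(key_chars: int, max_char_size: int) -> int:
    """Bytes reserved for a key of key_chars characters (hypothetical rule)."""
    return key_chars * max_char_size


samples = {
    "ASCII": "Smith",
    "Latin-1 range": "M\u00fcller",
    "Vietnamese": "Nguy\u1ec5n",
    "CJK": "\u6771\u4eac",
    "Emoji": "\U0001F600",
}

for label, text in samples.items():
    size = max_utf8_bytes_per_char(text)
    print(f"{label}: widest character needs {size} byte(s)")
```

Under this assumed rule, a 10-character key component would need 10 bytes for pure ASCII data but up to 40 bytes if 4-byte characters must be supported.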
Added Fuzzy Algorithms
UTF-8 Near - Counts the number of typos, i.e. character substitutions. It differs from the other algorithms in that it accounts for character storage sizes, which vary with the encoding.
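
Why character storage size matters when counting typos can be seen by comparing the same two strings as characters and as raw UTF-8 bytes. The sketch below uses a plain Levenshtein edit distance as a stand-in; it is not Melissa Data's UTF-8 Near algorithm, whose details are not given here:

```python
def edit_distance(a, b) -> int:
    """Levenshtein distance between two sequences (str or bytes)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (x != y),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


left, right = "M\u00fcller", "Muller"

# Compared as characters, the strings differ by one substitution.
char_typos = edit_distance(left, right)

# Compared as UTF-8 bytes, the two-byte character inflates the count.
byte_typos = edit_distance(left.encode("utf-8"), right.encode("utf-8"))

print(char_typos, byte_typos)
```

A comparison that works on bytes without regard to encoding would overstate the difference between these two names, which is the kind of error an encoding-aware algorithm avoids.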
New Matchcode Components
  • Postal Code - (Zip and/or Plus 4) Complete postal code for a particular delivery point.
  • Premises Number - (Street Number) Alphanumeric indicator within premises field.
  • Double Dependent Locality - Smallest population center data element
  • Dependent Locality - (Urbanization) Smaller population center data element. Dependent on Locality.
  • Sub Administrative Area - (County) Smallest geographic data element.
  • Sub National Area - Arbitrary administrative region below that of the sovereign state.
  • Locality - (City) Most common population center data element.
  • Administrative Area - (State) Most common geographic data element.
  • Thoroughfare Leading Type - Leading thoroughfare type indicator within the Thoroughfare field.
  • Thoroughfare Pre-Directional - (Street Pre Direction) Prefix directional contained within the Thoroughfare field.
  • Thoroughfare Name - (Street Name) Name indicator within the Thoroughfare field
  • Thoroughfare Trailing Type - (Street Suffix) Trailing thoroughfare type indicator within the Thoroughfare field.
  • Thoroughfare Post-Directional - (Street Post Direction) Postfix directional contained within the Thoroughfare field.
  • Dependent Thoroughfare Pre-Directional - Prefix directional contained within the Dependent Thoroughfare field.
  • Dependent Thoroughfare Leading Type - Leading thoroughfare type indicator within the Dependent Thoroughfare field.
  • Dependent Thoroughfare Name - Name indicator within the Dependent Thoroughfare field
  • Dependent Thoroughfare Trailing Type - Trailing thoroughfare type indicator within the Dependent Thoroughfare field.
  • Dependent Thoroughfare Post-Directional - Postfix directional contained within the Dependent Thoroughfare field.
Functionality Changes
  • GetMappingItemCount - With the addition of International Address processing, GetMappingItemCount will now return ((# of non-address mappings) + 8) instead of ((# of non-address mappings) + 3). AddMapping now allows 8 address lines instead of three, although you still are only required to call one AddMapping(ADDRESS) and one AddField(Address-data).


Build 2544

Released April 2015

  • Fixed a buffer overrun in the AddMapping method.


Build 2419

Released September 2014

Windows Libraries

Melissa Data has upgraded our Windows compiler from Visual Studio 2008 to Visual Studio 2012. This means that the C++ 2012 Redistributable is required to run the objects.

This redistributable is installed by the setup. However, if you are manually copying over the libraries, you must first install the redistributable. You can find it in the extras/redist directory on the disc.

Interface Changes
  • NONE


Build 2395

Released July 2014

Functionality Changes
  • Improved Highway and County pattern recognition.
  • Improved hyphenated last names (spaces around the hyphen).
  • Improved address pattern recognition for uncommon dual street suffix addresses.

  • eMail domain recognition now handles updated domains with or without TLD (may require added entries to mdMatchUp.cfg).

  • Issues with Components with 'Short Empty:Both' and 'Fuzzy:Containment' resolved.
  • Jaro and Jaro-Winkler crash on certain 'component size/string size' ratio fixed.
Interface Changes
  • NONE


Build 2154

Released April 2013

Functionality Changes
  • Fixed an issue when temporary sort files grew over 4GB (shown in very large processes)
Interface Changes
  • SetReserved("UserInfoSize","") - lets users override the default userinfo size of 1024 bytes per record. Smaller values such as "32" greatly reduce disk space requirements and processing times.
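
The effect of the userinfo size on disk space can be estimated with simple arithmetic. The sketch below counts only the userinfo payload and assumes it is stored at its full configured size for every record; the record count is an invented example, and real temporary files also hold keys and other data:

```python
DEFAULT_USERINFO_SIZE = 1024  # bytes per record, the default stated above


def userinfo_bytes(records: int, userinfo_size: int = DEFAULT_USERINFO_SIZE) -> int:
    """Total bytes taken by userinfo alone for the given number of records."""
    return records * userinfo_size


records = 10_000_000  # hypothetical job size

default_total = userinfo_bytes(records)
reduced_total = userinfo_bytes(records, 32)

print(f"default (1024 B): {default_total / 1e9:.2f} GB")
print(f"reduced (32 B):   {reduced_total / 1e9:.2f} GB")
print(f"reduction factor: {default_total // reduced_total}x")
```

For this example the userinfo payload drops from about 10.24 GB to 0.32 GB, a factor of 32.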


Build 2009

Released July 2012

Environment
MatchUp now uses the new Melissa Data License Key format and ENV variable name.
Please call a sales representative at 1-800-MELISSA ext. 3 (1-800-635-4772 x3) for a valid License Key.
MatchUp will also check for a valid License Key in the MD_LICENSE environment variable. This allows you to modify the License Key without recompiling the project.
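
An application can follow the same pattern in its own code by reading the key from the environment instead of hard-coding it. In this Python sketch, only the variable name MD_LICENSE comes from the text above; the helper name and the precedence (an explicit key wins over the environment) are assumptions:

```python
import os


def resolve_license(explicit_key=None, env=None) -> str:
    """Return the explicit key if given, else MD_LICENSE, else ''."""
    env = os.environ if env is None else env
    if explicit_key:
        return explicit_key
    return env.get("MD_LICENSE", "")


# Passing a dict instead of os.environ keeps the example deterministic.
print(resolve_license(None, {"MD_LICENSE": "KEY-FROM-ENV"}))
print(resolve_license("KEY-IN-CODE", {"MD_LICENSE": "KEY-FROM-ENV"}))
```

With this arrangement, rotating the License Key only requires changing the environment variable and restarting the process.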
The default installation path for the MatchUp data files is now "c:\ProgramData\Melissa DATA\MatchUP\" for newer operating systems. Please check your 'Common App Data' directory for older OSs, for example: it may be "c:\Documents and Settings\All Users\Application Data\Melissa DATA\MatchUP".
Matchcode Editor
The Matchcode Editor has been redesigned. Existing matchcodes can still be read/edited/used for backwards compatibility. The new Matchcode Editor allows us to provide a common interface with the MatchUp SSIS component.
Functionality Changes
Certain matchcode restrictions that were required in earlier versions have been removed. Users can now apply Intersecting matching logic: matchcodes no longer need a common first component, which makes them more flexible.
SQL Server Interface: There is now a set of CLR-based procedures. A separate installer script is provided as an alternative to the _xp and UDF methods of interfacing the MatchUp library.
Interface Changes
Naming conventions for Interface Wrappers have been updated to match those of other Melissa Data Objects.
Added Matchcode Component types
  • Date (days)
  • Numeric (integer unit)
  • Proximity (miles)
These new types allow you to specify, as a component property, a range for which a match will be possible if the records being compared fall within the set range. There are also new Enumerations, Mapping Targets and Matchcode Mappings for these new component types.
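
The idea of a range-based component can be sketched in Python as three comparisons, one per new type. Whether MatchUp treats the range boundary as inclusive, and how it computes distance, is not stated here; the inclusive comparison and the haversine formula below are assumptions, and all function names are hypothetical:

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def dates_match(a: date, b: date, max_days: int) -> bool:
    """Date component: match when the dates are within max_days of each other."""
    return abs((a - b).days) <= max_days


def numbers_match(a: int, b: int, max_units: int) -> bool:
    """Numeric component: match when the values differ by at most max_units."""
    return abs(a - b) <= max_units


def miles_between(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles using the haversine formula."""
    p1, p2 = radians(lat1), radians(lat2)
    dlat, dlon = p2 - p1, radians(lon2 - lon1)
    h = sin(dlat / 2) ** 2 + cos(p1) * cos(p2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def proximity_match(p: tuple, q: tuple, max_miles: float) -> bool:
    """Proximity component: match when two (lat, lon) points are close enough."""
    return miles_between(p[0], p[1], q[0], q[1]) <= max_miles


print(dates_match(date(2012, 7, 1), date(2012, 7, 4), 3))
print(numbers_match(100, 106, 5))
print(proximity_match((0.0, 0.0), (0.0, 1.0), 70.0))
```

Two records then match on such a component whenever the difference between their values falls inside the configured range, rather than only when the values are identical.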
Added additional Fuzzy Algorithms
  • N-Gram
  • Jaro Distance
  • Jaro-Winkler Distance
  • Longest Common Substring
  • Jaccard Index
  • Dice's Coefficient
  • Overlap Coefficient
  • Needleman-Wunsch
  • Smith-Waterman-Gotoh
  • MD Keyboard
  • Double Metaphone
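
Three of the algorithms listed above (Jaccard Index, Dice's Coefficient and Overlap Coefficient) share one idea: compare the sets of n-grams in two strings. The sketch below uses the textbook definitions over character bigrams. MatchUp's tokenization, n-gram size and thresholds may differ, so treat this as an illustration only:

```python
def ngrams(text: str, n: int = 2) -> set:
    """Set of lowercase character n-grams in text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """|A & B| / |A | B|"""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a: set, b: set) -> float:
    """2 * |A & B| / (|A| + |B|)"""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def overlap(a: set, b: set) -> float:
    """|A & B| / min(|A|, |B|)"""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


x, y = ngrams("smith"), ngrams("smyth")
print(f"jaccard={jaccard(x, y):.3f} dice={dice(x, y):.3f} overlap={overlap(x, y):.3f}")
```

For "smith" and "smyth" the bigram sets share "sm" and "th" out of six distinct bigrams, so the Jaccard Index is 1/3 while Dice's and the Overlap Coefficient are both 0.5.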
New Interface
  • mdMUMatchcodeList - allows the user to list available matchcodes
  • SetPathToMatchUpFiles()
  • InitializeDataFiles()
  • GetInitializeErrorString()
  • GetMatchcodeCount() - retrieves number of matchcodes.
  • GetMatchcodeName() - retrieves name of matchcode at specified position.
New Methods
  • GetResults() - retrieves matching result codes.
  • GetNearDbl() - retrieves Near setting (supersedes GetNear()).
  • SetNearDbl() - sets Near setting (supersedes SetNear()).
  • GetDescription() - retrieves a matchcode's user-specified description.
  • SetDescription() - sets a matchcode's user-specified description.
  • GetNGram() - retrieves a matchcode's n-gram setting.
  • SetNGram() - sets a matchcode's n-gram setting.
  • RenameMatchcode() - change a matchcode’s name.
  • DeleteMatchcode() - delete a matchcode.
Deprecated
  • GetStatusCode and GetCombinations have been deprecated as GetResults gives the user a single output property to retrieve this information.
Sample Code
Previous examples have been changed, with GetStatusCode and GetCombinations replaced by GetResults to reflect its use in retrieving output properties.
Hybrid examples have been added to most samples and interfaces.


Build 1459

Released August 2009

  • The MatchUp Object disc now includes all supported platforms: Linux, Windows, Solaris, and AIX.
  • The SetUserInfo and GetUserInfo functions now work with 'const char *' data types rather than a 'void *' for all platforms/languages.
  • Removed Set/GetSizeUserInfo functions, as there was no need for them anymore. The maximum size for the UserInfo string is 1024 bytes.
  • Moved the enums into their own file mdMatchupEnums.h to make the Object more compatible with our other objects. In doing this, some enumerations were changed (see below).
  • The MatchcodeComponentType enums now have 'Comp' appended at the end of each label.
  • The MatchcodeMappingType enumeration was renamed to MatchcodeMappingTarget. MatchcodeMappingTarget enums all have 'Type' appended at the end of each label.
  • The ProgramStatus enum NoError has been renamed to ErrorNone.


Build 1451

Released July 2009

Initial public release
Multi Platform, new version, new interface, incompatible with legacy MatchUP API. Please refer to product page for details.
Deprecated products
  • MatchUp API, DoubleTake API


MatchUp API Build History

Release 3.13

Released December 2006

  • Fixed a crash that sometimes occurred with some programming languages when a series of processes was launched one after another.
  • Examples: Added Visual C# example projects.
  • Matchcode Editor: Added Frequency Near matching.
  • Matchcode Editor: Added a check to ensure that a user couldn't exit unless all matchcodes had at least one component and one combination.
  • ".5 Main Street" wasn't being split properly.
  • Some 'fractions' should not be treated as such. For example, "123 State Route 28/18".
  • Processing: Phone numbers such as 999-999-9999 and 000-000-0000 are now interpreted as blank.


Release 3.12

Released July 2006

  • Processing: Matchcodes having a large key size (over 250 characters) could cause a crash. NOTE: Matchcodes having a large key size are usually a sign of a poorly designed matchcode.
  • Processing: Canadian addresses such as "101-38 Main St" should match "38 Main St Apt 101". In the US, the same address would match "101 Main St Apt 38".
  • Processing: Inferred matching (A=B, B=C, so A=C) has been improved so that: (a) Inferences are detected between differing file types (suppression, regular, etc), (b) Inferences are detected regardless of record ranking, and (c) The correct output record is selected.
  • Processing: Alphabetic PO Boxes like "PO Box AVG" are more likely to be split properly now.
  • Processing: "12 Avenue 29" is now split properly.
  • Processing: The phonetex representation of "P" (just the single letter) was incorrect.
  • Matchcode Editor: Fixed a refresh anomaly when working with a matchcode having a single component.
  • Processing: "85 State Road" wasn't being split properly.
  • Street Splitter: "816 W North Loop Rd", "3750 I 55 N", "876 FM 365 Rd", "15 FM 18", "56 E Loop 281", and "540 S Interstate 36 E" are now split properly.
  • Merge/Purge: Validate Zip matchcodes now better handle addresses having both a street address and PO Box.
  • Added "Henry", "Hank", "Chuck", "Chucky" and "Chuckie" to Nickname table.
  • Processing: Leading and trailing spaces in an e-mail's user name are now stripped.
  • Street Splitter: Certain street addresses with slashes are now handled better (as in "300 Oak St/PO Box 12").
  • Street Splitter: Addresses like "268 Hwy 202/31" are now split properly.
  • Street Splitter: Splitting of "Farm to Market" (FM) addresses is now handled better.
  • Street Splitter: Addresses with decimals were losing their decimal point.
  • Name Splitter: Ampersands (&) are now handled better.
  • Street Splitter: Added several MLK variations, BL (Boulevard or Building) and TE (Terrace).
  • Street Splitter: Improved the handling of bad addresses such as "15 Main St-18".
  • Street Splitter: Secondaries such as "Apt 15-58" are now converted into "1558" for the purposes of deduping.
  • Street Splitter: Sloppy addresses such as "15-1/2 Main St" now get a correct street number of "15.5".
  • Processing: Fixed a random crash that had been very difficult to isolate.
  • Matchcodes: Containment matching sometimes didn't consider the very last character of the shorter string in the comparison (i.e., "Kim" would be contained in "Kid").
  • Matchcodes: Tweaked behavior of Fast Near matching, especially with short strings.
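
The inferred matching item in this release (A=B, B=C, so A=C) describes a transitive grouping of records. One common way to compute such groups is a union-find structure, sketched below in Python. This is a generic technique chosen for illustration, not a description of how MatchUp implements inference, and the function name is hypothetical:

```python
def group_inferred(pairs):
    """Group record IDs so that directly or indirectly matched IDs share a group."""
    parent = {}

    def find(x):
        # Locate the representative of x's group, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the groups of every directly matched pair.
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect members under their representative.
    groups = {}
    for record in list(parent):
        groups.setdefault(find(record), set()).add(record)
    return sorted(sorted(members) for members in groups.values())


# A matches B and B matches C, so A, B and C end up in one group.
print(group_inferred([("A", "B"), ("B", "C"), ("D", "E")]))
```

Because the grouping depends only on which pairs matched, it gives the same result regardless of the order in which records are compared, which is consistent with point (b) in the item above.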


Release 3.11

Released September 2005

  • Street Splitting: Addresses with "St." in the street name aren't automatically converted to "Street", as in "100 St. Mary's St."
  • Street Splitting: Secondaries such as "Apt-18", "Apt 18GF", "Apt 18 GF" and "Apt GF 18" are now handled better.
  • Matchcode Editor: Components that will be used in determining clusters are displayed with a light green background. This knowledge can be used to build more optimal matchcodes. See concepts for more information about clustering.
  • Matchcode Editor: The Swap Match dialog will now use component labels if they are known.
  • Full Name Splitter: "Charles O. Leary" was winding up with a last name of "Oleary" instead of a Middle Initial of "O" and a Last Name of "Leary". The problem was specific to O' names that could appear with or without the O' prefix.
  • Government Name Splitter: Names with suffixes like "Public, John Q. MD" are now split correctly.


Release 3.10

Released April 2005

  • Matchcodes: State processing has been enhanced to convert full spellings to the 2-letter abbreviation, as well as fix some common Canadian problems ("QC" and "PQ").
  • Matchcodes: The "Stop at Word" option sometimes stopped one word too soon.
  • Swap Matching has been improved in the Half Swap configuration
  • Matchcodes: Full names such as "Mr.Johnson" (with a period but no space between the "Mr" and "Johnson") weren't being split properly.
  • Matchcodes: "Start at Position" is now performed before SoundEx, Phonetex, Consonants Only, Numerics Only, Vowels Only, and Alphas Only. For example, if "Start at Position" was 2 with "Consonants Only", "Ableson" would use "lsn", rather than "blsn".
  • The API's code has been synchronized with the recently released Double Take 3 (GUI). The previous version of the API was built in between Double Take 2 and Double Take 3. Although it had many of Double Take 3's features, its processing core was built on Double Take 2.
  • A Java interface has been added.
  • The address splitter interface has been exposed so that users can now parse street addresses.
  • A CASS interface has been added. This optional add-on allows you to CASS validate addresses.
  • CASS-validated Zip, Plus 4 and Street Address components are now available as Matchcode Components (with CASS add-on).
  • United Kingdom City, County and Post Code are now available as Matchcode Components.
  • The Custom component type has been added. You can use this component to perform table-based substitutions while building matchcode keys.
  • You now have more control over Swap Matching.
  • One Blank Field and Initial Only matching has been improved.
  • A debugging interface has been added to help developers troubleshoot problems.
  • Double Take can now automatically optimize your matchcode with the Optimize button in the Matchcode Editor.
  • You can import Double Take 2 matchcodes with the Import button in the Matchcode Editor.
  • The SQL Server 'bridge' DLL is no longer used; synchronizing this DLL with the API's DLL was often a problem.
  • The SQL Server Extended Stored Procedures have been re-tooled to better report errors.


Release 3.01

Released December 2002

  • Added function support for SQL Server 2000.
  • Corrected typo in help file regarding MP_FIRST confusion.
  • Corrected incorrect EM_COMBOn defines.


Release 3.00

Released November 2002

  • First Release