MatchUp Hub:Environment

From Melissa Data Wiki
Jump to navigation Jump to search

← MatchUp Hub

Have the Latest Build

Have the latest Build of the respective Melissa Data Solution.

We always recommend users use the most current build (version), which may contain new or improved features, code fixes, or updated functionality. Melissa data provides many ways to ensure you keep your components to the latest version.


Run on a Local Instance

VM support

Network Speed

Proprietary network configurations can vary greatly, therefore preventing us from making universal recommendations. The Best Practice is to run MatchUp as a local Installation of the application (calling program) with respect to the source data location. If your production environment requires independent application location with regard to source data, we recommend you first create local benchmarks and migrate to a production configuration -testing thoroughly along the way.

Memory management

It goes without saying that the more hardware – a faster processor, additional memory and a fast disk drive will produce faster throughput than a lesser machine.
The MatchUp data files are loaded into memory but given their size with respect to common memory configurations, memory considerations are negligible.
Your source data is not loaded into memory as a typical source database can be in the GB range.

Process Work Files

As MatchUp processes your database, it produces a matchkey for each record, sorts the keys for efficient deduping, and for some distributions, such as the ETL solutions, must keep track of source pass thru data. Using the default installation settings, these files will be located here:
C:\Users\MyUserName\AppData\Local\Temp
The work files are in the format of:
myKeyFileName.key - Key File Example
md29F8.002        - Temporary Sort File Example
mpAB6.tmp         - Passthrough File Example
This is the temp directory of the logged in User. For *nix platforms, the directory where the executable is being ran.
Although users can override this location, we do not recommend it, unless you are pointing this location to a fast, local drive with plenty of writable disk space and full read write permissions.


ETL: SSIS, Pentaho

VM

Please avoid utilizing the option Hot Swap CPU when hosting with SQL Server 2016 with VMWare. We discovered that this is against the best practices from VMWare (3.3.6 CPU Hot Plug). Engineers from both VMware and Microsoft has verified that this will lead to issues with thread contention. The setting MAXDOP if set to too high will also have an effect on the contention and SSIS processes/our components may hang especially with our Matchup Component.
As reported from our client, reports show that when querying SQL Server Numa in DMV (sys.dm_os_memory_nodes) only one node will be allocated 100% memory. After disabling Hot Swap CPU, SQL Server now correctly detects a single numa. Disabling Hot Swap CPU will increase performance and should prevent hanging issues from occurring in SSIS.
This Hot Swap CPU issue is only reported for SQL Server 2016 edition, which we may think is caused by the new cardinality estimator and automatic soft numa default features. The combination of new changes and incorrect setup in VMware may have caused the thread waits past a certain threshold.

SQL Server File Names vs Versions

File Name SQL Server Version Number
80 SQL Server 2000 8.00.xxxx
90 SQL Server 2005 9.00.xxxx
100 SQL Server 2008 10.00.xxxx
105 SQL Server 2008 R2 10.50.xxxx
110 SQL Server 2012 11.00.xxxx
120 SQL Server 2014 12.00.xxxx
130 SQL Server 2016 13.00.xxxx
140 SQL Server 2017 14.00.xxxx
150 SQL Server 2019 15.00.xxxx
Having a file with one of these names does not mean you have the corresponding SQL Server installed. Some files can be installed by newer versions of SQL Server, Visual Studio, and other tools.
To find your currently installed version of SQL Server:
  1. Go to Start Menu
  2. Select the newest version of Microsoft SQL Server 20xx
  3. Select Menu
  4. Select Configuration Tools > SQL Server Configuration Manager
  5. Select the SQL Server Services tab
  6. Sort by Service Type
  7. For each SQL Server type:
    1. Right-Click the SQL Server item
    2. Select Properties
    3. Select the Advanced tab
    4. Find the Version field.
This is the Microsoft SQL Server sub directory where the SSIS installer will install the MatchUp component and processor libraries.

What Versions of SQL Server/Visual Studio are Supported?

Using an unsupported version may no longer produce stable processing.
Melissa Data currently supports SQL Server versions 2012, 2014, 2016 and 2017. We also provide components for SQL Server 2005, 2008 and 2010 as part of the SSIS Installation, however, they are no longer supported.
Certain Microsoft SQL Server version are officially supported by specific Microsoft Visual Studio versions for our components. BusinessCoder and future new component will only be offered in SSIS 2010 and newer.

Officially Supported by Melissa Data

Multiple SQL installations

If you installed the components for one instance of SQL and are running another, you may have an older build or mismatched libraries and / or data files, which may result in sub optimal performance or component hanging.
If one version is installed and you attempt to create a project using another version, changing your SSIS component to run in compatibility mode will not ensure that the correct library is used at runtime.

SSIS Compatibility Issues

There can be compatibility issues after installing the correct version of our components with respect to your configured installation of SSIS and SQL. Please visit the following page to see the known issues.

64 bit Considerations for Integration Services

The MatchUp processing engine is a 64 bit library. The SSIS front-end user interface is a 32 bit application. You can see this in the task manager. See the following link for an article of some of the 32 bit and 64 bit considerations.

64 bit installations? Running in 64 bit mode?

The MatchUp processor is designed to run in 64 bit mode. Confirm that your configuration takes advantage in your project settings:
Project> Configuration > Run64BitRuntime = True
When processing takes place, open Task Manager – and confirm that you are running on 64 bit mode
See the following link for an article on the simplest way to ensure your SSIS project is running in 64 bit mode.

Cannot Debug SSIS packages in 64 bit mode

The following link has an article showing an unsupported workaround.

Component Option Configurations

  • Un-Needed Passthrough fields.
  • Golden Record/ Survivorship used.
  • Added Lookup stream process.
For these advanced configuration options, after the component has sorted the keys, deduped, and grouped the records, it will have to access all the cached temp file source data, and decide which columns to stream, roll-up, or use, to determine the golden record. All of which consume more processing time.
For source files with a large number of columns, we recommend that you configure your data source with a unique identifier and forego passing in columns that are unnecessary to the process. After processing is complete, canned ETL data tasks can easily use the unique identifier to link the Melissa Data results to the original data source record.
If the process requires only that all source records need to be streamed to output with a simple record disposition and group identifier, don’t configure Golden Record – the DupeGroup identifier and Result Code disposition are sufficient to query record groupings.