MatchUp Hub:Environment

From Melissa Data Wiki
Revision as of 14:18, 2 October 2018 by Admin

Have the Latest Build

Have the latest build of the respective Melissa Data solution.

We always recommend that users run the most current build (version), which may contain new or improved features, code fixes, or updated functionality. Melissa Data provides many ways to keep your components on the latest version.


Run on a Local Instance

VM support

Network Speed

Proprietary network configurations vary greatly, which prevents us from making universal recommendations. The best practice is to run MatchUp as a local installation, with the application (calling program) local to the source data. If your production environment requires the application to be located separately from the source data, we recommend that you first create local benchmarks and then migrate to the production configuration, testing thoroughly along the way.
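A local benchmark of the kind recommended above can be as simple as timing one run and recording records per second, then repeating the same measurement after each step toward the production configuration. The following is a minimal sketch only: `measure_throughput` and the `process` callable are hypothetical names, and `process` stands in for whatever code invokes MatchUp in your calling program.

```python
import time


def measure_throughput(process, records):
    """Return records processed per second for one run of `process`.

    `process` is a hypothetical stand-in for the code that runs the dedupe
    job; it is not part of any Melissa Data API.
    """
    start = time.perf_counter()
    process(records)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small inputs.
    return len(records) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    sample = [{"id": i} for i in range(100_000)]
    rate = measure_throughput(lambda rows: sorted(r["id"] for r in rows), sample)
    print(f"{rate:,.0f} records/second")
```

Running the same input through the local and the remote configuration gives two comparable figures; the absolute numbers matter less than the difference between them.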

Memory management

More capable hardware (a faster processor, additional memory, and a fast disk drive) will produce faster throughput than a lesser machine.
The MatchUp data files are loaded into memory, but they are small relative to common memory configurations, so their memory footprint is negligible.
Your source data is not loaded into memory, because a typical source database can be in the GB range.

Process Work Files

As MatchUp processes your database, it produces a matchkey for each record, sorts the keys for efficient deduping, and, for some distributions such as the ETL solutions, must keep track of source pass-through data. With the default installation settings, these work files are located here:
C:\Users\MyUserName\AppData\Local\Temp
The work files are in the format of:
myKeyFileName.key - Key File Example
md29F8.002        - Temporary Sort File Example
mpAB6.tmp         - Passthrough File Example
This is the temp directory of the logged-in user. On *nix platforms, the work files are written to the directory from which the executable is run.
Although users can override this location, we do not recommend doing so unless you point it to a fast, local drive with plenty of writable disk space and full read/write permissions.
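Before overriding the work file location, it can help to confirm that the candidate directory has free space and is writable. The sketch below uses only the Python standard library; `check_work_dir` is a hypothetical helper, and the `min_free_gb` threshold is an arbitrary illustrative value, not a Melissa Data recommendation.

```python
import os
import shutil
import tempfile


def _can_write(directory):
    """Try to create (and automatically remove) a scratch file in `directory`."""
    try:
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        return False


def check_work_dir(path=None, min_free_gb=10.0):
    """Summarize whether `path` looks usable for work files.

    Defaults to the current user's temp directory. `min_free_gb` is an
    assumed threshold chosen for illustration only.
    """
    work_dir = path or tempfile.gettempdir()
    free_gb = shutil.disk_usage(work_dir).free / (1024 ** 3)
    return {
        "path": work_dir,
        "free_gb": round(free_gb, 2),
        "readable": os.access(work_dir, os.R_OK),
        "writable": _can_write(work_dir),
        "enough_space": free_gb >= min_free_gb,
    }


if __name__ == "__main__":
    print(check_work_dir())
```

The check does not measure drive speed or tell a local drive from a network share, so those still need to be verified separately.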


ETL: SSIS, Pentaho

VM

Please avoid using the Hot Swap CPU option when hosting SQL Server 2016 on VMware. Enabling it goes against VMware's best practices (3.3.6 CPU Hot Plug). Engineers from both VMware and Microsoft have verified that it leads to thread contention issues. Setting MAXDOP too high also contributes to the contention, and SSIS processes and our components, especially the MatchUp component, may hang.
One client reported that when querying the SQL Server NUMA configuration through the DMV sys.dm_os_memory_nodes, only one node was allocated 100% of the memory. After Hot Swap CPU was disabled, SQL Server correctly detected a single NUMA node. Disabling Hot Swap CPU will increase performance and should prevent hanging issues in SSIS.
The Hot Swap CPU issue has been reported only for SQL Server 2016, and we suspect it is caused by the new cardinality estimator and the automatic soft-NUMA default in that version. The combination of these changes and an incorrect VMware setup may have caused thread waits to exceed a certain threshold.
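The symptom described above (several memory nodes reported, with one holding all of the memory) can be checked from the rows returned by sys.dm_os_memory_nodes. The sketch below assumes the rows have already been fetched with your own SQL Server client as (memory_node_id, pages_kb) pairs, and that pages_kb is the measure of interest. The function names and the 0.99 threshold are illustrative assumptions.

```python
# Query to run with your own SQL Server client; it is not executed here.
MEMORY_NODES_QUERY = "SELECT memory_node_id, pages_kb FROM sys.dm_os_memory_nodes;"


def memory_share_by_node(rows):
    """Map each memory_node_id to its fraction of the total pages_kb."""
    total = sum(pages_kb for _, pages_kb in rows)
    if total == 0:
        return {node_id: 0.0 for node_id, _ in rows}
    return {node_id: pages_kb / total for node_id, pages_kb in rows}


def looks_unbalanced(rows, threshold=0.99):
    """True when several nodes are reported but one holds nearly all memory.

    `threshold` is an assumed cut-off for "nearly all", chosen for illustration.
    """
    shares = memory_share_by_node(rows)
    return len(shares) > 1 and max(shares.values()) >= threshold


if __name__ == "__main__":
    sample_rows = [(0, 8_000_000), (1, 0)]  # made-up values for illustration
    print(memory_share_by_node(sample_rows), looks_unbalanced(sample_rows))
```

A single reported node never counts as unbalanced here, which matches the expected state after Hot Swap CPU is disabled.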

Supported SQL, VS, and SSDT / BIDS compatibility

Using an unsupported version may result in unstable processing.
Melissa Data currently supports SQL Server versions 2012, 2014, 2016 and 2017. We also provide components for SQL Server 2005, 2008 and 2010 as part of the SSIS installation; however, they are no longer supported.
Certain Microsoft SQL Server versions are officially supported with specific Microsoft Visual Studio versions for our components. BusinessCoder and future new components will only be offered in SSIS 2010 and newer.

Officially Supported by Melissa Data

Multiple SQL installations

If you installed the components for one instance of SQL Server and are running another, you may have an older build or mismatched libraries and/or data files, which may result in suboptimal performance or the component hanging.
If one version is installed and you attempt to create a project using another version, changing your SSIS component to run in compatibility mode will not ensure that the correct library is used at runtime.


64 bit installations? Running in 64 bit mode?

The MatchUp processor is designed to run in 64-bit mode. Confirm in your project settings that your configuration takes advantage of this:
Project > Configuration > Run64BitRuntime = True
While processing is running, open Task Manager and confirm that the process is running in 64-bit mode.

Component Option Configurations

  • Unneeded passthrough fields.
  • Golden Record / Survivorship used.
  • Added Lookup stream process.
For these advanced configuration options, after the component has sorted the keys, deduped, and grouped the records, it must access all of the cached source data in the temp files and decide which columns to stream, roll up, or use to determine the golden record, all of which consumes more processing time.
For source files with a large number of columns, we recommend that you configure your data source with a unique identifier and forego passing in columns that are unnecessary to the process. After processing is complete, canned ETL data tasks can easily use the unique identifier to link the Melissa Data results to the original data source record.
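The link-back step described above amounts to a keyed join between the slim result set and the original source rows. Here is a minimal sketch using plain Python dictionaries. The `RecordID` column name, the `attach_results` helper, and the sample values are hypothetical, and in an ETL package the same join would normally be done with a standard merge or lookup task.

```python
def attach_results(source_rows, result_rows, key="RecordID"):
    """Merge match results onto source rows that share the same key value.

    `key` names the unique identifier column; "RecordID" is an assumed name.
    Source rows with no matching result are returned unchanged.
    """
    results_by_id = {row[key]: row for row in result_rows}
    merged = []
    for src in source_rows:
        result = results_by_id.get(src[key], {})
        merged.append({**src, **result})
    return merged


if __name__ == "__main__":
    # Wide source rows stay in the source system; only the key is passed in.
    source = [
        {"RecordID": 1, "Name": "Ann Lee", "Notes": "long free text"},
        {"RecordID": 2, "Name": "Anne Lee", "Notes": "more free text"},
    ]
    # Placeholder result values, not real MatchUp output.
    results = [
        {"RecordID": 1, "DupeGroup": 1},
        {"RecordID": 2, "DupeGroup": 1},
    ]
    for row in attach_results(source, results):
        print(row)
```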
If the process only requires that all source records be streamed to output with a simple record disposition and group identifier, don’t configure Golden Record; the DupeGroup identifier and Result Code disposition are sufficient to query record groupings.
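As an illustration of querying record groupings from the DupeGroup identifier alone, the sketch below collects output rows by group and keeps the groups that contain more than one record. The row layout, the `RecordID` column, and both function names are assumptions made for the example; only the DupeGroup field name comes from the text above.

```python
from collections import defaultdict


def group_by_dupe_group(rows, group_field="DupeGroup"):
    """Collect rows into lists keyed by their group identifier."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(row)
    return dict(groups)


def duplicate_groups(rows, group_field="DupeGroup"):
    """Keep only the groups that contain more than one record."""
    grouped = group_by_dupe_group(rows, group_field)
    return {gid: members for gid, members in grouped.items() if len(members) > 1}


if __name__ == "__main__":
    output_rows = [  # made-up rows for illustration
        {"RecordID": 1, "DupeGroup": 1},
        {"RecordID": 2, "DupeGroup": 1},
        {"RecordID": 3, "DupeGroup": 2},
    ]
    for gid, members in duplicate_groups(output_rows).items():
        print(gid, [m["RecordID"] for m in members])
```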