MatchUp Software:Excel/Access Troubleshooting Guide

From Melissa Data Wiki
Jump to navigation Jump to search

← MatchUp Software


Troubleshooting Tips When Processing an Access or Excel Table

Access and Excel are difficult formats to process because they are in a proprietary Microsoft format. Because of this, we are forced to use Microsoft's prescribed methods to access data. MatchUp uses one of two technologies for doing this: ADO and DAO. This document is intended to provide steps in working around the default behavior and providing MatchUp with formats it can handle.


Add an Autonumber field to your table.

One of the biggest problems with DAO/ADO is that these technologies were written with the assumption that people always access their data in a linear fashion, from start to end. While MatchUp does do this while building keys, it must access the data in a random sequence when it is performing the merge/purge operation. Another way to explain this is to say that DAO/ADO do not use record numbers. So when MatchUp tells DAO/ADO to move to record 58, DAO/ADO uses an awkward (and apparently buggy) algorithm to accomplish this. Adding an autonumber field essentially provides a record number field for your table, saving DAO/ADO from sweating out the computations.

CREATE AN AUTONUMBER IN EXCEL

1. Create a new column

2. Navigate to bottom row and highlight last cell

3. CTRL + SHIFT + Arrow (to highlight all active cells in that column)

4. Type in the formula.... .

auto2.png

...(note '-1' so first cell can have a header created)

Each cell should now have the results of the formula

5. Highlight column and select COPY

6. Select PASTE > PASTE VALUES

The cells will now contain the row number instead of the formula

7. Select and edit the first cell and create a column name.

Do not name the Column ‘ID’ as this is a used column name and has previously caused crashes for some of our users.

auto3.png

CREATING AN AUTONUMBER IN ACCESS

1. Open the table in Access, and select Design View

auto4.png

2. Create a new column (Do not name it ‘ID’), type AutoNumber

auto5.png

3. Set its properties to increment unique values

auto6.png

4. Select Data sheet view….

auto7.png

5. Review the new column…

auto8.png

Try the other technology.

There is no way to easily get the data access version, so we can’t even say “your data access engine is outdated, get version xxxx”. Also, different versions of Excel perform better with different versions of the data access engine, while the same version of Access might be the complete opposite. In Tools | User Settings | General: Interface, you can select either DAO or ADO. Often, changing this will do the trick. Note that ADO is usually slower than DAO.

adoDao.png

Edit Output Field Mapping

By the time you configure the Output Field Mapping Tab, MatchUp is creating the structure for the Output Table based on what you have in this screen. However, at this point it does not know the Output table type (you haven’t created it yet). Since MatchUp can create output tables of many different types, each with restrictions of what constitutes a valid structure, you must create an Output Field Mapping structure which conforms with a ‘common valid structure ruleset’ which any Output type can use.

ofm1.png

FIELD SIZES If you're working with a table with rather large fields (ie, over 200 characters), you may want to try reducing the sizes to a more manageable size and see what happens.

Excel most always reports field sizes as 255 characters, attempting to modify the column width in excel will not work. Be sure that you won't truncate any of your data before doing this.

FIELD COLUMN NAMES Since some database systems have restrictions on column names, MatchUp follows suit and will not create these Output Table types if the column names are seen by MatchUp as multi-word or have a length of over 10 characters.

Excel allows duplicate field names. MatchUp tries to do what it can with these, but sometimes comes up short. It's best just to fix the problem in the native DBMS before working with the file in MatchUp. And better database practice!

A good rule of thumb – if you look at only the first 10 characters of the column names, and there are more than one instance, or embedded spaces, the program may crash.

FIELD TYPES

If the table has a few 'oddball' fields, you may want to make a copy of the table and try removing the fields. Such fields would be non-character and non-numeric fields, and not critical to the deduping process or the desired output table.

If all fields are required in Output, you can try changing the output column type. For example, different versions of Excel and Access contain options for numeric data, logic fields and dates. If you find one of these to be problematic, try changing the output type to ‘fixed character’.

For a hands off, advanced approach, you can edit your .ini file…

Warning: This is an advanced topic. Proceed only if you know what you are doing!

When MatchUp needs to create a table, it normally uses data from the chart below to determine what DBMS (ie, 'native') data type should be used for a given MatchUp field type:

field1.png

However, like many things, one size does not fit all. For Excel, Access and SQL Server users, there exists a mechanism to override these native field types.

Regardless of the database engine, you will need to open your dt.ini file (it will be in the folder where MatchUp is installed) in Notepad. At the end of the file (after a blank line or two), add the section listed.

Access or Excel through DAO:

field2.png

Above, the black entry indicates the MatchUp field type you want to change, blue indicates the Access/Excel type you want to use, and green indicates your choices for that particular field type (in the dt.ini file, do not type in the green text!).

The available native data types are the very same ones that you would use in a CREATE TABLE statement; consult your Excel or Access documentation for exactly what each data type represents.

Access or Excel through ADO: Even though we are accessing the same table types as DAO, the native data types are sometimes different (direct your complaints to Microsoft for this).

field3.png

Above, the black entry indicates the MatchUp field type you want to change, blue indicates the Access/Excel type you want to use, and green indicates your choices for that particular field type (in the dt.ini file, do not type in the green text!).

The available native data types are the very same ones that you would use in a CREATE TABLE statement; consult your Excel or Access documentation for exactly what each data type represents.


Above, the black entry indicates the MatchUp field type you want to change, blue indicates the SQL Server type you want to use, and green indicates your choices for that particular field type (in the dt.ini file, do not type in the green text!).

The available native data types are the very same ones that you would use in a CREATE TABLE statement; consult your SQL Server documentation for exactly what each data type represents.

Notes:

  • These settings are global, and are not setup file specific.
  • If you omit any entry or specify an invalid one, the default will be used.

This setup structure will process, where the previous would crash for many files…

CASS Considerations

Unlike some other database types, MatchUp CASS process does not allow you to append columns to the Excel or Access table you are processing. You must select an existing column. So before processing, you must create the columns in the respective program, using the recommended troubleshooting guidelines above.

Especially with CASS (but also Merge Purge), make sure that you are allowing zero-length (aka, 'Allow NULLs') fields.

Choose an alternate format

If all else fails, converting the files to dBASE (or .csv) will work Our dBASE driver was written in-house and is the most stable and fastest database type available for output tables.

If your Output table must be returned in its native format, it is usually quicker to process in dbase and import the output table to the native program.

Although .csv is not as fast as dBASE, and does not allow you to use some of the advanced options (to write results back to the source table), it is a much more stable option than Excel or Access.

Fixing Excel Zip Codes

What's up with Excel's handling of Zip Codes?

Wouldn't we like to know... Excel is very quirky when working with Zip Codes. Sometimes it will get convert them all to numbers. Although that's fine in the midwest, up in the northeast one or more digit of the zip code can be lost. MatchUp does a good job of converting ‘excel-ed’ zip codes to their five digit format but problems can arise if there is a mix of zip5 and zip9 codes. Since excel will report that the field is numeric, the zip9 codes containing a dash will get wiped out, and MatchUp will see these records as not containing a zip code.

One way to handle this (sometimes this won't work!):

  1. Insert a blank column next to the Zip field
  2. On the first row of data, type =TEXT(C1,0), where C1 is the cell containing the zip code.
  3. Copy the function for the rest of the column (the C1 should correctly adjust to C2, C3, etc as you go down). Your northeast Zip Codes should now read correctly.
  4. Click at the very top of the column to highlight the entire row.
  5. Right-click and select Copy.
  6. Right-click and select Paste Special. Select Values.
  7. You can delete the original column of badly formatted Zip Codes.


Trouble Processing xlsx or accdb formats

MatchUp can not process xlsx nor accdb files when 64-bit msOffice is installed. The workaround is to install 32-bit msOffice or to process the files as a csv format.