SSIS:Generalized Cleanser:Component

From Melissa Data Wiki





Cleansing Details

Cleansing details are the operations you create to cleanse your data. The following operations are available:

Case
Cleanses the casing of your data (e.g. MeLiSsA DaTA → Melissa Data)
Punctuation
Cleanses the punctuation of your data
Expression
Replaces your data with the result of an expression
Regular Expression Search Replace
Searches for a regular expression and replaces each match with another expression or data (e.g. Resumé → Resum)

Adding a Search & Replace Table

The Text Search Replace operation can use a Search & Replace table.
The Search & Replace table is a dictionary of the values to search for and the values to replace them with. During processing, Generalized Cleanser uses the table to replace each search value it finds with the corresponding replacement value. The table file must adhere to the following format:
  • Each line contains the regular search expression, followed by a TAB, the replace expression, and a CR/LF.
For example, a table might contain vehicle models to search for and the values to replace them with. To add a table:
  1. Select the Source Field you would like to apply the Search & Replace table to.
  2. Press the + sign to add an operation and select the Text Search Replace operation.
  3. Select the Use Search & Replace Table File option.
  4. Enter the path to the Search & Replace table, or select it using the Folder option.
  5. Once the table is added and the options are selected, press OK.
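The table format described above can be sketched in Python to make it concrete. This is only an illustration, not the component's implementation: the function names `parse_table` and `apply_table` and the sample table contents are hypothetical, the first column is treated as a regular expression as the format description suggests, and Python's regular expression syntax may differ from the syntax the component accepts.

```python
import re

def parse_table(text):
    """Parse 'search<TAB>replace' lines into (compiled pattern, replacement) pairs."""
    rules = []
    for line in text.splitlines():  # splitlines() also handles CR/LF line endings
        if not line.strip():
            continue  # skip blank lines
        search, _, replace = line.partition("\t")
        rules.append((re.compile(search), replace))
    return rules

def apply_table(value, rules):
    """Apply every rule to the value, in table order."""
    for pattern, replacement in rules:
        value = pattern.sub(replacement, value)
    return value

# Hypothetical table contents: vehicle model spellings and their replacements.
SAMPLE_TABLE = "\\bF150\\b\tF-150\r\n\\bCRV\\b\tCR-V\r\n"

rules = parse_table(SAMPLE_TABLE)
print(apply_table("Ford F150", rules))
print(apply_table("Honda CRV", rules))
```

Each line of the sample maps one search expression to one replacement, so adding a new correction only requires appending a line to the table file.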
Text Search Replace
Searches for a string and replaces it with another string (e.g. Volvo → Toyota)


The Generalized Cleanser takes order of operations into account: it executes all operations in a rule in order, from top to bottom. Once you have finished creating your operations, we suggest saving them in a rule with a name and a short description for future use.
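To see why the order matters, consider this minimal Python sketch. It is not the component's code: `proper_case`, `text_replace`, and `run_rule` are hypothetical stand-ins for the Case and Text Search Replace operations, and the replacement is assumed to be case-sensitive purely for the sake of the illustration.

```python
def proper_case(value):
    # Stand-in for the Case operation: capitalize the first letter of each word.
    return " ".join(word.capitalize() for word in value.split(" "))

def text_replace(search, replacement):
    # Stand-in for Text Search Replace: literal, case-sensitive substitution (an assumption).
    return lambda value: value.replace(search, replacement)

def run_rule(value, operations):
    # Execute the operations in order, from the top of the list to the bottom.
    for operation in operations:
        value = operation(value)
    return value

replace_then_case = [text_replace("volvo", "toyota"), proper_case]
case_then_replace = [proper_case, text_replace("volvo", "toyota")]

print(run_rule("volvo sedan", replace_then_case))  # Toyota Sedan
print(run_rule("volvo sedan", case_then_replace))  # Volvo Sedan
```

In the second ordering the casing step runs first, so the lowercase search string no longer matches and the replacement never happens.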


Triggers

There are three triggers you can use to determine whether a rule should be applied to a record:

None
Triggers on every record.
Expression
Only trigger when a record matches this expression.
Regular Expression
Only trigger when a record matches this regular expression.
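The three triggers can be pictured as predicates that decide whether a rule runs for a given record. The Python sketch below is an illustration under assumptions: the names `trigger_none`, `trigger_expression`, `trigger_regex`, and `apply_rule` are hypothetical, an Expression trigger is modeled as an ordinary Python function rather than an SSIS expression, and the regular expression is assumed to be tested against a single field.

```python
import re

def trigger_none():
    # None: the rule runs on every record.
    return lambda record: True

def trigger_expression(predicate):
    # Expression: the rule runs only when the predicate holds for the record.
    return predicate

def trigger_regex(field, pattern):
    # Regular Expression: the rule runs only when the field matches the pattern.
    compiled = re.compile(pattern)
    return lambda record: compiled.search(str(record.get(field, ""))) is not None

def apply_rule(record, field, trigger, operation):
    # Return a cleansed copy when the trigger fires; otherwise return the record unchanged.
    if trigger(record):
        return {**record, field: operation(record[field])}
    return record

record = {"Make": "volvo", "Year": 1998}

always = apply_rule(record, "Make", trigger_none(), str.upper)
recent_only = apply_rule(record, "Make", trigger_expression(lambda r: r["Year"] >= 2000), str.upper)
starts_with_vol = apply_rule(record, "Make", trigger_regex("Make", r"^vol"), str.upper)

print(always["Make"], recent_only["Make"], starts_with_vol["Make"])
```

Here the first and third triggers fire for the sample record, while the expression trigger does not because the year is earlier than 2000.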