Pentaho:Cleanser Tutorial

From Melissa Data Wiki
Revision as of 22:17, 7 September 2016 by Admin


The following steps guide you through the basic usage of the Generalized Cleanser for Pentaho.

Add Component

To add the Generalized Cleanser Component to your project, drag it onto the Data Flow screen. The component snaps into your workflow space.

PENT Cleanser Tutorial Component.png


Connect Input

Select a data flow source to supply your input data. Many formats can serve as sources, including Excel files, flat files, and Access Input data sources. Connect the source to the Generalized Cleanser Component by dragging the arrow from the source to the component.

PENT Cleanser Tutorial Source.png


Configure Component

Double-click the Generalized Cleanser Component to open its configuration interface.

Input Field Tab

Map the input fields and choose your cleansing operations.

PENT Cleanser InputFields.png
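
To make the idea of mapping fields to cleansing operations concrete, here is a minimal conceptual sketch in Python. It is not the component's API: the operation names (`trim`, `collapse_spaces`, `upper_case`), the `FIELD_OPERATIONS` mapping, and the sample field names are all hypothetical, chosen only to illustrate applying an ordered list of operations to each mapped input field.

```python
import re

# Hypothetical cleansing operations. These names are illustrative only and
# are not the operation names used by the Generalized Cleanser Component.
def trim(value: str) -> str:
    """Remove leading and trailing whitespace."""
    return value.strip()

def collapse_spaces(value: str) -> str:
    """Replace each run of whitespace with a single space."""
    return re.sub(r"\s+", " ", value)

def upper_case(value: str) -> str:
    """Convert the value to upper case."""
    return value.upper()

# Hypothetical mapping: input field name -> ordered cleansing operations.
FIELD_OPERATIONS = {
    "Name": [trim, collapse_spaces],
    "State": [trim, upper_case],
}

def cleanse_row(row: dict) -> dict:
    """Return a copy of the row with mapped fields cleansed in order."""
    cleaned = dict(row)
    for field, operations in FIELD_OPERATIONS.items():
        if cleaned.get(field) is None:
            continue  # field absent or empty: leave it untouched
        for operation in operations:
            cleaned[field] = operation(cleaned[field])
    return cleaned

if __name__ == "__main__":
    sample = {"Name": "  John   Smith ", "State": " ca", "Zip": "92688"}
    print(cleanse_row(sample))
```

Fields that are not mapped (such as `Zip` in the sample) pass through unchanged, which mirrors the idea that only the fields you map in the tab are cleansed.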


Output Filter Tab

Specify which groups or columns to include in the output.

PENT Cleanser OutputFilter.png


Connect Output

Add data destinations for the downstream output. Connect each output filter pin to its corresponding destination.

PENT Cleanser Tutorial Output.png


Save Settings

Click File and select Save Selected Items to save the project.

PENT Tutorial Save.png


Run Project

The project is now ready to run.