Pentaho/Contact Zone:Cleanser Tutorial
The following steps walk through the basic usage of the Generalized Cleanser Component.
Add Component
To add the Generalized Cleanser Component to your project, drag it onto the Data Flow screen. The component snaps into your workflow space.
Connect Input
Select a data flow source to provide your input data. Many formats can serve as sources, including Excel files, flat files, and Access Input data sources. Connect the source to the Generalized Cleanser Component by dragging the arrow from the source to the component.
Configure Component
Double-click the Generalized Cleanser Component to open its configuration interface.
Input Field Tab
Map the input fields and choose your cleansing operations.
Output Filter Tab
Specify which groups or columns to include in the output.
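To make the two tabs more concrete, the following is a minimal conceptual sketch in Python of what mapping input fields to cleansing operations and then filtering the output columns amounts to. It is not the Generalized Cleanser API: the field names (`name`, `city`, `notes`), the operations (`trim`, `title_case`), and the structures `FIELD_OPERATIONS` and `OUTPUT_COLUMNS` are all hypothetical and chosen only for illustration.

```python
# Conceptual sketch only. This does NOT call the Generalized Cleanser
# Component; every name below is a hypothetical stand-in.

def trim(value: str) -> str:
    """Strip leading/trailing whitespace and collapse internal runs."""
    return " ".join(value.split())

def title_case(value: str) -> str:
    """Capitalize the first letter of each word."""
    return value.title()

# Analogue of the Input Field tab: each input field is mapped to an
# ordered list of cleansing operations.
FIELD_OPERATIONS = {
    "name": [trim, title_case],
    "city": [trim, str.upper],
}

# Analogue of the Output Filter tab: only these columns are passed on.
OUTPUT_COLUMNS = ["name", "city"]

def cleanse_row(row: dict) -> dict:
    """Apply the mapped operations, then keep only the selected columns."""
    cleaned = dict(row)
    for field, operations in FIELD_OPERATIONS.items():
        value = cleaned.get(field)
        if value is None:
            continue  # field missing from this row; leave it untouched
        for operation in operations:
            value = operation(value)
        cleaned[field] = value
    return {column: cleaned.get(column) for column in OUTPUT_COLUMNS}

rows = [{"name": "  jane   DOE ", "city": " boston", "notes": "x"}]
cleaned_rows = [cleanse_row(row) for row in rows]
print(cleaned_rows)
```

In this sketch the `notes` column is dropped because it is not listed in `OUTPUT_COLUMNS`, which mirrors choosing only some columns on the Output Filter tab.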
Connect Output
Add data destinations for the downstream output. Connect each output filter pin to its corresponding output destination.
Save Settings
Click File and select Save Selected Items to save the project.
Run Project
The project is now ready to run.