Workflow Data Preparation Tools

Workflow Data Preparation Tools support sampling, cleansing, and filtering of your data in Designer Cloud.

Item

Description

Auto Column Tool

Use Auto Column to automatically change the column type and size for efficient storage of string data.

Create Samples Tool

Use Create Samples to split the input records into 2 or 3 random samples.

Data Cleansing Tool

Use Data Cleansing to fix common data quality issues. You can replace null values, remove punctuation, modify capitalization, and more.

Filter Tool

Use Filter to select data using a condition.

Formula Tool

Use Formula to create new columns, update columns, and use 1 or more expressions to perform a variety of calculations and operations.

Generate Rows Tool

Use Generate Rows to create new rows of data with an expression.

Imputation Tool

Use Imputation to clean up missing values in your data.

Multi-Column Binning Tool

Use Multi-Column Binning to tile or bin on multiple columns.

Multi-Column Formula Tool

Use Multi-Column Formula to create or update multiple columns using a single expression.

Multi-Row Formula Tool

Use Multi-Row Formula to create or update columns with formulas that use data from other rows.

Oversample Column Tool

Use Oversample Column to automatically create balanced samples from imbalanced data for use in statistical modeling.

Random % Sample Tool

Use Random % Sample to return a random sample of the incoming data stream that contains an expected number of rows.

Row ID Tool

Use Row ID to create a new column in the data and assign a unique identifier, which increments sequentially for each row in the data.

Sample Tool

Use Sample to limit the data stream to a specified number, percentage, or random set of rows. If you select columns to group by, the Sample tool applies the selected configuration to each group.

Select Tool

Use Select to include, exclude, and reorder the columns of data that pass through your workflow.

Select Rows Tool

Use Select Rows to return the rows and ranges of rows that you specify, including discontinuous ranges. This tool is useful for troubleshooting and sampling.

Sort Tool

Use Sort to arrange the rows in a table in alphanumeric order based on the values of the specified data fields.

Tile Tool

Use Tile to assign a value (tile) based on ranges in the data. You specify 1 of 3 methods for assigning tiles.

Unique Tool

Use Unique to distinguish whether a row is unique or a duplicate by grouping on one or more specified columns, then sorting on those columns.
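
To make a few of the descriptions above more concrete, the following is a minimal Python sketch of what Row ID, Filter, and Unique do conceptually. It is not Designer Cloud code and does not reflect how the tools are implemented. The function names (`add_row_id`, `filter_rows`, `split_unique`), the `RowID` column name, the starting value of 1, and the choice to treat the first occurrence of each key as the unique row are all assumptions made for illustration.

```python
def add_row_id(rows, column="RowID", start=1):
    """Return copies of the rows with a sequential identifier column added.

    The column name and starting value are illustrative assumptions.
    """
    return [{column: i, **row} for i, row in enumerate(rows, start=start)]


def filter_rows(rows, condition):
    """Return only the rows for which the condition is true."""
    return [row for row in rows if condition(row)]


def split_unique(rows, key_columns):
    """Split rows into unique rows and duplicates based on the key columns.

    Assumption: the first row seen for each key counts as unique and
    every later row with the same key counts as a duplicate.
    """
    seen = set()
    unique, duplicates = [], []
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)
            unique.append(row)
    return unique, duplicates


# Hypothetical sample data.
rows = [
    {"customer": "A", "amount": 10},
    {"customer": "B", "amount": 25},
    {"customer": "A", "amount": 40},
]

with_ids = add_row_id(rows)
large = filter_rows(with_ids, lambda r: r["amount"] > 20)
unique, duplicates = split_unique(with_ids, ["customer"])

print([r["RowID"] for r in large])
print([r["RowID"] for r in unique], [r["RowID"] for r in duplicates])
```

In this sketch, adding the row ID first lets you trace which input rows end up in each result, which mirrors a common reason to place Row ID early in a workflow.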