Workflow Data Preparation Tools
Workflow Data Preparation Tools support sampling, cleansing, and filtering of your data in Designer Cloud.
Item | Description |
---|---|
Auto Column | Use Auto Column to automatically change the column type and size for efficient storage of string data. |
Create Sample | Use Create Sample to split the input records into 2 or 3 random samples. |
Data Cleansing | Use Data Cleansing to fix common data quality issues. You can replace null values, remove punctuation, modify capitalization, and more. |
Filter | Use Filter to select data using a condition. |
Formula | Use Formula to create new columns, update columns, and use 1 or more expressions to perform a variety of calculations and operations. |
Generate Rows | Use Generate Rows to create new rows of data with an expression. |
Imputation | Use Imputation to clean up missing values in your data. |
Multi-Column Binning | Use Multi-Column Binning to tile or bin on multiple columns. |
Multi-Column Formula | Use Multi-Column Formula to create or update multiple columns using a single expression. |
Multi-Row Formula | Use Multi-Row Formula to create and update columns by using row data to create formulas. |
Oversample Column | Use Oversample Column to automatically create balanced samples from imbalanced data for use in statistical modeling. |
Random % Sample | Use Random % Sample to return a random sample of the incoming data stream with an expected number of rows. |
Row ID | Use Row ID to create a new column in the data and assign a unique identifier, which increments sequentially for each row in the data. |
Sample | Use Sample to limit the data stream to a specified number, percentage, or random set of rows. In addition, the Sample tool applies the selected configuration to the columns you want to group by. |
Select | Use Select to include, exclude, and reorder the columns of data that pass through your workflow. |
Select Rows | Use Select Rows to return the rows and ranges of rows that you specify, including discontinuous ranges of rows. This tool is useful for troubleshooting and sampling. |
Sort | Use Sort to arrange the rows in a table in alphanumeric order based on the values of the specified data fields. |
Tile | Use Tile to assign a value (tile) based on ranges in the data, using 1 of 3 methods that you specify. |
Unique | Use Unique to distinguish whether a row is unique or a duplicate by grouping on one or more specified columns, then sorting on those columns. |
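
To make a few of these descriptions concrete, the following plain-Python sketch imitates what Row ID, Filter, and Unique do conceptually. It is only an analogy, not Designer Cloud's implementation or API: these tools are configured in the workflow canvas, and the function names (`add_row_id`, `filter_rows`, `split_unique`), the `RowID` column name, and the sample data are hypothetical. The sketch also assumes that the first row seen for a key counts as the unique one and skips the sorting step that the Unique tool performs.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def add_row_id(rows: List[Row], column: str = "RowID", start: int = 1) -> List[Row]:
    """Row ID analogy: add a sequential identifier to a copy of each row."""
    return [{column: start + i, **row} for i, row in enumerate(rows)]


def filter_rows(rows: List[Row], condition: Callable[[Row], bool]) -> List[Row]:
    """Filter analogy: keep only the rows for which the condition holds."""
    return [row for row in rows if condition(row)]


def split_unique(rows: List[Row], key_columns: Sequence[str]) -> Tuple[List[Row], List[Row]]:
    """Unique analogy: separate first occurrences from later duplicates.

    Assumption: the first row seen for a key is treated as the unique one.
    """
    seen = set()
    unique: List[Row] = []
    duplicates: List[Row] = []
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)
            unique.append(row)
    return unique, duplicates


# Hypothetical sample data, for illustration only.
orders = [
    {"customer": "Ada", "amount": 120},
    {"customer": "Bo", "amount": 40},
    {"customer": "Ada", "amount": 75},
    {"customer": "Cy", "amount": 200},
]

with_ids = add_row_id(orders)
large = filter_rows(with_ids, lambda r: r["amount"] >= 100)
unique, duplicates = split_unique(with_ids, ["customer"])
```

Here `large` holds the rows with `RowID` 1 and 4, `unique` holds the rows with `RowID` 1, 2, and 4, and `duplicates` holds the second "Ada" row (`RowID` 3). Adding the identifier first makes it easy to trace each output row back to its position in the input.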