EXAMPLE - LIST and UNIQUE Function
This example demonstrates you to extract values from one column of an array into a new column.
Functions:
Item | Description |
---|---|
LIST Function | Extracts the set of values from a column into an array stored in a new column. This function is typically part of an aggregation. |
UNIQUE Function | Extracts the set of unique values from a column into an array stored in a new column. This function is typically part of an aggregation. |
DATEFORMAT Function | Formats a specified Datetime set of values according to the specified date format. Source values can be a reference to a column containing Datetime values. |
You have the following set of orders for two months, and you are interested in identifying the set of colors that have been sold for each product for each month and the total quantity of product sold for each month.
Source:
OrderId | Date | Item | Qty | Color |
---|---|---|---|---|
1001 | 1/15/15 | Pants | 1 | red |
1002 | 1/15/15 | Shirt | 2 | green |
1003 | 1/15/15 | Hat | 3 | blue |
1004 | 1/16/15 | Shirt | 4 | yellow |
1005 | 1/16/15 | Hat | 5 | red |
1006 | 1/20/15 | Pants | 6 | green |
1007 | 1/15/15 | Hat | 7 | blue |
1008 | 4/15/15 | Shirt | 8 | yellow |
1009 | 4/15/15 | Shoes | 9 | brown |
1010 | 4/16/15 | Pants | 1 | red |
1011 | 4/16/15 | Hat | 2 | green |
1012 | 4/16/15 | Shirt | 3 | blue |
1013 | 4/20/15 | Shoes | 4 | black |
1014 | 4/20/15 | Hat | 5 | blue |
1015 | 4/20/15 | Pants | 6 | black |
Transformation:
To track by month, you need a column containing the month value extracted from the date:
Transformation Name | |
---|---|
Parameter: Columns | Date |
Parameter: Formula | DATEFORMAT(Date, 'MMM yyyy') |
You can use the following transform to check the list of unique values among the colors:
Transformation Name | |
---|---|
Parameter: Row labels | Date |
Parameter: Values | unique(Color, 1000) |
Parameter: Max number of columns to create | 10 |
Date | unique_Color |
---|---|
Jan 2015 | ["green","blue","red","yellow"] |
Apr 2015 | ["brown","blue","red","yellow","black","green"] |
Delete the above transform.
You can aggregate the data in your dataset, grouped by the reformatted Date
values, and apply the LIST
function to the Color
column. In the same aggregation, you can include a summation function for the Qty
column:
Transformation Name | |
---|---|
Parameter: Row labels | Date |
Parameter: Values | list(Color, 1000),sum(Qty) |
Parameter: Max number of columns to create | 10 |
Results:
Date | list_Color | sum_Qty |
---|---|---|
Jan 2015 | ["green","blue","blue","red","green","red","yellow"] | 28 |
Apr 2015 | ["brown","blue","red","yellow","black","blue","black","green"] | 38 |