EXAMPLE - Statistical Functions
This example illustrates how you can apply statistical functions to your dataset. Calculations include average (mean), max, min, standard deviation, and variance.
Functions:
Item | Description |
---|---|
AVERAGE Function | Computes the average (mean) from all row values in a column or group. Input column can be of Integer or Decimal. |
MIN Function | Computes the minimum value found in all row values in a column. Input column can be of Integer, Decimal or Datetime. |
MAX Function | Computes the maximum value found in all row values in a column. Inputs can be Integer, Decimal, or Datetime. |
VAR Function | Computes the variance among all values in a column. Input column can be of Integer or Decimal. If no numeric values are detected in the input column, the function returns |
STDEV Function | Computes the standard deviation across all column values of Integer or Decimal type. |
NUMFORMAT Function | Formats a numeric set of values according to the specified number formatting. Source values can be a literal numeric value, a function returning a numeric value, or reference to a column containing an Integer or Decimal values. |
MODE Function | Computes the mode (most frequent value) from all row values in a column, according to their grouping. Input column can be of Integer, Decimal, or Datetime type. |
Source:
Students took a test and recorded the following scores. You want to perform some statistical analysis on them:
Student | Score |
---|---|
Anna | 84 |
Ben | 71 |
Caleb | 76 |
Danielle | 87 |
Evan | 85 |
Faith | 92 |
Gabe | 85 |
Hannah | 99 |
Ian | 73 |
Jane | 68 |
Transformation:
You can use the following transformations to calculate the average (mean), minimum, and maximum scores:
Transformation Name | |
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | AVERAGE(Score) |
Parameter: New column name | 'avgScore' |
Transformation Name | |
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | MIN(Score) |
Parameter: New column name | 'minScore' |
Transformation Name | |
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | MAX(Score) |
Parameter: New column name | 'maxScore' |
To apply statistical functions to your data, you can use the VAR
and STDEV
functions, which can be used as the basis for other statistical calculations.
Transformation Name | |
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | VAR(Score) |
Parameter: New column name | var_Score |
Transformation Name | |
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | STDEV(Score) |
Parameter: New column name | stdev_Score |
For each score, you can now calculate the variation of each one from the average, using the following:
Transformation Name | |
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | ((Score - avg_Score) / stdev_Score) |
Parameter: New column name | 'stDevs' |
Now, you want to apply grades based on a formula:
Grade | standard deviations from avg (stDevs) |
---|---|
A | stDevs > 1 |
B | stDevs > 0.5 |
C | -1 <= stDevs <= 0.5 |
D | stDevs < -1 |
F | stDevs < -2 |
You can build the following transformation using the IF
function to calculate grades.
Transformation Name | |
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | IF((stDevs > 1),'A',IF((stDevs < -2),'F',IF((stDevs < -1),'D',IF((stDevs > 0.5),'B','C')))) |
To clean up the content, you might want to apply some formatting to the score columns. The following reformats the stdev_Score
and stDevs
columns to display two decimal places:
Transformation Name | |
---|---|
Parameter: Columns | stdev_Score |
Parameter: Formula | NUMFORMAT(stdev_Score, '##.00') |
Transformation Name | |
---|---|
Parameter: Columns | stDevs |
Parameter: Formula | NUMFORMAT(stDevs, '##.00') |
Transformation Name | |
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | MODE(Score) |
Parameter: New column name | 'modeScore' |
Results:
Student | Score | modeScore | avgScore | minScore | maxScore | var_Score | stdev_Score | stDevs | Grade |
---|---|---|---|---|---|---|---|---|---|
Anna | 84 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 0.21 | C |
Ben | 71 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | -1.18 | D |
Caleb | 76 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | -0.64 | C |
Danielle | 87 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 0.54 | B |
Evan | 85 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 0.32 | C |
Faith | 92 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 1.07 | A |
Gabe | 85 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 0.32 | C |
Hannah | 99 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 1.82 | A |
Ian | 73 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | -0.96 | C |
Jane | 68 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | -1.50 | D |