Category
Function
Calculate statistics on data associated with a categorical component
Syntax
statistics = CategoryStatistics(input, operation, category, data, lookup);
Inputs
Name | Type | Default | Description |
---|---|---|---|
input | field | (none) | field for which to compute statistics |
operation | string | "count" | operation to perform ("count", "mean", "sd", "var", "min", "max") |
category | string | "data" | component with categorical values |
data | string | "data" | data component for statistics |
lookup | integer, string, value list | "category lookup" | lookup component |
Outputs
Name | Type | Description |
---|---|---|
statistics | field | field whose "data" component contains the requested statistics and whose "positions" component contains the category values |
Functional Details
input | field containing the categorical and data components |
operation | calculation to perform |
category | component with categorical values. This component must be an integer type (int, ubyte, ...) |
data | data component for statistics. This component must be scalar. |
lookup | lookup component (optional) |
CategoryStatistics calculates statistics on a scalar component associated with a categorical component. If the operation is "count", the data component is ignored and the number of entries in each category is calculated. The result corresponds to a histogram of the unique values in the categorical component.
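The "count" operation can be illustrated outside of Data Explorer with a short Python sketch. This is not the module's implementation; the function name category_counts, its n_categories argument, and the sample values are hypothetical and chosen only for illustration.

```python
from collections import Counter


def category_counts(categories, n_categories):
    """Return the number of entries for each category index 0..n_categories-1."""
    counts = Counter(categories)
    # Categories that never occur get a count of 0.
    return [counts.get(i, 0) for i in range(n_categories)]


# Hypothetical categorical component with three categories (0, 1, 2).
categories = [2, 0, 2, 1, 2]
print(category_counts(categories, 3))  # [1, 1, 3]
```

No data component appears in the sketch, which mirrors the statement that "count" ignores it.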
For example, suppose input is a field with a component "state" containing the entries {1,0,1,2,3}, a component "state lookup" containing the entries {"CA", "NY", "PA", "VA"}, and a component "sales" containing the entries {1.2,1.0,1.4,1.7,1.8}. Then CategoryStatistics(input,"mean","state","sales") produces an output field whose "positions" component contains the indices {0,1,2,3} and whose "data" component contains the mean value of sales for each state, that is {1.0,1.3,1.7,1.8}. State 1 occurs twice, with sales 1.2 and 1.4, so its mean is 1.3.
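The arithmetic of this example can be checked with a minimal Python sketch. It only reproduces the per-category mean using the values given above; the function name category_mean, its n_categories argument, and the use of None for empty categories are assumptions made for illustration, not behavior documented for the module.

```python
def category_mean(categories, values, n_categories):
    """Return the mean of `values` for each category index 0..n_categories-1."""
    sums = [0.0] * n_categories
    counts = [0] * n_categories
    for c, v in zip(categories, values):
        sums[c] += v
        counts[c] += 1
    # Assumption: a category with no entries has no defined mean, marked as None.
    return [s / k if k else None for s, k in zip(sums, counts)]


state = [1, 0, 1, 2, 3]              # "state" component from the example
sales = [1.2, 1.0, 1.4, 1.7, 1.8]    # "sales" component from the example

positions = list(range(4))           # categorical indices 0..3
data = [round(m, 6) for m in category_mean(state, sales, 4)]
print(positions)  # [0, 1, 2, 3]
print(data)       # [1.0, 1.3, 1.7, 1.8]
```

The rounding is there only to hide floating-point noise in the printed result.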
The output of CategoryStatistics is a field with a "positions" component corresponding to the categorical indices, and a "data" component corresponding to the requested statistics. The "positions" component will consist of the integers 0 to N-1, where N can be determined in a number of ways:
Components
Creates an output field with a "positions" component representing the categorical indices, and a "data" component containing the requested statistics. Creates a "categoryname lookup" component if a lookup table is specified using the lookup parameter.
Example Visual Programs
Duplicates.net, Zipcodes.net
See Also
Categorize, Statistics, Lookup