Customized computation is a powerful feature provided by RATH allowing you to flexibly edit your data with regular expressions.
Customized computation console
After importing the data from your selected data source, you can click on the Customized Computation button to open the console.
The Customized Computation Console has an editor on the upper left side, where you can enter regular expressions. RATH shows suggestions as you type. To accept a suggestion, click on it or press the Tab key.
If an expression is invalid, an error message will be displayed in the lower left corner. If the expression is valid, the output preview (Distribution Charts) of the column will be updated in real-time on the right.
Customized Computation expressions refer to a field (either original or extended) by its field ID. In the editor, you can type the name of the field instead. RATH matches it to the corresponding field, and you can then select the auto-completion to insert the field ID.
Every field referenced in a Customized Computation expression has one of the following three data types:
set type: an ordered number type, representing an ordinal number, not involved in mathematical operations
group type: discrete number type, involved in mathematical operations
collection type: string type
The operators available in a Customized Computation expression are strongly tied to the data type. If necessary, use the corresponding conversion operator to transform the column first (this constructs a new column instead of changing the original column).
Customized Computation expressions are composed of operators, field references, and literals. Operators are the core of Customized Computation's functionality. An expression can nest operators, and it must generate and export at least one new field. At the outermost level, an expression can separate multiple calculation statements with commas (",") to quickly create several independent fields. Do not use semicolons in expressions.
A DateTime object is a special kind of object returned by the
A direct export will generate a field with a timestamp (column of type
It can also be sliced to construct new single or multiple columns.
The slicing syntax is
<datetime object>.<dimension tag>.
Valid dimension notations include
Operators are functions that operate on fields or other objects.
Operator identifiers start with a $ sign.
The calling syntax for an operator is
<operator>(<parameter 1>, <parameter 2>...).
For a new extended column generated by a calculation, add the out keyword before the calculation statement to export it.
The out operator can be placed at any level of a nested operation.
**Customized Computation expressions must contain at least one out statement.**
You can add a word without special characters after the out keyword as the name of the exported field.
In particular, the name of a field generated by the four arithmetic operations must be explicitly declared.
$set(group|collection) -> set
Converts a column of a non-set type into a column of type set.
$group(set|collection) -> group
Converts a column of a non-group type into a column of type group. Useful when performing mathematical operations on fields (you need to be sure that the operations make sense).
$nominal(set|group) -> collection
Converts a column of a non-collection type into a column of type collection.
Ordinal number generation
$id() -> set
Generate IDs starting from 1.
$order(set|group) -> set
Generates the mathematical order (starting at 1) of all rows on a field.
$dict(collection) -> set
Generates the lexicographic order (starting at 1) of all rows on a field.
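The ordinal operators above can be pictured with a small Python sketch. It is only an illustration of the described behavior, not RATH's implementation. The function names `row_ids` and `order` are hypothetical, and the handling of ties (equal values get distinct ranks in row order) is an assumption the text does not confirm.

```python
def row_ids(n):
    """Sketch of $id(): IDs 1..n, one per row."""
    return list(range(1, n + 1))


def order(values):
    """Sketch of $order / $dict: 1-based rank of each row's value.

    Numbers compare mathematically and strings lexicographically,
    so the same helper illustrates both operators.
    Assumption: ties are broken by row position (stable sort).
    """
    by_value = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, row in enumerate(by_value, start=1):
        ranks[row] = rank
    return ranks


print(row_ids(3))
print(order([30, 10, 20]))
print(order(["pear", "apple", "fig"]))
```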
$inset(group) -> group
Standardize a field to the interval -1 ~ 1.
$bound(group) -> group
Standardize a field to the interval 0 ~ 1.
$normalize(group) -> group
Normalize a field using Z-score.
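As a rough illustration of the three scaling operators, here is a Python sketch. It is not RATH's code. It assumes that $bound and $inset are linear min-max rescalings and that $normalize uses the population standard deviation. None of these details is stated in the text, and the function names are hypothetical.

```python
import math


def bound(values):
    """Min-max scale a column to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant column has no spread; mapping it to 0.0 is an arbitrary choice.
    return [(v - lo) / span if span else 0.0 for v in values]


def inset(values):
    """Scale a column to [-1, 1] by stretching the [0, 1] result."""
    return [2.0 * b - 1.0 for b in bound(values)]


def normalize(values):
    """Z-score: subtract the mean, divide by the (population) std deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std if std else 0.0 for v in values]


print(bound([0, 5, 10]))
print(inset([0, 5, 10]))
print(normalize([1, 2, 3]))
```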
$log(group) -> group,
$log(group, JS.number) -> group
Logarithmic mapping. A base can be provided as the second argument; the default is the natural logarithm.
$log2(group) -> group
$log10(group) -> group
$log1p(group) -> group
Equivalent to $log(group + 1).
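The logarithm family can be mirrored in Python as follows. This is a sketch of the mathematics only. The helper names are hypothetical, and how RATH treats zero or negative inputs is not covered by the text, so the example uses positive values only.

```python
import math


def log(values, base=math.e):
    """Logarithm with an optional base; natural logarithm by default."""
    return [math.log(v, base) for v in values]


def log2(values):
    return [math.log2(v) for v in values]


def log10(values):
    return [math.log10(v) for v in values]


def log1p(values):
    """log(1 + v), computed in a way that stays accurate for small v."""
    return [math.log1p(v) for v in values]


print(log([1.0, math.e]))
print(log2([1.0, 8.0]))
print(log10([1.0, 1000.0]))
print(log1p([0.0, 1.0]))
```

Dedicated base-2 and base-10 functions are generally preferable to passing the base explicitly, because dividing two floating-point logarithms can introduce small rounding errors.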
$sigmoid(group) -> group
$ReLU(group) -> group
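The page lists $sigmoid and $ReLU without a description. Assuming they follow the standard definitions of these functions (an assumption, since the text does not confirm it), their effect on a column can be sketched in Python like this. The function names are illustrative.

```python
import math


def sigmoid(values):
    """Standard logistic function: 1 / (1 + e^-x), output in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def relu(values):
    """Rectified linear unit: negative values become 0, others pass through."""
    return [max(0.0, v) for v in values]


print(sigmoid([-2.0, 0.0, 2.0]))
print(relu([-2.0, 0.0, 3.0]))
```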
$isNaN(set|group) -> collection
Returns "1" if the row is NaN on a field, or "0" if not.
$isZero(set|group) -> collection
Returns "1" if the row is 0 on a field, or "0" if not.
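Both test operators produce a string column of "1" and "0" flags. A minimal Python sketch of that behavior, with hypothetical function names, might look like this:

```python
import math


def is_nan(values):
    """Flag rows whose value is NaN with "1", all other rows with "0"."""
    return ["1" if math.isnan(v) else "0" for v in values]


def is_zero(values):
    """Flag rows whose value equals 0 with "1", all other rows with "0"."""
    return ["1" if v == 0 else "0" for v in values]


column = [0.0, 1.5, math.nan]
print(is_nan(column))
print(is_zero(column))
```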
$zeroFill(group) -> group
Maps outliers (±Infinity|NaN) of the row on a field to 0.
$meanFill(group) -> group
Maps outliers (±Infinity|NaN) for this row on a field to the mean of non-outliers.
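The two fill operators differ only in the replacement value. The following Python sketch illustrates the idea. It is not RATH's implementation, the function names are hypothetical, and `mean_fill` assumes the column has at least one finite value.

```python
import math


def zero_fill(values):
    """Replace NaN and ±Infinity with 0."""
    return [v if math.isfinite(v) else 0.0 for v in values]


def mean_fill(values):
    """Replace NaN and ±Infinity with the mean of the finite values."""
    finite = [v for v in values if math.isfinite(v)]
    mean = sum(finite) / len(finite)  # assumes at least one finite value
    return [v if math.isfinite(v) else mean for v in values]


column = [1.0, math.nan, 3.0, math.inf]
print(zero_fill(column))
print(mean_fill(column))
```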
$nearestClip(group, JS.number, JS.number) -> group
Takes a range. Any value of the field that falls outside this range is treated as an outlier and mapped to the nearest boundary value.
$meanClip(group, JS.number, JS.number) -> group
Takes a range. Any value of the field that falls outside this range is treated as an outlier and mapped to the mean of the non-outlier values.
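To make the difference between the two range-based clipping operators concrete, here is a Python sketch. It assumes the two numeric arguments are the lower and then the upper bound, and that the bounds themselves count as inside the range. Both points are assumptions, and the function names are hypothetical.

```python
def nearest_clip(values, lower, upper):
    """Map values outside [lower, upper] to the nearest boundary."""
    return [min(max(v, lower), upper) for v in values]


def mean_clip(values, lower, upper):
    """Map values outside [lower, upper] to the mean of the values inside it."""
    inside = [v for v in values if lower <= v <= upper]
    mean = sum(inside) / len(inside)  # assumes at least one value is in range
    return [v if lower <= v <= upper else mean for v in values]


column = [-5, 2, 4, 12]
print(nearest_clip(column, 0, 10))
print(mean_clip(column, 0, 10))
```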
$boxClip(group) -> group
Use the boxplot statistics to mark outliers in a field and replace them with NaN.
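The text does not say which boxplot statistics are used. A common convention treats values more than 1.5 times the interquartile range (IQR) beyond the first or third quartile as outliers. The Python sketch below follows that convention as an assumption. The function name, the factor `k`, and the quartile method are illustrative choices, not RATH's documented behavior.

```python
import math
import statistics


def box_clip(values, k=1.5):
    """Replace values outside the boxplot fences with NaN.

    Assumption: fences are Q1 - k*IQR and Q3 + k*IQR with k = 1.5.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v if lower <= v <= upper else math.nan for v in values]


print(box_clip([1, 2, 3, 4, 5, 100]))
```

Because the outliers become NaN, a fill operator such as $zeroFill or $meanFill can be applied afterwards to replace them.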
$concat(collection, ...collection) -> collection,
$concat(JS.string, collection, ...collection) -> collection
Concatenate the contents of several string columns sequentially with a delimiter (
, by default) as a new column.
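Row by row, the concatenation joins the values of each string column with the delimiter. A minimal Python sketch with a hypothetical function name follows. The delimiter is passed explicitly here, because the sketch makes no claim about the default.

```python
def concat(delimiter, *columns):
    """Join the values of several string columns, row by row."""
    return [delimiter.join(row) for row in zip(*columns)]


city = ["Paris", "Lima"]
country = ["FR", "PE"]
print(concat("-", city, country))
```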
This operator introduces a special kind of DateTime object and it returns an instance of that object.
In this case, we attempt to transform the casual column from the Bike Sharing Demo Database with nonlinear mapping.
Retrieve DateTime information
We can retrieve the DateTime information from the Bike Sharing Demo Database, where _c_4 is the year and _c_1 is the month.
Next, we can calculate the