List of icons and their actions

Descriptions and purpose of each icon found under Data -> Prepare

Below is a concise reference guide for each icon found above the data pipeline.

List of icons

1. Undo

Reverts your latest pipeline change (e.g., removing a column, adding a step).

2. Redo

Reapplies the change most recently reverted by Undo, so the undone step reappears in the pipeline.

3. Revert to Original Version

Discards all pipeline changes and rolls back the entire dataset to its original, pre-transformation state.

4. Export

Displays options to export or write back the current dataset to a target (e.g., CSV, HDFS, Snowflake). For more details, check out this page.

5. Functions

Performs advanced formula-based transformations, subdivided into Indicators and Signatures. For more details, check out this page.

6. SQL

Allows you to write or apply raw SQL transformations in the data pipeline. Ideal for complex joins, subqueries, and pushdown operations that merge multiple datasets or apply advanced filtering/aggregations directly within the database’s query engine. For more details, check out this page.
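As a rough illustration of the kind of query this step is suited to, the sketch below joins two tables and filters on an aggregate. It is only an assumption-laden example: it uses Python's built-in sqlite3 module as a stand-in query engine, and the `orders` and `customers` tables and their columns are hypothetical. The SQL dialect and table names available in your pipeline may differ.

```python
import sqlite3

# SQLite stands in for the database engine here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL);
CREATE TABLE customers (id INTEGER, country TEXT);
INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0);
INSERT INTO customers VALUES (1, 'FR'), (2, 'DE');
""")

# Join two datasets, aggregate per country, and filter on the aggregate.
query = """
SELECT c.country, SUM(o.amount) AS total
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
GROUP BY c.country
HAVING SUM(o.amount) > 80
ORDER BY c.country
"""
totals = conn.execute(query).fetchall()
conn.close()
```

Because the join, grouping, and filtering are all expressed in one statement, the engine can run them together rather than moving intermediate rows out of the database.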

7. Python

Allows you to write or apply Python script-based transformations (via PySpark or Pandas) in the data pipeline. Typical uses include sophisticated data wrangling, machine learning feature prep, and text processing. Use PySpark for large-scale distributed data or Pandas for smaller local data. For more details, check out this page.
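A minimal Pandas sketch of the kind of transformation such a step might contain is shown below. How the step actually receives and returns the dataset is not described here, so the `transform` function, the sample data, and the column names are all hypothetical.

```python
import pandas as pd


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical step body: clean a text column and derive a feature."""
    out = df.copy()  # leave the input untouched
    # Text processing: trim whitespace and lowercase the values.
    out["city"] = out["city"].str.strip().str.lower()
    # Feature prep: derive a new column from two numeric columns.
    out["revenue"] = out["price"] * out["quantity"]
    return out


# Hypothetical sample data standing in for the pipeline's dataset.
sample = pd.DataFrame({
    "city": [" Paris", "LYON ", "paris"],
    "price": [10.0, 20.0, 5.0],
    "quantity": [2, 1, 4],
})
result = transform(sample)
```

The same logic could be written with PySpark DataFrame operations when the dataset is too large to process on a single machine.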

8. Aggregate

Summarizes or groups data by performing aggregations (COUNT, SUM, AVERAGE, etc.). It has the following tabs:

  • Standard: Basic grouping by field, plus a date/time column if needed.

  • Time-Series: Specifically for date/time-based grouping, with a “Resolution” such as daily or weekly.

  • Pivot: Create a pivot table by selecting row/column dimensions and an aggregation measure.
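To make the difference between the three tabs concrete, here is a rough Pandas equivalent of each. This is an illustration only, not how the tool computes its results, and the `sales` data and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical dataset used for all three examples.
sales = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02"]
    ),
    "region": ["East", "West", "East", "West"],
    "product": ["A", "A", "B", "A"],
    "amount": [100, 50, 70, 30],
})

# Standard: group by a field and apply an aggregation (SUM here).
standard = sales.groupby("region")["amount"].sum()

# Time-Series: group by the date column at a daily resolution.
time_series = sales.groupby(pd.Grouper(key="date", freq="D"))["amount"].sum()

# Pivot: row dimension, column dimension, and an aggregation measure.
pivot = sales.pivot_table(
    index="region",
    columns="product",
    values="amount",
    aggfunc="sum",
    fill_value=0,  # show 0 where a region/product pair has no rows
)
```

Changing `freq="D"` to `freq="W"` would correspond to switching the resolution from daily to weekly.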

9. Advanced Filter

Build complex multi-condition filters for the dataset pipeline. Choose the required column and the action to be applied. Use "+" to chain multiple conditions for more nuanced data curation. Once the filters are applied, only rows matching these conditions will remain in your pipeline.
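The chaining behaviour can be sketched in plain Python as follows. The sketch assumes that chained conditions are combined with AND, so a row must satisfy every condition to remain. The `apply_filters` helper, the operator table, and the sample rows are hypothetical and not part of the product.

```python
import operator

# Map of comparison symbols to functions (illustrative subset of actions).
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def apply_filters(rows, conditions):
    """Keep only rows that satisfy every (column, op, value) condition."""
    return [
        row
        for row in rows
        if all(OPS[op](row[col], value) for col, op, value in conditions)
    ]


rows = [
    {"country": "FR", "amount": 120},
    {"country": "FR", "amount": 40},
    {"country": "DE", "amount": 300},
]

# Two chained conditions, as if added with "+" in the filter window.
conditions = [("country", "==", "FR"), ("amount", ">", 100)]
kept = apply_filters(rows, conditions)
```

Each additional condition can only narrow the result further, which is why the order in which conditions are added does not change which rows remain.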

10. Hierarchies

Defines hierarchical relationships and lets you organize dimension columns into logical levels, such as Country → State → City. For more details, check out this page.
