Dataset Scripting
For complex transformations involving multiple datasets
Scripting is designed for technical users who need fine-grained control over how data is merged, transformed, or aggregated, often in scenarios that exceed basic point-and-click merges. It is especially useful in industries such as pharma or finance, where fact tables can have millions of rows. With Scripting, you can:
Write custom SQL (including complex joins, filters, and aggregations).
Combine more than two datasets in a single script. Data Fusion merges only two datasets, whereas Scripting lets you join 3, 5, or 10 tables at once with custom logic.
Use window functions, subqueries, partial aggregates, or user-defined transformations that are not possible in simpler GUI tools.
Real-world scenarios (e.g., multiple dimension lookups, conditional merges, pivoting, etc.) often require more power than a point-and-click interface provides.
While there are memory constraints for extremely large datasets, the scripting engine allows pushdown optimization (where possible) to process large volumes of data on the backend database or cluster.
Once you’ve built a script, you can reuse it or adapt it for scheduled processes, ensuring consistent transformations over time.
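As a rough illustration of combining more than two datasets, the sketch below joins three tables and aggregates the result. The table and column names (sales, products, regions) are hypothetical, and the SQL runs against an in-memory SQLite database through Python only so that the example is self-contained. The SQL dialect accepted by the Tellius editor may differ.

```python
import sqlite3

# Hypothetical tables standing in for three selected datasets.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (sale_id INTEGER, product_id INTEGER, region_id INTEGER, amount REAL);
CREATE TABLE products (product_id INTEGER, product_name TEXT);
CREATE TABLE regions (region_id INTEGER, region_name TEXT);
INSERT INTO sales VALUES
    (1, 10, 100, 250.0),
    (2, 10, 101, 100.0),
    (3, 11, 100, 75.0),
    (4, 11, 100, 25.0);
INSERT INTO products VALUES (10, 'Alpha'), (11, 'Beta');
INSERT INTO regions VALUES (100, 'East'), (101, 'West');
""")

# One fact table joined to two dimension lookups, then aggregated.
query = """
SELECT r.region_name, p.product_name, SUM(s.amount) AS total_amount
FROM sales AS s
JOIN products AS p ON p.product_id = s.product_id
JOIN regions AS r ON r.region_id = s.region_id
GROUP BY r.region_name, p.product_name
ORDER BY r.region_name, p.product_name
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same pattern extends to more tables by adding further JOIN clauses, one per additional dataset.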
Go to Data → Prepare → Script to open the scripting window, which contains the following sections.
Available Datasets: Lists all datasets in the system. You can search or scroll to find the ones you need. Select the required datasets from this list to move them into the Selected Datasets section.
Selected Datasets: Shows which datasets you plan to work with in your script. Click the “X” to remove a dataset. You can select multiple datasets—unlike Data Fusion, which only allows two.
Output Dataset (Name): Provide a name for the resulting dataset. Click the Create Script button, and you will be switched to the Columns tab.
SQL editor (Main panel): Here, you write your SQL script. You might use multiple joins, subqueries, window functions, or partial aggregates. You can refer to the datasets and their columns listed under “Selected Datasets”.
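To give a sense of the kind of SQL you might write in the editor, here is a minimal sketch of a window function and a subquery. The orders table and its columns are hypothetical, and the queries run on an in-memory SQLite database (version 3.25 or later, which supports window functions) through Python purely for illustration. Syntax in the Tellius editor may vary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer TEXT, order_date TEXT, amount REAL);
INSERT INTO orders VALUES
    (1, 'acme',   '2024-01-05', 100.0),
    (2, 'acme',   '2024-01-20', 50.0),
    (3, 'globex', '2024-01-10', 200.0),
    (4, 'acme',   '2024-02-01', 25.0);
""")

# Window function: running total of order amounts per customer.
window_query = """
SELECT customer, order_date, amount,
       SUM(amount) OVER (PARTITION BY customer ORDER BY order_date) AS running_total
FROM orders
ORDER BY customer, order_date
"""
running_rows = conn.execute(window_query).fetchall()

# Subquery: rank each customer's orders, then keep only the most recent one.
latest_query = """
SELECT customer, order_date, amount
FROM (
    SELECT customer, order_date, amount,
           ROW_NUMBER() OVER (PARTITION BY customer ORDER BY order_date DESC) AS rn
    FROM orders
) AS ranked
WHERE rn = 1
ORDER BY customer
"""
latest_rows = conn.execute(latest_query).fetchall()

print(running_rows)
print(latest_rows)
```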
Click on the pen icon near "New Data" to rename the SQL script.
Once the code is ready, click Run Validation to validate it. While validation is in progress, the Running Validation message is displayed.
Tellius validates the entered query, and if any errors are found, they will be displayed in the bottom section of the window.
If the code is correct, the validation result is shown with a Successfully Validated message at the top.
After resolving any errors, click Create Dataset to save and execute your script, producing the final dataset.
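The validate-then-create flow can be approximated locally when drafting a query. The sketch below is not how Tellius performs validation. It assumes an in-memory SQLite database, a hypothetical sales table, and a hypothetical helper named validate_sql that uses EXPLAIN to compile a statement without running it.

```python
import sqlite3


def validate_sql(conn, sql):
    """Return None if SQLite can compile the statement, else the error message.

    Hypothetical helper: EXPLAIN compiles the statement and returns its
    execution plan, so the query itself is not run.
    """
    try:
        conn.execute("EXPLAIN " + sql)
        return None
    except sqlite3.Error as exc:
        return str(exc)


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, amount REAL);
INSERT INTO sales VALUES ('East', 10.0), ('East', 5.0), ('West', 7.0);
""")

# The first query misspells the column name; the second is correct.
bad_query = "SELECT region, SUM(amout) AS total FROM sales GROUP BY region"
good_query = "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"

bad_error = validate_sql(conn, bad_query)
good_error = validate_sql(conn, good_query)
print("bad query:", bad_error)
print("good query:", good_error)

# Only materialize the output table once the query has validated cleanly.
if good_error is None:
    conn.execute("CREATE TABLE sales_by_region AS " + good_query)

output_rows = conn.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()
print(output_rows)
```

Checking the query before materializing it mirrors the order of the steps above: errors surface first, and the output dataset is created only from a query that compiled.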