
ยฉ 2025 Tellius

On this page
  • Pick Python if:
  • Creating and applying Python code
  • Editing Python code

Was this helpful?

Export as PDF
  1. Data
  2. Preparing your datasets
  3. List of icons and their actions

Python Transform

Create, edit, and apply Python code transformations to your dataset

Python (whether PySpark or Pandas) is more flexible than SQL for applying complex business rules, iterative or row-level manipulations, and advanced text processing. You also get access to Python libraries for machine learning, data wrangling, and NLP; for instance, you might import sklearn for classification or re for regex-based text cleansing. Python is ideal for:

  • Advanced data science, feature engineering, custom ML transformations, or unusual data-cleaning logic.

  • If you need loops, complex conditionals, or string manipulations that are easier to write in Python than SQL.

  • If you use PySpark, transformations can run in a distributed environment for very large datasets.
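For instance, the regex-based text cleansing mentioned above might look like the following sketch (Pandas framework; the phone_number column is a hypothetical example, not a column from your dataset):

```python
import re

# Hypothetical transform: normalize a phone_number column by
# stripping every non-digit character. Column name is illustrative.
def transform(dataframe):
        dataframe['phone_number'] = dataframe['phone_number'].map(
                lambda value: re.sub(r'\D', '', str(value))
        )
        return dataframe
```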

Tellius provides the Python option so you can:

  • Cleanse your data of invalid, missing, or inaccurate values

  • Modify your dataset according to your business goals and analysis

  • Enhance your dataset as needed with data from other datasets

Pick Python if:

  • You need advanced logic thatโ€™s awkward in SQLโ€”like heavy string manipulation, complex conditionals, or specialized data-science libraries.

  • Youโ€™re comfortable coding in Python and want direct access to packages (e.g., Pandas, PySpark, NumPy).

  • You have iterative or row-by-row transformations that donโ€™t translate neatly into SQL statements.
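As an illustration of conditional logic that is clumsy to express in SQL, the sketch below assigns a spend tier row by row (Pandas framework; total_spend and tier are hypothetical names, not from the Tellius docs):

```python
# Hypothetical example: label each row with a spend tier.
# Nested conditionals like this are easier in Python than in SQL CASE chains.
def transform(dataframe):
        def tier(amount):
                if amount >= 1000:
                        return 'gold'
                if amount >= 100:
                        return 'silver'
                return 'bronze'
        dataframe['tier'] = dataframe['total_spend'].map(tier)
        return dataframe
```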

Here are some examples to help you get started:

def transform(dataframe):
        # use 8 spaces for indentation
        # keep only rows where Payment_Type is Visa
        resultDataframe = dataframe.where(dataframe['Payment_Type'] == 'Visa')
        return resultDataframe

def transform(dataframe):
        # keep only rows where workclass is Private
        resultDataframe = dataframe.where(dataframe['workclass'] == 'Private')
        return resultDataframe

def transform(dataframe):
        # add a Total column that mirrors Qty_Sold
        resultDataframe = dataframe.withColumn('Total', dataframe.Qty_Sold)
        return resultDataframe

Creating and applying Python code

  1. Navigate Data โ†’ Prepare โ†’ Data.

  2. Select the required dataset and click on Edit.

  3. Above Data Pipeline, click on the Python option.

  1. To view the list of columns available in the selected dataset, click on Column List tab.

  1. Select the required Python framework: Pyspark or Pandas.

  • When working with datasets too large to fit into memory on a single machine.

  • If your data processing needs to be parallelized across multiple nodes for performance.

  • For processing cluster-based workloads stored in distributed environments (e.g., Hadoop, AWS S3, or large data warehouses).

  • Ideal for operations on terabytes/petabytes of data.

  • For small to medium data. When your dataset fits into memory on a single machine.

  • For quick, iterative data exploration and manipulation.

  • Simpler syntax and user-friendly APIs for data cleaning, transformation, and visualization.

  • Ideal for non-distributed workloads where performance isnโ€™t a concern.

  1. To create new code, click on Create New or Write code yourself button.

  2. Alternatively, click on Generate with Kaiya button to make Tellius Kaiya generate the required for you.

  3. Once the code is ready, click on Run Validation to validate the code. When the validation is in process, the Running Validation message is displayed.

  4. Tellius validates the entered query, and if any errors are found, they will be displayed in the bottom section of the window.

  5. If the code is correct, the validation result is shown with a Successfully Validated message at the top.

  6. After clearing the errors, click on Apply to apply the code to the dataset or click on Save in Library to save to the code library in the left pane. Or, click on Cancel to discard the code window.

From v4.2, users can apply the code to the dataset without saving it to the code library first.

Editing Python code

  1. In the Python code window, search for and select the required code from the existing Code Library.

  2. Click on Edit to modify the code.

  3. Click on Run Validation to validate the code. While validation is in progress, a Running Validation message is displayed.

  4. Tellius validates the entered code; if any errors are found, they are displayed in the bottom section of the window.

  5. If the code is correct, a Successfully Validated message is shown at the top.

  6. Click on the Apply button to apply the Python code to the dataset.

  7. Click on Update to save your changes to the existing code, or click on Save as New to save the modified version as a new entry in the code library.

The following libraries have been removed and cannot be imported into Python during data preparation. Importing any of them results in a Validation failed error:

  • shlex
  • sh
  • plumbum
  • pexpect
  • fabric
  • envoy
  • commands
  • os
  • subprocess
  • requests

