Monitoring usage and system resources

Provides operational SQL queries and data extraction instructions

This guide helps Tellius admins track system usage, monitor dataset growth, audit user activity, and optimize platform performance. By running the provided SQL queries via a secure Postgres data source, you can create reusable datasets and dashboards that offer visibility into how Tellius is being used across your organization.

These queries require access to the system metadata database. Contact your Tellius admin or support team to set up a secure, validated Postgres connection.

Connecting to the metadata database

To run the queries in this guide, you’ll need to create a Postgres data source that connects to your Tellius instance’s internal metadata store.

  1. Under Data → Connect, choose Postgres as the connection type.

  2. Provide the host, port, database name, username, and password (your Tellius admin or support team will supply these).

  3. Save and test the connection.

  4. Once connected, you can use the Custom SQL option in the dataset creation flow to execute the queries below.
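
Before building full datasets, a minimal Custom SQL query is a quick way to confirm the connection can read the metadata tables. The sketch below simply counts registered users (the users table is described in the schema overview further down); adapt it as needed.

-- Quick sanity check: confirm the metadata connection works by counting users.
SELECT COUNT(*) AS total_users
FROM users;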

Setting up the metadata connection

To run system-level queries (usage tracking, dataset statistics, user activity, and so on), you'll need to connect to Tellius' internal metadata databases using a Postgres data source.

Databases you may connect to:

  • middleware_prod – Stores core platform metadata such as users, datasets, Business Views, and configurations.

  • middleware_tracking_prod – (Optional) May contain event tracking or usage-related metadata depending on deployment.

  • usage_tracker – Logs detailed activity, such as search queries, refresh events, Vizpad usage, and user logins.

Metadata schema overview

Once connected, you can query internal tables that define user access, roles, and group mappings in your Tellius environment.

  • users – Stores all registered users and their metadata (e.g., name, email, roles)

  • groups – Represents logical user groups used for access control

  • users_groups – Maps which users belong to which groups
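
As a starting point, the sketch below joins these three tables to list each user's group memberships. The linking column names (user_id and group_id) are assumptions; verify them against your deployment before relying on the results.

-- Sketch: list each user together with the groups they belong to.
-- Assumes users_groups links users and groups via user_id and group_id columns.
SELECT
  u.username,
  g.name AS group_name
FROM users u
JOIN users_groups ug ON ug.user_id = u.id
JOIN groups g ON g.id = ug.group_id
ORDER BY u.username, g.name;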

Monitoring datasets

This query lists each dataset in the system with key metadata such as name, owner, number of rows and columns, and size (in kilobytes).

SELECT
  dm."datasetName" AS DatasetName,
  us.username AS OwnerUsername,
  (dm.current->>'size')::bigint/1024 AS SizeKB,
  jsonb_array_length(dm.current->'columns') AS NoOfColumns,
  (dm.current->>'numberOfRows')::bigint AS NoOfRows,
  dm.inserted_at AS Created,
  dm.updated_at AS Updated
FROM dataset_metadata dm
JOIN users us ON dm.owner_id = us.id;

Use this to identify:

  • Large or old datasets that may be archived or refreshed

  • Ownership and accountability of data assets

  • Data growth trends over time
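
For example, to surface the large or stale datasets mentioned in the first point, you can filter and sort the same tables. The thresholds below are illustrative, and the sketch assumes the size field is stored in bytes (as the SizeKB calculation above implies).

-- Sketch: datasets larger than ~1 GB or not updated in the last 90 days.
-- Adjust the thresholds to match your environment.
SELECT
  dm."datasetName" AS DatasetName,
  us.username AS OwnerUsername,
  (dm.current->>'size')::bigint/1024 AS SizeKB,
  dm.updated_at AS Updated
FROM dataset_metadata dm
JOIN users us ON dm.owner_id = us.id
WHERE (dm.current->>'size')::bigint > 1024*1024*1024
   OR dm.updated_at < NOW() - INTERVAL '90 days'
ORDER BY SizeKB DESC;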

Dataset size estimation in GB

This more advanced query estimates total dataset size by calculating per-row memory cost based on data types, then multiplying by row count.

-- Estimates dataset size in bytes and gigabytes.
-- Identifies if the dataset is live (in-database) or in-memory.

SELECT *, 
       (row_count * row_size) AS total_size_bytes,
       (row_count * row_size)/1024.0/1024.0/1024.0 AS total_size_gb
FROM (
  SELECT 
    dm."datasetId" AS id,
    dm."datasetName" AS name,
    us.username AS owner,
    SUM(CASE
      WHEN type = 'boolean' THEN 1
      WHEN type = 'short' THEN 1
      WHEN type = 'integer' THEN 2
      WHEN type = 'float' THEN 4
      WHEN type = 'long' THEN 8
      WHEN type = 'double' THEN 8
      WHEN type = 'date' THEN 12
      WHEN type = 'timestamp' THEN 15
      WHEN type = 'string' THEN 15
      ELSE 0
    END) AS row_size,
    ((dm.current->>'numberOfRows')::bigint) AS row_count,
    CASE WHEN "additionalParams"->>'indb' = 'true' THEN 'live' ELSE 'in-memory' END AS dataset_type
  FROM dataset_metadata dm
  JOIN users us ON dm.owner_id = us.id,
       LATERAL jsonb_array_elements_text(current->'columns') AS json_value,
       LATERAL (SELECT json_value::jsonb->'typeStats'->>'mainType' AS type) AS types
  GROUP BY dm."datasetId", dm."datasetName", us.username, dm.current, "additionalParams"
) a;
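
If you only need a rough total per owner rather than a per-dataset estimate, a simpler rollup over the stored size field may be enough. This sketch assumes that field is in bytes, as in the SizeKB query earlier in this guide.

-- Sketch: approximate data volume and dataset count per owner.
SELECT
  us.username AS owner,
  COUNT(*) AS datasets,
  SUM((dm.current->>'size')::bigint)/1024.0/1024.0/1024.0 AS total_gb
FROM dataset_metadata dm
JOIN users us ON dm.owner_id = us.id
GROUP BY us.username
ORDER BY total_gb DESC;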

User activity monitoring

Lists Tellius users along with their email addresses and names. Use this to track platform adoption, audit access, or manage licenses.

SELECT 
  username AS "UserName",
  email AS "Email",
  "firstName" AS "FirstName",
  "lastName" AS "LastName"
FROM users;
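
To support license reviews, you can also look for accounts with no recorded logins. The sketch below relies on the usage_tracking table described in the daily usage section further down and assumes it is reachable from the same connection as the users table.

-- Sketch: users with no recorded successful login in usage_tracking.
SELECT
  u.username,
  u.email
FROM users u
WHERE NOT EXISTS (
  SELECT 1
  FROM usage_tracking ut
  WHERE ut.user_id = u.id
    AND ut.resource_type = 'user'
    AND ut.operation_type = 'login_success'
);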

Monitor usage of natural language search by tracking what queries users submit and when. Combine with user logs to understand adoption of search and identify training opportunities.

SELECT
  q.query,
  q.time,
  u.username
FROM query_history q
JOIN users u ON q.userid = u.id;
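
To see who is actually driving search adoption, aggregate the same tables per user. The 30-day window below is just an example, and the sketch assumes the time column is a timestamp.

-- Sketch: search volume per user over the last 30 days.
SELECT
  u.username,
  COUNT(*) AS searches
FROM query_history q
JOIN users u ON q.userid = u.id
WHERE q.time >= NOW() - INTERVAL '30 days'
GROUP BY u.username
ORDER BY searches DESC;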

Daily usage analytics

This query aggregates daily activity across various Tellius features such as Search, Insights, data operations, Vizpads, and user logins. Visualize this in a line chart to observe daily or weekly usage patterns.

SELECT 
  DATE_TRUNC('day', action_at) AS day,
  COUNT(CASE WHEN resource_type = 'insight' THEN 1 END) AS insight_operations,
  COUNT(CASE WHEN resource_type IN ('dataset', 'datasource', 'businessView') THEN 1 END) AS data_operations,
  COUNT(CASE WHEN resource_type = 'search_query' THEN 1 END) AS search_queries,
  COUNT(CASE WHEN resource_type = 'model' THEN 1 END) AS model_operations,
  COUNT(CASE WHEN resource_type = 'vizpad' AND operation_type = 'created' THEN 1 END) AS vizpad_created,
  COUNT(CASE WHEN resource_type = 'vizpad' AND operation_type = 'viewed' THEN 1 END) AS vizpad_viewed,
  COUNT(CASE WHEN resource_type = 'user' AND operation_type = 'login_success' THEN 1 END) AS user_logins,
  COUNT(DISTINCT user_id) AS unique_users
FROM usage_tracking
WHERE action_at >= '2023-01-01'
GROUP BY 1
ORDER BY 1;
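
The same table also supports a weekly view of adoption. The sketch below counts weekly active users; swap the date truncation unit to suit your reporting cadence.

-- Sketch: weekly active users from the same usage_tracking table.
SELECT
  DATE_TRUNC('week', action_at) AS week,
  COUNT(DISTINCT user_id) AS weekly_active_users
FROM usage_tracking
WHERE action_at >= '2023-01-01'
GROUP BY 1
ORDER BY 1;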

Kaiya conversational AI usage

Track how often users engage with Kaiya by counting daily conversations and successful Kaiya responses.

SELECT 
  DATE_TRUNC('day', inserted_at) AS day,
  COUNT(DISTINCT conversation_id) AS conversations,
  COUNT(CASE WHEN sender = 'bot' AND LOWER(status) != 'failure' THEN 1 END) AS kaiya_responses
FROM messages
WHERE inserted_at >= '2023-01-01'
GROUP BY 1
ORDER BY 1;
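
The same messages table can also flag reliability issues. The sketch below compares failed and total Kaiya responses per day, using the same status convention as the query above.

-- Sketch: daily Kaiya response failures vs. total responses.
SELECT
  DATE_TRUNC('day', inserted_at) AS day,
  COUNT(CASE WHEN sender = 'bot' AND LOWER(status) = 'failure' THEN 1 END) AS failed_responses,
  COUNT(CASE WHEN sender = 'bot' THEN 1 END) AS total_responses
FROM messages
WHERE inserted_at >= '2023-01-01'
GROUP BY 1
ORDER BY 1;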

Monitoring Business Views

Monitor how large each Business View is to optimize performance.

SELECT 
  id,
  name,
  metadata ->> 'length' AS bv_size,
  us.username AS owner_username
FROM business_views bvs
JOIN users us ON bvs.owner_id = us.id;
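
To focus on the heaviest Business Views first, you can sort by that size field. The sketch below assumes the length value is numeric; its unit can vary by deployment, so treat it as a relative measure.

-- Sketch: the 20 largest Business Views by the stored length field.
SELECT
  bvs.id,
  bvs.name,
  (bvs.metadata->>'length')::bigint AS bv_size,
  us.username AS owner_username
FROM business_views bvs
JOIN users us ON bvs.owner_id = us.id
WHERE bvs.metadata->>'length' IS NOT NULL
ORDER BY bv_size DESC
LIMIT 20;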

Understand which Business Views are shared with which groups. Use this to audit access control and sharing practices.

SELECT 
  business_view_id,
  "sharedWithKind",
  "objectKind",
  gps.name AS group_name
FROM sharings shs
JOIN groups gps ON shs.shared_with_group_id = gps.id
WHERE "objectKind" = 'businessView' AND "sharedWithKind" = 'group';

Business View ↔ Vizpad mapping

This query maps each Vizpad to the Business Views it relies on and counts how many visualizations (charts, KPIs, etc.) each Vizpad contains. Use it for:

  • Impact analysis before modifying or deleting a Business View

  • Design optimization (e.g., identifying Vizpads that use large or outdated views)

  • Auditing dependencies for dashboard maintenance

SELECT 
  bv.name AS businessview_name,
  vp.title AS vizpad_title,
  COALESCE(vc.viz_count, 0) AS viz_count,
  bv.id AS business_view_id,
  vp.id AS vizpad_id
FROM vizpads vp
JOIN business_views bv 
  ON (bv.id = ANY(vp.business_view_ids) OR bv.id = vp.business_view_id)
LEFT JOIN (
  SELECT vt.vizpad_id, COUNT(*) AS viz_count
  FROM viz_tab vt 
  JOIN viz_item vi ON vt.id = vi.tab_id
  GROUP BY vt.vizpad_id
) vc ON vp.id = vc.vizpad_id;
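
The inverse view is also useful: Business Views that no Vizpad references are candidates for review or cleanup. The sketch below reuses the same link columns as the mapping query above.

-- Sketch: Business Views not referenced by any Vizpad.
SELECT
  bv.id,
  bv.name
FROM business_views bv
WHERE NOT EXISTS (
  SELECT 1
  FROM vizpads vp
  WHERE bv.id = ANY(vp.business_view_ids) OR bv.id = vp.business_view_id
);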

To merge Business View size and sharing details for offline audit or reporting, use the provided Python script to join exported .csv files and calculate sizes in GB. Useful for access control reviews or cleaning up unused assets.

Snowflake data source metadata

Track how Snowflake is used in your environment, including:

  • How many datasets are built on each connection

  • Who owns the connections

  • Which auth methods are in use

-- Connections with no matching datasets report 0 in no_of_datasets_loaded.
SELECT
  ds."datasourceId" AS datasource_id,
  ds.name AS name,
  ds.inserted_at AS creation_time,
  MAX(us.email) AS owner_email,
  COUNT(dm."datasetId") AS no_of_datasets_loaded,
  options->>'sfURL' AS url,
  options->>'sfUser' AS user,
  options->>'sfSchema' AS schema,
  options->>'sfDatabase' AS database,
  options->>'sfRole' AS role,
  options->>'auth' AS auth_method
FROM datasources ds
LEFT JOIN dataset_metadata dm ON ds."datasourceId" = dm.datasource_id
JOIN users us ON ds.owner_id = us.id
WHERE "sourceType" = 'snowflake'
GROUP BY ds."datasourceId";
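
Connections that have no datasets built on them are often candidates for cleanup. The sketch below reuses the same join to list unused Snowflake connections.

-- Sketch: Snowflake connections with no datasets loaded from them.
SELECT
  ds."datasourceId" AS datasource_id,
  ds.name,
  us.email AS owner_email
FROM datasources ds
JOIN users us ON ds.owner_id = us.id
LEFT JOIN dataset_metadata dm ON ds."datasourceId" = dm.datasource_id
WHERE ds."sourceType" = 'snowflake'
  AND dm.datasource_id IS NULL;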

Scheduled refresh monitoring

Monitor refresh jobs across datasets, models, Vizpads, and Insights.

SELECT 
  sc.id,
  sc.name,
  sc.frequency,
  CASE sc.frequencyunit
    WHEN 'm' THEN 'Minutes'
    WHEN 'h' THEN 'Hours'
    WHEN 'd' THEN 'Days'
    WHEN 'w' THEN 'Weeks'
    WHEN 'M' THEN 'Months'
    ELSE sc.frequencyunit
  END AS frequency_unit,
  sc.object->>'objectType' AS object_type,
  COALESCE(vp.title, dm."datasetName", mod."modelId", ds.name, ins.name) AS object_name,
  us.username AS owner_name
FROM schedules sc
JOIN users us ON sc.owner_id = us.id
LEFT JOIN vizpads vp ON vp.id = sc.vizpad_id
LEFT JOIN dataset_metadata dm ON dm."datasetId" = sc.dataset_id
LEFT JOIN models mod ON mod."modelId" = sc.model_id
LEFT JOIN datasources ds ON ds."datasourceId" = sc.datasource_id
LEFT JOIN insights ins ON ins."insightId" = sc.insight_id;
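
To see where scheduling load is concentrated, a simple rollup of the same schedules table by object type and owner works as a starting point.

-- Sketch: number of scheduled jobs per object type and owner.
SELECT
  sc.object->>'objectType' AS object_type,
  us.username AS owner_name,
  COUNT(*) AS schedules
FROM schedules sc
JOIN users us ON sc.owner_id = us.id
GROUP BY 1, 2
ORDER BY schedules DESC;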

Estimating cluster resource handling capacity

The amount of data a Tellius cluster can handle depends on two factors:

  1. Whether cluster autoscaling is enabled or disabled

  2. The number of allocated CPU cores in the cluster

The following formulas help estimate how much data your cluster can process and store efficiently.

  • numberOfCores: Total number of CPU cores in your Tellius cluster.

  • 3.125: The baseline GB-per-core multiplier; with autoscaling enabled, each core is assumed to handle about 3.125 GB of data.

🚫 Cluster autoscaling disabled

When autoscaling is turned off, the system cannot automatically add or remove resources based on workload. You must size the cluster manually to meet expected demand.

Ideal data capacity:

Ideal Capacity (GB) = 3.125 × 1.20 × numberOfCores

The 1.20 factor accounts for additional memory overhead used by background operations and system processes.

For maximum supported data with degradation:

Max Capacity (GB) = 3.125 × 1.20 × 1.25 × numberOfCores

The 1.25 multiplier allows for up to 25% more data to be handled, though performance (e.g., response time or refresh speed) may slightly degrade.

For a cluster with 64 cores:

  • Ideal capacity = 3.125 × 1.20 × 64 = 240 GB

  • Max capacity (with some degradation) = 3.125 × 1.20 × 1.25 × 64 = 300 GB

✅ Cluster autoscaling enabled

When autoscaling is enabled, the cluster can dynamically adjust resources based on demand. In this setup, system overhead is already managed, so the formula is simpler.

Ideal data capacity:

Ideal Capacity (GB) = 3.125 × numberOfCores

For maximum supported data with degradation:

Max Capacity (GB) = 3.125 × 1.25 × numberOfCores

These values are estimates, not strict limits. Actual performance may vary based on:

  • Data format (CSV vs. Parquet)

  • Number of columns

  • Query complexity

  • Number of concurrent users

Cluster sizing reference table

Here’s a simple comparison to help you estimate how much data your cluster can handle, depending on whether autoscaling is enabled or not:

# Cores    Autoscaling    Ideal capacity    Max capacity (with 25% buffer)
32         Enabled        100 GB            125 GB
32         Disabled       120 GB            150 GB
64         Enabled        200 GB            250 GB
64         Disabled       240 GB            300 GB
128        Enabled        400 GB            500 GB
128        Disabled       480 GB            600 GB

Calculated columns

This query shows all custom formulas used in Business Views.

SELECT DISTINCT "formulaType", formula 
FROM business_views_columns 
WHERE formula IS NOT NULL;
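
To gauge how heavily each formula type is used, a count over the same table is a reasonable follow-up.

-- Sketch: number of distinct calculated-column formulas per formula type.
SELECT
  "formulaType",
  COUNT(DISTINCT formula) AS formulas
FROM business_views_columns
WHERE formula IS NOT NULL
GROUP BY "formulaType"
ORDER BY formulas DESC;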
