How much memory does your data use in the ICE?
Plan for the ICE in-memory cache to be roughly half of your uncompressed CSV size, and add headroom for transforms, AutoML, Insights, and refreshes.
What “data size” means in our guidelines
For capacity planning, use the size of your data as uncompressed CSV as the baseline. This gives a consistent, easy-to-compare reference across sources such as Snowflake, Redshift, Parquet, or compressed files.
Dimension environments from the uncompressed CSV size, not the storage-optimized size reported by source systems.
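If you only know the compressed or warehouse-reported size, one way to approximate the uncompressed CSV baseline is to extrapolate from a sample. The sketch below is a minimal example, assuming you can pull a representative sample of rows into pandas; the query and connection are placeholders, not a Tellius API.

```python
# Sketch: estimate the uncompressed-CSV baseline from a representative sample.
# Assumes a pandas sample is available; names below are illustrative only.
import io
import pandas as pd

def estimate_csv_baseline_bytes(sample: pd.DataFrame, total_rows: int) -> int:
    """Extrapolate uncompressed CSV size from the sample's bytes-per-row."""
    buffer = io.StringIO()
    sample.to_csv(buffer, index=False)                 # serialize the sample as CSV
    sample_bytes = len(buffer.getvalue().encode("utf-8"))
    bytes_per_row = sample_bytes / max(len(sample), 1)
    return int(bytes_per_row * total_rows)

# Example (hypothetical): a 100,000-row sample extrapolated to 1.2 billion rows.
# sample_df = pd.read_sql("SELECT * FROM sales LIMIT 100000", connection)
# print(estimate_csv_baseline_bytes(sample_df, total_rows=1_200_000_000) / 1e9, "GB")
```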
Hot vs. cold compression: different goals
The ICE keeps frequently accessed data hot for interactive analytics. Data is stored in a vectorized columnar format and compressed with fast in-memory compression to deliver low-latency queries.
Cold storage (for example, archived files on disk) often uses heavier compression to minimize size at rest. That approach optimizes for minimum disk, not interactive speed. The ICE optimizes for responsiveness, which is what users experience in Tellius.
A practical rule of thumb for in-memory footprint
Plan for the ICE in-memory cache to use about 45%–60% of the uncompressed CSV size. A typical midpoint for many analytic datasets is ~50% of CSV. This is an estimate; actual results vary by schema and data characteristics.
Example
CSV baseline: 200 GB
Expected ICE in-memory footprint: ~90–120 GB
An outcome of ~100 GB, about 50% of the CSV baseline, is normal and within expectations.
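As a quick sanity check, the example above can be expressed as a short calculation. The sketch below simply applies the 45%–60% rule of thumb; the ratios are planning estimates, not guarantees.

```python
# Sketch: turn the uncompressed-CSV baseline into a planning range using the
# 45%-60% rule of thumb described above.
def ice_footprint_estimate_gb(csv_gb: float, low: float = 0.45,
                              mid: float = 0.50, high: float = 0.60) -> dict:
    return {
        "low_gb": csv_gb * low,        # optimistic end of the range
        "typical_gb": csv_gb * mid,    # common midpoint for analytic datasets
        "high_gb": csv_gb * high,      # conservative end of the range
    }

# 200 GB CSV baseline -> {'low_gb': 90.0, 'typical_gb': 100.0, 'high_gb': 120.0}
print(ice_footprint_estimate_gb(200))
```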
Why footprints vary
Actual memory use depends on the shape of your data. The lists below summarize common patterns, and the profiling sketch after them shows one way to check your own columns.
Often smaller footprints:
Integer columns with small ranges.
Low-cardinality categorical strings (codes or enums).
Columns with many repeated values or nulls.
Often larger footprints:
High-cardinality free-text strings and UUIDs.
Many variable-length string columns.
Very high-precision numeric columns with mostly unique values.
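To anticipate which end of the range a dataset will land on, you can profile a sample before loading. The sketch below assumes a pandas sample; how you interpret the resulting ratios is a judgment call, not a Tellius-defined threshold.

```python
# Sketch: profile a sample to spot columns likely to compress poorly in memory.
import pandas as pd

def column_profile(sample: pd.DataFrame) -> pd.DataFrame:
    rows = len(sample)
    return pd.DataFrame({
        "dtype": sample.dtypes.astype(str),
        "distinct_ratio": sample.nunique() / max(rows, 1),  # near 1.0 = high cardinality
        "null_ratio": sample.isna().mean(),                 # many nulls often compress well
    }).sort_values("distinct_ratio", ascending=False)

# String columns with a high distinct_ratio (UUIDs, free text) tend toward the
# larger end of the footprint range; low-cardinality codes tend toward the smaller end.
# print(column_profile(sample_df))
```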
Plan headroom for processing
Interactive analytics is one part of the picture. Transformations (ETL), AutoML, Insights, and refresh operations use additional working memory. During a full (non-incremental) refresh, Tellius may temporarily hold old and new copies while transformations apply. For reliable operations, provision extra headroom above the steady-state cache estimate.
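To make the headroom concrete, the sketch below combines the steady-state cache estimate with a temporary second copy during a full refresh plus an extra processing buffer. The copy count and the 30% buffer are illustrative assumptions for planning, not published sizing constants.

```python
# Sketch: rough peak-memory planning for a full (non-incremental) refresh, where
# old and new copies of the dataset may briefly coexist. The factors here are
# assumptions for illustration, not Tellius sizing constants.
def planning_memory_gb(csv_gb: float, cache_ratio: float = 0.50,
                       refresh_copies: int = 2,
                       processing_headroom: float = 0.30) -> float:
    steady_state = csv_gb * cache_ratio              # in-memory cache estimate
    refresh_peak = steady_state * refresh_copies     # old + new copy during full refresh
    return refresh_peak * (1 + processing_headroom)  # buffer for ETL/AutoML/Insights

# 200 GB CSV baseline -> plan for roughly 260 GB of working memory at peak.
print(round(planning_memory_gb(200)))
```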
How this relates to the Fast Query Engine (FQE)
The FQE stores denormalized Business Views on disk for fast query latency. Disk compression there is effective, and the storage layout is optimized for query speed. Because Business Views are denormalized for performance, their on-disk size is not directly comparable to the CSV baseline or to the in-memory ICE cache. Each layer is tuned for its purpose.
FAQs
Is it normal that my dataset looks larger in ICE than it does in compressed storage? Yes. Compressed storage optimizes for minimum size at rest. ICE optimizes for fast, interactive analytics, which requires a different balance between speed and compactness.
Can I get an exact footprint before loading? The best pre-load estimate is CSV size × 0.45–0.60. After first load, you can confirm the precise footprint for your data.
Do added columns change the estimate? Yes. New columns, especially high-cardinality text, increase the in-memory footprint. Factor this in when planning growth.