How to include/exclude columns?
Step-by-step guide to include/exclude columns when creating Insights
On the Column Selection page (the third step in configuring an Insight), all available columns are displayed under the "Included" section. Columns that were automatically excluded are displayed under the "Excluded" section. Any column can be moved between these sections as required.
The number of columns in the Included section will be displayed.
You can search for any required column using the search bar.
Click on Select All if you want to select all the columns in the Included section.
In the Included section, select the columns you want to exclude. As you select, the number of selected columns is highlighted at the bottom.
Click on the "->" button to move the selected columns to the Excluded section.
As soon as you move the columns, the following prompt message will be displayed at the bottom.
If you want to move columns from the Excluded to Included section, select the required columns and click on the "<-" button.
If you hover over any individual column, the arrows "->" and "<-" will appear, which can be used to move the columns accordingly.
The number of columns in the Excluded section will be displayed.
The columns excluded will be displayed under the category "Excluded by user during Insight creation".
After including all the required columns, click on Create to generate the Insight, or click on Back to return to the previous step.
Once you click on Create, the Insight generation will begin and the following screen will be displayed.
The Insight will run in the background, and its status can be checked under Notifications -> Insights tab.
Columns in the Excluded section may have been excluded by Tellius or by users for a variety of reasons, which are categorized for clarity and transparency. Here's the list of all possible categories and the reason for each exclusion:
Excluded since it’s chosen as target column
Columns designated as the target (the outcome or dependent variable that Insight is trying to explain or predict) are excluded from the independent variables to prevent circular reasoning.
Example: In a sales analysis where "Monthly Revenue" is the target, this column is excluded to avoid using revenue to predict revenue.
Excluded since it’s chosen as cohort column
A cohort column categorizes data into groups for comparison. If a column is used to define a cohort, it is excluded from the drivers to focus the analysis on how other variables affect the cohorts, not on the cohort-defining variables themselves.
Example: If "Subscription Type" defines customer cohorts for churn analysis, it’s excluded as a driver to focus on what affects churn within each subscription type.
Excluded by user during Insight creation
These are columns that a user chooses to exclude while setting up the Insight. The user might exclude columns they know to be irrelevant or redundant to ensure the analysis is focused only on necessary data.
Excluded by user after Insight creation
Columns may be excluded after the Insight results are displayed if they are not contributing meaningfully to the analysis, or to refine the model based on initial results.
Excluded since the limit of columns for Live Insights exceeded
By default, Tellius selects 15 columns in the Included section for Live Insights. If the columns are ranked in the data preparation phase, the ranked columns will be prioritised. The other columns will be displayed under "Excluded since the limit of columns for Live Insights exceeded" in the Excluded section.
In addition, users can add five more columns to the Included section. So, the overall limit of columns in the Included section is 20.
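The limits described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (15 default columns plus 5 user-added columns, 20 in total), not Tellius's actual implementation. The function names `split_columns` and `include_more` are hypothetical, and placing ranked columns first is an assumption about how "prioritised" works.

```python
DEFAULT_LIMIT = 15                          # columns selected by default for Live Insights
USER_EXTRA = 5                              # additional columns a user may include
MAX_INCLUDED = DEFAULT_LIMIT + USER_EXTRA   # overall limit of 20


def split_columns(columns, ranked=()):
    """Return (included, excluded), placing ranked columns first (assumed behavior)."""
    available = set(columns)
    ranked_first = [c for c in ranked if c in available]
    rest = [c for c in columns if c not in set(ranked_first)]
    ordered = ranked_first + rest
    return ordered[:DEFAULT_LIMIT], ordered[DEFAULT_LIMIT:]


def include_more(included, excluded, extra):
    """Move user-chosen columns from Excluded to Included, enforcing the 20-column cap."""
    missing = [c for c in extra if c not in excluded]
    if missing:
        raise ValueError(f"not in the Excluded section: {missing}")
    if len(included) + len(extra) > MAX_INCLUDED:
        raise ValueError(f"the Included section is limited to {MAX_INCLUDED} columns")
    return included + list(extra), [c for c in excluded if c not in extra]


# Hypothetical dataset with 30 columns, two of which were ranked during data preparation.
columns = [f"col{i}" for i in range(30)]
included, excluded = split_columns(columns, ranked=["col29", "col20"])
included, excluded = include_more(included, excluded, excluded[:5])
print(len(included), len(excluded))
```

Trying to include a sixth extra column in this sketch raises an error, which mirrors the overall limit of 20 columns in the Included section.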
Excluded due to data type
Columns with data types unsuitable for the analysis (e.g., binary data such as images in a regression model) are excluded to prevent errors or irrelevant results.
Example: "Profile Picture" is excluded from a user engagement analysis as its image data type is not analyzable for this purpose.
Excluded due to cardinality
Columns with a high number of unique values (high cardinality) are excluded to prevent issues such as overfitting and to enhance model performance and interpretability.
Example: The column "Employee ID" is excluded since it's a unique identifier.
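As a rough illustration of a cardinality check, the sketch below flags columns whose share of distinct values exceeds a threshold. The function name `high_cardinality_columns` and the 0.9 ratio are assumptions chosen for the example; the rule Tellius actually applies is not stated here.

```python
def high_cardinality_columns(rows, max_unique_ratio=0.9):
    """Return the columns whose distinct-value ratio exceeds max_unique_ratio.

    rows is a list of dicts that all share the same keys.
    The 0.9 default is an illustrative threshold, not a Tellius setting.
    """
    if not rows:
        return []
    flagged = []
    for column in rows[0]:
        values = [row[column] for row in rows]
        ratio = len(set(values)) / len(values)
        if ratio > max_unique_ratio:
            flagged.append(column)
    return flagged


rows = [
    {"Employee ID": 101, "Department": "Sales"},
    {"Employee ID": 102, "Department": "Sales"},
    {"Employee ID": 103, "Department": "Support"},
    {"Employee ID": 104, "Department": "Support"},
]
print(high_cardinality_columns(rows))  # ['Employee ID']
```

Here every "Employee ID" value is unique (ratio 1.0), so the column is flagged, while "Department" has only two distinct values across four rows and is kept.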
Excluded due to correlation with other column(s)
To avoid multicollinearity, which can skew the analysis, columns that are highly correlated with others are excluded to ensure each independent variable contributes unique information.
Example: "Total Sales" is excluded because it’s highly correlated with "Number of Transactions," which is already included.
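One common way to detect this kind of redundancy is a pairwise Pearson correlation check. The sketch below is a minimal illustration under assumed choices: a 0.95 threshold, keeping the first column of each correlated pair, and made-up sample values. It is not a description of how Tellius computes correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation; treat it as 0
    return cov / sqrt(var_x * var_y)


def drop_correlated(data, threshold=0.95):
    """Keep the first column of each highly correlated pair and drop the later one.

    data maps column names to lists of numbers; threshold is an assumed cutoff.
    """
    kept, dropped = [], []
    for name, values in data.items():
        if any(abs(pearson(values, data[k])) >= threshold for k in kept):
            dropped.append(name)
        else:
            kept.append(name)
    return kept, dropped


# Made-up sample values for illustration only.
data = {
    "Number of Transactions": [10, 20, 30, 40, 50],
    "Total Sales": [105, 198, 310, 395, 502],
    "Discount Rate": [5, 1, 4, 2, 3],
}
kept, dropped = drop_correlated(data)
print(kept)     # ['Number of Transactions', 'Discount Rate']
print(dropped)  # ['Total Sales']
```

In this sample, "Total Sales" rises almost linearly with "Number of Transactions", so it adds little unique information and is dropped, while "Discount Rate" is only weakly correlated and is kept.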
Excluded based on feature importance
During model training, some features may be deemed less important. These are excluded to streamline the model and focus on the most impactful variables.
Example: "Page Color Theme" is excluded from website conversion analysis as it has low feature importance compared to "Page Load Time."
Excluded by user during data preparation
This happens in the data preparation phase, when a user excludes columns before the Insight creation process begins.