Impact Calculation for Top Contributors

Updated by Radu Miclaus

The impact score provided when generating Insights in Tellius gives you an understanding of how a Change Reason Contributor influences a target variable of interest from your data.

These impact scores are calculated according to the specific Insight type and to the type of target variable of interest. The relevant types of Insights to understand are Trend Analysis and Comparison of Cohorts. The relevant types of target variables are percentages (or “rates”) and non-percentage numerical values.

Numerical, Non-Percentage Values

The most common target variable type is a numerical, non-percentage value. This could be total revenue or sales, where the value is simply a dollar amount (for instance, $100,000 or $2,342.43). Or it could be the total number of “sales orders,” with 10,290, 9, or 124,124,324 all being valid entries.

In this case, the impact score measures how another feature of the data (i.e., a variable or column other than the target variable) influences the target variable. For example, suppose there is another variable, STATE, which tells us in which state each order took place. The impact calculation determines, based on all of the data points in the dataset or business view, how a specific state influences the target value “sales orders.”

Percentages or “Rates”

These are the target variables that are described by a percentage value. For instance, perhaps the target variable describes an “Approval Rate” (say, for auto loans). The rate would be described by a percentage, such as 75%.

As with non-percentage numerical values, the impact calculation determines how variables or features other than the target variable influence the “Approval Rate” (or any percentage-based target variable specified by the user). There is no restriction on the type of the other variables being evaluated. For instance, they can be dimensional, such as STATE, where one state can be found to have more impact on the target variable than another, even though the target variable is a percentage value.

Trend Drivers

Performing a Trend Insight Analysis in Tellius identifies the top contributors, or features of the data, that “drive” or influence the target variable (or column) over time.

Specifically, this analysis looks at the change between two specific periods of time. The impact calculation therefore produces a score for each variable that indicates how that variable influenced the change in the target variable from time T1 to time T2.

This means that the impact of the state of California was calculated for each time period, and a change in its influence on the target variable was detected.

NOTE: this is NOT simply the difference in the target variable for that feature. In this case, the target variable is Approval Rate and the Contributor is State (a feature or other variable of the data). The change in approval rate is 4.6%, but this is different from the change in the impact that the state of California had between the two points in time. This is because the calculation must take into account the data points themselves in aggregate for a truly holistic view, to avoid an overly simplistic (and misleading) calculation.

For an extreme example, consider the state of Iowa (in the hypothetical underlying data set corresponding to the image above). Iowa may consist of 2 data points with 1 approval during time period 1, and 1 data point with 1 approval during time period 2. The change in approval rate would appear to be:

50% ⇒ 100%

And the influence may be naively believed to be: “Iowa had a 2x impact on the overall change in approval rate”

But this is incorrect. We'd expect the impact to be greater for a state with more data points, say California with a set of data points 10x larger than Iowa's. Our impact calculation takes this into account and adjusts the ranking of the top contributors accordingly.
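To make the Iowa argument concrete, here is a minimal Python sketch. The Iowa counts follow the example above; the California counts are made up for illustration, and the leave-one-out comparison is only a way to show the effect of volume, not the exact Tellius formula.

```python
# Hypothetical counts: (approvals, applications) per state and period.
# Iowa matches the example in the text; California is invented and
# roughly an order of magnitude larger.
data = {
    "Iowa":       {"t1": (1, 2),   "t2": (1, 1)},
    "California": {"t1": (10, 20), "t2": (13, 20)},
}

def rate(pairs):
    """Approval rate in percent for a list of (approvals, applications) pairs."""
    approvals = sum(a for a, _ in pairs)
    applications = sum(n for _, n in pairs)
    return 100.0 * approvals / applications

def rate_change(states):
    """Change in the pooled approval rate from t1 to t2, in percentage points."""
    t1 = rate([data[s]["t1"] for s in states])
    t2 = rate([data[s]["t2"] for s in states])
    return t2 - t1

iowa_only = rate_change(["Iowa"])              # Iowa's own rate: 50% -> 100%
overall = rate_change(["Iowa", "California"])  # all data points pooled
without_iowa = rate_change(["California"])     # same data with Iowa left out

print(f"Iowa's own rate change:      {iowa_only:.1f} points")
print(f"Overall rate change:         {overall:.1f} points")
print(f"Overall change without Iowa: {without_iowa:.1f} points")
```

With these numbers, Iowa's own rate jumps by 50 points, yet removing Iowa moves the overall change by less than 2 points, which is why a raw rate change is a poor proxy for impact.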

Comparison of Cohorts

Similarly, you can generate Comparison Insights, which compare two cohorts against a variable of interest. For instance, two cohorts can be defined on a feature "Country" with values "Germany" and "Portugal."

Below is a sample calculation pertaining to a non-percentage-based numerical target variable.

Note the impact score of Drug=Pepto-Bismol on the overall total sales. Thus, the notion of the impact score calculation is applicable across the various insight type and target-column type scenarios.

Examples and variations of the Impact Calculations per type of Insights

  1. Trend Driver Impact Percentage: the change over time in the particular contributor level as a proportion of the change in the overall contributor population

Example: Sales of Office Supplies is the variable of interest

Total sales for Wk 5 2020 = $100.00 and Wk 6 2020 = $150.00.

For Office Supplies, Total sales for Wk 5 2020 = $25.00 and Wk 6 2020 = $50.00

Change percentage is ((50-25)/25)*100 = 100%

Impact percentage is ((50-25)/(150-100))*100 = 50%

Select the top contributors based on statistically significant Impact Percentages in the distribution for all contributors.
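The two formulas in this example can be sketched in Python as follows. The function and parameter names are illustrative and are not part of Tellius.

```python
def change_percentage(previous, current):
    """Relative change of the contributor itself, in percent."""
    return (current - previous) / previous * 100

def impact_percentage(contrib_previous, contrib_current,
                      total_previous, total_current):
    """Contributor's change as a share of the overall change, in percent."""
    return (contrib_current - contrib_previous) / (total_current - total_previous) * 100

# Office Supplies: $25 -> $50, total sales: $100 -> $150
change = change_percentage(25.00, 50.00)
impact = impact_percentage(25.00, 50.00, 100.00, 150.00)
print(change, impact)
```

Note that the impact percentage is undefined when the overall total does not change between the two periods, so a real implementation would need to handle that case.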

  2. Trend Driver Ratio Impact Percentage: the difference between the previous and current time ranges in the contributor's impact, where each impact is calculated from the data other than the particular contributor.

Example: Approval Rate of HomeLoan is the variable of interest

Approval Rate for Wk 5 2020 = 6.5% and Wk 6 2020 = 7.4%

For non-Home Loans, Approval Rate for Wk 5 2020 = 45.4% and Wk 6 2020 = 49.6%

Overall Change Percentage is ((7.4 - 6.5)/6.5)*100 = 13.8%

Previous Impact is ((6.5 - 45.4)/6.5)*100 = -598%

Current Impact is ((7.4 - 49.6)/7.4)*100 = -570%

Impact Percentage = -570 - (-598) = +28%

The overall Approval Rate changed by 13.8%, but for loans other than Home Loans the Approval Rate changed by only 9.3% (from 45.4% to 49.6%). This means Home Loans had a positive impact in driving the overall Approval Rate, since the other loans changed by less than the overall rate.

Select the top contributors based on statistically significant Impact Percentages in the distribution for all contributors.
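As a rough sketch, the ratio impact steps in this example translate to Python like this. The function name is hypothetical, and the rates are taken directly from the example.

```python
def ratio_impact(overall_rate, rate_without_contributor):
    """Impact of a contributor in one period, in percent of the overall rate.

    rate_without_contributor is the rate computed on all data
    other than the contributor (here: non-Home Loans).
    """
    return (overall_rate - rate_without_contributor) / overall_rate * 100

previous_impact = ratio_impact(6.5, 45.4)   # Wk 5 2020
current_impact = ratio_impact(7.4, 49.6)    # Wk 6 2020
impact_percentage = current_impact - previous_impact

print(round(previous_impact), round(current_impact), round(impact_percentage))
```

The unrounded values are approximately -598.5, -570.3, and +28.2, which round to the figures quoted in the example.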

  3. Trend Driver for Market Share W/O Filter: Market share without a market share filter is a simple percentage.

Example: Percentage Sales of Office Supplies is the variable of interest

Total sales for Wk 5 2020 = $100.00 and Wk 6 2020 = $150.00.

For Office Supplies, Total sales for Wk 5 2020 = $25.00 and Wk 6 2020 = $50.00

Percentage of Office Supplies for Wk 5 2020 = 25/100 = 25%

Percentage of Office Supplies for Wk 6 2020 = 50/150 = 33%

Market Share Change is 33 - 25 = 8

Market Share Change Percentage is (8/25)*100 = 32%

Select the top contributors based on statistically significant Market Share Change of variable of interest within the dimension of all contributors.

  4. Trend Driver for Market Share With Filter:

Example: Market Share of Sales of Ikea in California is the variable of interest

For California, Total sales for Wk 5 2020 = $100.00 and Wk 6 2020 = $150.00.

For California, Total Ikea sales for Wk 5 2020 = $25.00 and Wk 6 2020 = $50.00

For California, Market share of Ikea for Wk 5 2020 = 25% and Wk 6 2020 = 33%

Market Share Change is 33 - 25 = 8

Market Share Change Percentage is (8/25)*100 = 32%

Select the top contributors based on statistically significant Market Share Change of variable of interest within the dimension of all contributors.
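The filtered market share calculation can be sketched from row-level data as below. This assumes the filter restricts the market to California rows; the record layout, the "Other" brand, and the Nevada row are hypothetical and exist only to show the filter at work.

```python
# Hypothetical row-level sales records: (state, brand, week, sales in dollars).
rows = [
    ("California", "Ikea",  "Wk 5 2020", 25.00),
    ("California", "Other", "Wk 5 2020", 75.00),
    ("California", "Ikea",  "Wk 6 2020", 50.00),
    ("California", "Other", "Wk 6 2020", 100.00),
    ("Nevada",     "Ikea",  "Wk 5 2020", 40.00),  # excluded by the filter
]

def market_share(rows, state, brand, week):
    """Brand's share of total sales within the filtered market, in percent."""
    in_market = [r for r in rows if r[0] == state and r[2] == week]
    total = sum(r[3] for r in in_market)
    brand_sales = sum(r[3] for r in in_market if r[1] == brand)
    return brand_sales / total * 100

previous_share = market_share(rows, "California", "Ikea", "Wk 5 2020")  # 25/100
current_share = market_share(rows, "California", "Ikea", "Wk 6 2020")   # 50/150

# The worked example rounds shares to whole percentages before
# differencing; this sketch does the same so the numbers match.
share_change = round(current_share) - round(previous_share)
share_change_pct = share_change * 100 / round(previous_share)

print(share_change, share_change_pct)
```

Without the rounding step, the change is about 8.3 points, or roughly 33% of the previous share.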

  5. Cohort Driver Impact Percentage: the difference between two levels of a dimension as a proportion of the total overall dimension

Example: Sales of Office Supplies is the variable of interest

Total sales for Chicago = $100.00 and NYC = $150.00.

For Office Supplies, Total sales for Chicago = $25.00 and NYC = $50.00

Change percentage is ((50-25)/25)*100 = 100%

Impact percentage is ((50-25)/(150-100))*100 = 50%

Select the top contributors based on statistically significant Impact Percentages in the distribution for all contributors.
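For cohorts, the same impact calculation can be run over every contributor at once. The sketch below aggregates hypothetical row-level records; the "Furniture" category is invented so that the city totals match the example ($100.00 and $150.00).

```python
from collections import defaultdict

# Hypothetical row-level records: (city, category, sales in dollars).
rows = [
    ("Chicago", "Office Supplies", 25.00),
    ("Chicago", "Furniture", 75.00),
    ("NYC", "Office Supplies", 50.00),
    ("NYC", "Furniture", 100.00),
]

totals = defaultdict(float)        # sales per cohort
by_category = defaultdict(float)   # sales per (cohort, category)
for city, category, sales in rows:
    totals[city] += sales
    by_category[(city, category)] += sales

base, compare = "Chicago", "NYC"
overall_diff = totals[compare] - totals[base]   # 150 - 100

# Each category's difference as a share of the overall difference.
impact = {}
for category in sorted({c for _, c, _ in rows}):
    diff = by_category[(compare, category)] - by_category[(base, category)]
    impact[category] = diff / overall_diff * 100

print(impact)
```

Because every category's difference is divided by the same overall difference, the impact percentages across all categories add up to 100%.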

  6. Cohort Driver Ratio Impact Percentage: the difference between the impacts at each level of the dimension, where each impact is calculated from the data other than the particular contributor.

Example: Approval Rate of HomeLoan is the variable of interest

Approval Rate for Chicago = 6.5% and NYC = 7.4%

For non-Home Loans, Approval Rate for Chicago = 45.4% and NYC = 49.6%

Overall Change Percentage is ((7.4 - 6.5)/6.5)*100 = 13.8%

Chicago Impact is ((6.5 - 45.4)/6.5)*100 = -598%

NYC Impact is ((7.4 - 49.6)/7.4)*100 = -570%

Impact Percentage = -570 - (-598) = +28%

The overall Approval Rate difference is 13.8%, but for loans other than Home Loans the Approval Rate difference is only 9.3% (45.4% versus 49.6%). This means Home Loans have a +28% impact when NYC is compared against Chicago.

Select the top contributors based on statistically significant Impact Percentages in the distribution for all contributors.
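In symbols, the calculation above can be written as follows. The notation is introduced here for illustration only: R_c is the overall rate in cohort c, and R_c^{-k} is the rate in cohort c computed on all data except contributor k (here, Home Loans).

```latex
\mathrm{Impact}_c(k) = \frac{R_c - R_c^{-k}}{R_c} \times 100,
\qquad
\mathrm{Impact\ Percentage}(k)
  = \mathrm{Impact}_{\mathrm{NYC}}(k) - \mathrm{Impact}_{\mathrm{Chicago}}(k)
  = \frac{7.4 - 49.6}{7.4} \times 100 - \frac{6.5 - 45.4}{6.5} \times 100
  \approx -570 - (-598) = +28
```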

  7. Cohort Driver for Market Share W/O Filter:

Example: Market Share of Sales in California is the variable of interest

For California, Total sales = $100.00

For California, Total sales for Nike = $25.00 and Reebok = $50.00

For California, Market share of Nike = 25% and Reebok = 50%

Market Share Change is 50 - 25 = 25

Market Share Change Percentage is (25/25)*100 = 100%

Select the top contributors based on statistically significant Market Share Change of variable of interest within the dimension of all contributors.

  8. Cohort Driver for Market Share With Filter:

Example: Market Share of Sales of Shoes in California is the variable of interest

For California, Total sales for Nike = $100.00 and for Reebok = $150.00

For California, Total Shoes sales for Nike = $25.00 and Shoes Sales for Reebok = $50.00

For California, Market share Shoes of Nike = 25% and Reebok = 33%

Market Share Change is 33 - 25 = 8

Market Share Change Percentage is (8/25)*100 = 32%

Select the top contributors based on statistically significant Market Share Change of variable of interest within the dimension of all contributors.
