What flavor of Bayes is your MMM provider using?
MARK GARRATT
Synopsis: In the last 20 years, Bayesian analysis methods have become the preferred tool for predictive analytical models. With a few exceptions, almost every Marketing Mix Modeling (MMM) vendor says: “Yes, we use Bayes.” Under the hood, however, there are many different versions of Bayes, and they produce different results. These differences are big enough to change the decisions made by media managers and trade specialists, especially as media and distribution channels become more fragmented. Choosing an MMM partner who uses the best form of Bayes will give you more accurate analysis and help you make better decisions.
Fully Bayesian methods outperform alternatives
Many MMM vendors say they use Bayesian methods to calculate your optimal media mix. But not all Bayesian methods are created equal. When evaluating potential MMM partners, it helps to understand what really matters in Bayes for achieving actionable, reliable media optimization.
Some vendors who say they are doing “Bayes” are really using a method developed in the 1960s called Empirical Bayes. This is not a fully Bayesian model. Both Empirical Bayes and a fully Bayesian model (often referred to as Hierarchical Bayes) deliver estimates of sales drivers at a detailed level. For instance, both are capable of estimating separate advertising and promotion coefficients, which get translated into sales lifts for individual DMAs, separate UPCs, and so on.
The difference is that fully Bayesian methods also estimate the uncertainty in those coefficients and put it to use.
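To make the distinction concrete, here is a minimal Python sketch on a toy problem. It is not any vendor's implementation. It assumes a simple normal-normal model with a known, common standard error (`sigma`) on each product's coefficient, flat priors on the overall mean and on the spread across products, and six made-up coefficient estimates; every function and variable name is hypothetical. The plug-in ("naive") Empirical Bayes version estimates the spread across products once and then treats it as known. The fully Bayesian version averages over every value of that spread the data allow, here with a simple grid approximation. The toy numbers were picked so that the plug-in spread estimate comes out at zero, an extreme case that makes the contrast easy to see.

```python
import numpy as np

def naive_empirical_bayes(y, sigma):
    """Plug-in shrinkage: estimated hyperparameters are treated as if known."""
    mu_hat = y.mean()
    tau2_hat = max(0.0, y.var(ddof=1) - sigma**2)   # method-of-moments spread
    shrink = sigma**2 / (sigma**2 + tau2_hat)        # weight on the pooled mean
    post_mean = mu_hat + (1.0 - shrink) * (y - mu_hat)
    post_sd = np.sqrt((1.0 - shrink) * sigma**2) * np.ones_like(y)
    return post_mean, post_sd

def fully_bayes_grid(y, sigma, tau_grid):
    """Same model, averaged over the posterior of tau (flat priors on mu, tau)."""
    n = len(y)
    mu_hat = y.mean()
    total_var = sigma**2 + tau_grid**2               # variance of each y_j given tau
    v_mu = total_var / n                             # posterior variance of mu given tau
    ss = ((y - mu_hat) ** 2).sum()
    # Log marginal posterior of tau, up to an additive constant.
    log_w = 0.5 * np.log(v_mu) - 0.5 * n * np.log(total_var) - 0.5 * ss / total_var
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    shrink = sigma**2 / total_var
    # Mean and variance of each product's coefficient, conditional on tau.
    cond_mean = mu_hat + (1.0 - shrink)[:, None] * (y - mu_hat)[None, :]
    cond_var = ((1.0 - shrink) * sigma**2 + shrink**2 * v_mu)[:, None]
    # Law of total variance across the tau grid.
    post_mean = (w[:, None] * cond_mean).sum(axis=0)
    post_var = (w[:, None] * (cond_var + (cond_mean - post_mean) ** 2)).sum(axis=0)
    return post_mean, np.sqrt(post_var)

# Made-up per-product coefficient estimates and a common standard error.
y = np.array([0.10, 0.45, 0.20, 0.60, 0.05, 0.30])
sigma = 0.25
tau_grid = np.linspace(0.0, 1.0, 2001)   # grid cutoff at 1.0 is an approximation

eb_mean, eb_sd = naive_empirical_bayes(y, sigma)
fb_mean, fb_sd = fully_bayes_grid(y, sigma, tau_grid)

print("plug-in EB mean:", np.round(eb_mean, 3), "sd:", np.round(eb_sd, 3))
print("fully Bayes mean:", np.round(fb_mean, 3), "sd:", np.round(fb_sd, 3))
```

In this toy case the plug-in version pulls every product to the same value and reports no uncertainty at all, while the fully Bayesian version keeps some variation across products and reports a nonzero standard deviation for each one.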
What it means to understand uncertainty
The next bit is technical, but important because it illustrates why understanding uncertainty matters.
Let’s say an advertiser needs to determine the impact of simultaneously investing in advertising and in-store promotion, essentially a period of feature and display (F&D). For illustration, consider our analysis of 2,000 UPCs sold in supermarkets from 2019 to 2021. The charts show the impact of F&D for the 516 of those 2,000 products that executed this tactic in that period.
In the top chart, Empirical Bayes estimates the average F&D coefficient at 0.55, which translates into an average sales lift of 73%. Note the tight distribution of coefficients around that average, with the vast majority of F&D coefficients ranging from 0.40 to 0.75.
In contrast, in the bottom chart, Hierarchical Bayes gives an average F&D coefficient of 0.26, which corresponds to a lift of about 28%. The Hierarchical Bayesian model also shows that this effect is more dispersed across the 516 products.
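The article does not say how a coefficient is turned into a lift. A common convention, assumed here, is a log-linear sales model in which the lift is exp(coefficient) - 1. The hypothetical helper below applies that assumption to the two average coefficients quoted above.

```python
import math

def coef_to_lift(coef):
    """Fractional sales lift implied by a log-linear coefficient (assumed form)."""
    return math.exp(coef) - 1.0

eb_lift = coef_to_lift(0.55)   # Empirical Bayes average coefficient
hb_lift = coef_to_lift(0.26)   # Hierarchical Bayes average coefficient

print(f"Empirical Bayes: {eb_lift:.1%}")      # 73.3%
print(f"Hierarchical Bayes: {hb_lift:.1%}")   # 29.7%
```

Under this assumption, 0.55 reproduces the 73% figure. The 0.26 coefficient maps to roughly 30%, close to the quoted 28%; the small gap may come from how lifts were averaged across UPCs, which the article does not specify.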
How do we know which estimate is correct?
Two facts support the second analysis. First, the Hierarchical Bayesian approach fits the data better, as measured by a smaller mean absolute percentage error (MAPE). Second, if we take the 516 UPCs that executed this promotion and run a regression on each product one at a time, the median value of the lift coefficients is 0.26, which matches the Hierarchical Bayesian result.
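Both checks are easy to sketch in Python. The example below is illustrative only: the data are synthetic, not the 516 UPCs discussed here, and it assumes each per-product regression is an ordinary least squares fit of log sales on a 0/1 F&D indicator, a form the article does not specify. All names are hypothetical.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.10 means 10%)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs((actual - predicted) / actual))

def fd_coefficient(log_sales, fd_flag):
    """OLS slope of log sales on a 0/1 F&D indicator for a single UPC."""
    X = np.column_stack([np.ones_like(fd_flag), fd_flag])
    beta, *_ = np.linalg.lstsq(X, log_sales, rcond=None)
    return beta[1]

# Synthetic panel: 50 UPCs, 104 weeks, 20 promoted weeks each.
rng = np.random.default_rng(0)
n_upcs, n_weeks, n_promo_weeks = 50, 104, 20
true_coefs = rng.gamma(shape=2.0, scale=0.15, size=n_upcs)  # right-skewed effects

estimates = []
for coef in true_coefs:
    fd_flag = np.zeros(n_weeks)
    fd_flag[rng.choice(n_weeks, size=n_promo_weeks, replace=False)] = 1.0
    log_sales = 5.0 + coef * fd_flag + rng.normal(0.0, 0.02, size=n_weeks)
    estimates.append(fd_coefficient(log_sales, fd_flag))

median_estimate = float(np.median(estimates))
median_truth = float(np.median(true_coefs))

print("MAPE example:", mape([100, 200], [110, 180]))
print("median per-UPC estimate:", round(median_estimate, 3))
print("median true coefficient:", round(median_truth, 3))
```

The median is used rather than the mean because one-at-a-time regressions can be noisy for individual products, and the median is less sensitive to a few extreme estimates.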
Why does this matter?
The two approaches deliver very different advice, and their key recommendations would lead marketers to decidedly different courses of action.
What advice would you get from Empirical Bayes?
“The lift from combining feature and display is 73% over base, and we are sure of that. About 80% of all the promotions fall between 57% lift and 82% lift. You should keep doing these, with great confidence in future results!”
What superior advice would you get from Hierarchical Bayes?
“The lift from combining feature and display is 28% over base. But there is a lot of variation across UPCs. About a third of UPCs have very low lifts. While we investigate further, you should consider stopping F&D with those. About half of the UPCs perform near the average of 28% lift. If it is profitable to continue these, then do so. And, notably, about 20% of the F&Ds have lifts averaging about 150%. Let us look into those top performers and figure out why they are doing so well, so that we can duplicate their success with other products.”
We cover the reasons for this in detail in a [white paper] demonstrating the science behind Hierarchical Bayesian methods. We show how they consistently outperform other flavors of Bayes, both in estimating the overall size of the effects that drive marketing KPIs and in the richer detail they produce on the true variation of those effects across products and geographies.
What is the in4mation insights advantage?
As the marketing world becomes increasingly complex and fragmented, the need for reliable, actionable guidance for optimizing media and marketing spend will only grow.
Today's media and promotion variables are intermittent and intricately mapped, both online and offline, and modeling them demands a very different approach from the MMMs developed in the 1980s and 1990s. Those models were built for a world of continuous, national-level streams, like adstocked national TV. You should not expect them to keep performing well when the number of media inputs is exploding (all digital and social channels, and retail media, as examples) and their nature is changing drastically. Only a fully realized and sophisticated analytic tool like Hierarchical Bayes, in the hands of a skilled MMM partner, can help you understand the new world of marketing.
If you would like to discuss how Hierarchical Bayes can make a difference in optimizing your unique media and marketing mix, reach out to us for a conversation at info@in4ins.com.
Mark Garratt is a partner and co-founder of in4mation insights. He is an accomplished analytics professional with a distinguished career in both business and academia. Mark has been a trusted advisor to some of the world’s biggest brands and an analytics leader at CPG companies including P&G, SABMiller Brewing, and The Gillette Company.