Predictive modeling: a look under the hood

Can past business performance measures predict future performance? If so, are these measures considered endogenous (internal) or exogenous (external), and how efficiently can they model performance that’s yet to come?

Before we get too deep into the weeds, you may be asking:

  • What is predictive modeling/forecasting?
  • How and why should I use a forecast?
  • Where can I learn more about the value of a forecast?

This article will answer these and other questions, with the caveat that it’s going to get technical. For the modeling/forecasting enthusiasts out there, I’ve included some extra technical detail; look for the passages that start with “More technical stuff (can skip…)”.

Predictive modeling/forecasting is a process that uses data mining and statistical analysis to identify predictors (i.e., variables) that are highly correlated with the specific outcomes we’d like to forecast, e.g., traffic or sales revenue.

Many forecasts are vector autoregressive (VAR), which is just a fancy way of saying that they use a time-series linear regression in which each performance measure is modeled on lagged (i.e., past) values of itself and of the other measures. Which is still a pretty fancy way of saying that VAR models use past performance to predict future outcomes.

VAR models prove their worth in that lagged values are often highly predictive of future performance. This is likely because the lagged values already carry the idiosyncrasies of the business: seasonality, business cycles, trends, etc. All of this is great because it means we don’t need additional knowledge of the forces that are influencing the business trends.
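To make “lagged values” concrete, here is a minimal sketch in Python. It is a deliberate simplification: one series regressed on its own previous value (an AR(1) model), not a full multi-series VAR. The weekly sales figures and the function names are invented for illustration.

```python
def fit_lag1(series):
    """Least-squares fit of y[t] = intercept + slope * y[t-1]."""
    x = series[:-1]   # lagged (past) values
    y = series[1:]    # the values those lags are meant to predict
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(series, steps):
    """Roll the fitted model forward, feeding each forecast back in as the next lag."""
    intercept, slope = fit_lag1(series)
    last = series[-1]
    out = []
    for _ in range(steps):
        last = intercept + slope * last
        out.append(last)
    return out


# Hypothetical weekly sales (in $ thousands), made up for this example.
weekly_sales = [100.0, 104.0, 103.0, 108.0, 110.0, 109.0, 114.0, 117.0]
next_four_weeks = forecast(weekly_sales, 4)
print([round(v, 1) for v in next_four_weeks])
```

A real VAR would fit one such equation per performance measure and include lags of every measure in each equation, usually with more than one lag.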

More technical stuff (can skip…)

A word to the statistics geeks out there: You’ll want to use scenario analysis to estimate the appropriate lag lengths, and you’ll want to run Granger causality tests, an impulse response analysis and a stationarity (unit-root) test.

Another type of forecast, a standard ordinary least squares (OLS) linear regression (just rolls right off the tongue, don’t you think?), involves data mining and statistical analysis to determine the variables that influence business trends. These variables are typically external and numerous, and may be marketing-, economic- and/or weather-oriented.

A marketing-based forecast might evaluate Facebook attributes, television impressions, display impressions and radio impressions to determine which marketing channel was the most effective in driving sales.
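As a rough sketch of what that kind of regression looks like, the Python below fits sales against three channels using the OLS normal equations, with no statistics library. Everything in it is an assumption for illustration: the channel figures are made up, the helper names are hypothetical, and the sales numbers are generated from known coefficients so the fit can be checked against them.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ols(rows, target):
    """Fit target = b0 + b1*x1 + b2*x2 + ... via the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]         # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, target)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical weekly impressions (thousands): (television, display, radio).
channels = [(10, 20, 5), (12, 18, 7), (15, 25, 6), (9, 30, 8), (14, 22, 4), (11, 27, 9)]
# Synthetic sales built from known effects, so the regression has a right answer.
sales = [50 + 0.8 * tv + 0.3 * disp + 0.1 * radio for tv, disp, radio in channels]

intercept, tv_effect, display_effect, radio_effect = ols(channels, sales)
effects = {"television": tv_effect, "display": display_effect, "radio": radio_effect}
strongest = max(effects, key=effects.get)
print(strongest, {name: round(value, 3) for name, value in effects.items()})
```

Each coefficient is the estimated change in sales per additional thousand impressions, so comparing channels fairly also means accounting for what those impressions cost.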

There are other types of linear and nonlinear data models, however, that may be better suited to your specific forecast needs. For example, if you’re going to forecast energy consumption that grows by a roughly constant percentage each year, an exponential growth model is likely a better fit than a straight line.
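For a sense of how an exponential growth model is fitted, here is a small Python sketch. It takes the common shortcut of regressing the logarithm of the series on time, which turns the curve into a straight line. The consumption figures are synthetic (100 units growing 5% a year) and the function names are hypothetical.

```python
import math


def fit_exponential(values):
    """Fit y = a * exp(b * t) for t = 0, 1, 2, ... by regressing ln(y) on t."""
    n = len(values)
    t = list(range(n))
    log_y = [math.log(v) for v in values]   # requires strictly positive values
    mean_t = sum(t) / n
    mean_log = sum(log_y) / n
    stt = sum((ti - mean_t) ** 2 for ti in t)
    stl = sum((ti - mean_t) * (li - mean_log) for ti, li in zip(t, log_y))
    b = stl / stt
    a = math.exp(mean_log - b * mean_t)
    return a, b


def predict(a, b, t):
    """Value of the fitted curve at time t."""
    return a * math.exp(b * t)


# Synthetic yearly consumption: starts at 100 units and grows 5% a year.
consumption = [100.0 * 1.05 ** year for year in range(8)]
a, b = fit_exponential(consumption)
growth_rate = math.exp(b) - 1               # implied growth per year
next_year = predict(a, b, len(consumption))
print(round(a, 2), round(growth_rate, 4), round(next_year, 2))
```

Real consumption data will not sit exactly on the curve, so the recovered growth rate is an estimate rather than the tidy figure this synthetic series gives back.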

More technical stuff (can skip…)

Before you finalize your data model, it’s important to confirm it’s well specified by running the appropriate diagnostics (heteroskedasticity, autocorrelation, normality, etc.) to check that your forecast errors are white noise (i.e., fluctuating randomly around a zero mean, with constant variance and no autocorrelation).
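Two of those checks are easy to sketch by hand. The Python below computes the mean of the forecast errors, their lag-1 autocorrelation and the Durbin-Watson statistic. It covers only the zero-mean and autocorrelation parts; heteroskedasticity and normality need separate tests. The actual and predicted figures are made up, and the function name is hypothetical.

```python
def residual_checks(actual, predicted):
    """Basic white-noise checks on forecast errors (actual minus predicted)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    # Lag-1 autocorrelation: how strongly each error tracks the one before it.
    lag1 = sum(centered[i] * centered[i - 1] for i in range(1, n)) / sum(
        e * e for e in centered
    )
    # Durbin-Watson: near 2 suggests no first-order autocorrelation,
    # toward 0 positive autocorrelation, toward 4 negative autocorrelation.
    dw = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)) / sum(
        e * e for e in residuals
    )
    return {"mean": mean, "lag1_autocorr": lag1, "durbin_watson": dw}


# Hypothetical weekly sales and the model's fitted values for the same weeks.
actual = [102.0, 98.5, 101.0, 99.0, 103.5, 97.0, 100.5, 99.5]
predicted = [100.0, 100.0, 100.5, 100.0, 101.0, 99.0, 100.0, 100.5]
checks = residual_checks(actual, predicted)
print({name: round(value, 3) for name, value in checks.items()})
```

If the mean sits far from zero or the autocorrelation is large, the errors still contain structure the model has not captured.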

Forecasts should be used to understand what the future is expected to look like with all else held constant relative to the past (ceteris paribus). Many CMOs get in the habit of defining success by comparing year-over-year growth at a daily, weekly or yearly level, but these measures can be deceiving.

For example, after analyzing four years of weekly trends, our forecast might tell us that weekly sales for the next year are projected to be down by an average of 6% from the previous year, based on the statistical analysis.

Who wants to explain to the CFO that our marketing campaign wasn’t effective because our year-over-year sales comparison showed a loss of 3%? I’ll tell you who. No one.

Now, with that forecast in our corner, and its average weekly projection at minus 6%, we can show that CFO how we’re actually up by about 3 percentage points against the forecast, and that our marketing campaign helped counter a downward trend.
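The arithmetic behind that comparison is simple enough to sketch in Python. The snippet assumes, purely for illustration, average weekly sales of 100 in the prior year, a forecast of minus 6% and actual sales that finished 3% below the prior year. The function and its field names are made up.

```python
def compare_to_forecast(prior, forecast_change, actual_change):
    """Compare actual results with the forecast baseline instead of with last year."""
    forecast = prior * (1 + forecast_change)
    actual = prior * (1 + actual_change)
    return {
        "forecast": forecast,
        "actual": actual,
        # Gap between the two year-over-year rates, in percentage points.
        "point_gap": (actual_change - forecast_change) * 100,
        # How far actual sales landed above (or below) the forecast itself, in percent.
        "lift_vs_forecast": (actual / forecast - 1) * 100,
    }


# Illustrative assumptions: prior-year weekly sales of 100, forecast -6%, actual -3%.
result = compare_to_forecast(prior=100.0, forecast_change=-0.06, actual_change=-0.03)
print({name: round(value, 2) for name, value in result.items()})
```

The two measures are close but not identical: the gap between the rates is 3 percentage points, while actual sales measured against the forecast come out slightly more than 3% ahead.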

Do yourself a favor and take 5 minutes to read this case study. It illustrates a practical application of this exact hypothetical:

Case study: Using data modeling to predict marketing’s effect on sales

At Callahan, we leverage our Intelligence Platform (and years of statistical modeling and analytical expertise) to evaluate our clients’ business trends and custom-build data models that predict future business performance. Our data models are scalable, from an individual store-level analysis to a system-wide forecast, and can be tailored to specific client requirements, e.g., types of stores, regions, states or buying groups. We share the results with our clients in real time via an online, dynamic dashboard that’s accessible 24/7.

Contact us to learn more and set up a demo of our Intelligence Platform.