Podcast

The 2020 Aha Bowl: Factors that determine college football playoff teams

Our Callahan analytics team took a break from marketing analytics to get in the booth and discuss the upcoming college football national championship and criteria involved in determining college football playoff teams.

Callahan’s senior business intelligence analyst, James Meyerhoffer-Kubalik, reviewed over six years of data across 70 teams and compared 50 variables such as strength of schedule, point differential, university enrollment, total revenue and ticket sales, creating a model that can predict playoff teams.

This approach is an example of the important role data can play and how similar models can help businesses solve complex problems and drive marketing strategy.

(Subscribe on iTunes, Stitcher, Google Play, Google Podcasts, Pocket Casts or your favorite podcast service. You can also ask Alexa or Siri to “play the Uncovering Aha! podcast.”)

Welcome to Callahan’s Uncovering Aha! podcast. We talk about a range of topics for marketing decision-makers, with a special focus on how to uncover insights in data to drive brand strategy and inspire creativity. Featuring James Meyerhoffer-Kubalik and Jan-Eric Anderson.

Jan-Eric:
Hi, I’m Jan-Eric Anderson, chief strategy officer at Callahan.

James:
I’m James Meyerhoffer-Kubalik, senior business analyst at Callahan.

Danny:
I’m Danny Schuman, analytics director at Callahan.

Jan-Eric:
Danny and James, thanks for joining me today on the podcast. We’ve got a crowded booth here today. It’s going to be a fun one. We’re going to be talking about college football. The college football playoff national championship is imminent. We’ve got LSU and Clemson. So, we’re going to take a break from our routine of marketing analytics and talk about college football, and maybe some data behind the college football.

Jan-Eric:
This is a followup to a podcast that we did about this time a year ago.

James:
Yep, about a year.

Jan-Eric:
Yeah, we came in and James, you’ve done some work combining a couple of your passions. So just remind me and everybody listening, what have you done? What is it that we’re talking about today?

James:
We’re talking about identifying the criteria that the college football committee, which is a 13-person committee, is actually using as selection criteria, and trying to weight them. So that’s what we’ve really done.

Jan-Eric:
Okay. And so just to make sure everybody’s up to speed and on the same page of this thing, we’ve got a college football season going on throughout the year. At some point in the year, about halfway through the year, this college football playoff selection committee will start to reveal each week their official rankings of the teams. And the idea is that you want to be one of the top four ranked teams when the season is over because the top four ranked teams are the only four that get to go into the playoff to be eligible to win the national championship.

Jan-Eric:
So each week we get refreshed ratings or rankings and whoever’s in the top four, hypothetically speaking, at that point if the season were ending right then, they’d be in the playoff. And that’s the ultimate destination. Everybody wants to be in that top four ranking. Right?

James:
Right.

Jan-Eric:
So the selection committee announces this every Sunday. They give the updated ratings throughout the season. It’s a little bit of a black box. Right?

James:
Yeah, that’s probably the best word I’d use.

Jan-Eric:
You’re a huge college football fan, so you’re really following every week. Your guess is as good as anybody’s about what it’s going to be when it comes out each week. So part of your intention in this was really trying to shine a little bit of light into the black box, pry it open and let’s see what’s going on-

James:
Without breaking the black box is important.

Jan-Eric:
Without breaking it. Yeah. Right. So, and I think we talked about this a year ago, this was kind of a pet project for you, right? You just kind of took this on.

James:
Yeah. A lot of hours of love and data collecting about 50 variables, 6 years, 70 teams. So that’s about 20,000 cells or bits of data if anybody’s out there doing the math.
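For anybody doing the math, the count can be checked with a quick sketch. It assumes every one of the 70 teams has all 50 variables recorded in each of the 6 seasons, which the conversation implies but does not state outright; the variable names are illustrative only.

```python
# Assumption: a full grid of teams x seasons x variables, with no missing cells.
variables = 50
seasons = 6
teams = 70

cells = variables * seasons * teams
print(cells)  # 21000, which rounds to the "about 20,000" quoted above
```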

Jan-Eric:
So basically, you’re a data guy and you’re a college football guy. It brings your two big passions together into this project. So you were starting to allude to this. So tell me how does this work? You were just talking about a bunch of variables. Describe what you did and how does it work?

James:
Sure. So after the data collection part, what I did is I tagged the teams that made it to the college football playoffs with a one and those that didn’t make it with a zero. And so that’s the dependent variable. And what I did is essentially regress it on all these other variables. They could be team specific, about the individual games, the point differential, the quality of matchup and various things like this, all the way down to financial information unrelated to the committee’s decision, like how much they spent on recruiting, coaches’ salary and things like this. I was just trying to understand not only what the committee is evaluating, but also what’s kind of consistent in the characteristics of these teams. What’s the makeup of these teams? Do they have higher recruiting budgets and so on and so forth.
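The setup James describes, a 1/0 playoff tag regressed on team variables, can be sketched as a logistic regression, one common choice for a 0/1 outcome. This is a minimal illustration and not the model discussed on the podcast: the three predictors, the synthetic data, the "true" effect sizes and the helper names `fit_logit` and `predict_proba` are all assumptions made up for the example.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(w, x):
    """Estimated probability of making the playoff; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logit(X, y, lr=0.5, epochs=1000):
    """Fit a logistic regression by batch gradient descent on the log-loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi  # gradient of log-loss w.r.t. the logit
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Synthetic, made-up data (NOT the podcast's dataset). Columns:
# conference champion (0/1), average point differential (standardized),
# largest loss margin (standardized; bigger means a worse loss).
random.seed(0)
X, y = [], []
for _ in range(300):
    champ = random.choice([0, 1])
    point_diff = random.gauss(0, 1)
    largest_loss = random.gauss(0, 1)
    logit = -2.0 + 2.0 * champ + 1.5 * point_diff - 1.0 * largest_loss
    X.append([champ, point_diff, largest_loss])
    y.append(1 if random.random() < sigmoid(logit) else 0)  # 1 = made the playoff

w = fit_logit(X, y)
print("intercept and coefficients:", [round(v, 2) for v in w])

# Compare a hypothetical strong team with a hypothetical weak one.
strong = [1, 1.5, -1.0]
weak = [0, -1.0, 1.5]
print(predict_proba(w, strong), predict_proba(w, weak))
```

The sign and size of each fitted coefficient is what lets you say one variable carries more weight than another, provided the predictors are on comparable scales.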

Jan-Eric:
Okay. So, essentially, let me state this back, see if I’ve got it right. So you’re looking at the teams that got in, that were among the top four at the end of each of these seasons that this has happened, right? We’ve had six years of this now?

James:
Yeah. Six years.

Jan-Eric:
So looking at six years worth of history, of those four teams that got in and then another handful of teams that didn’t get in, what… Well, you looked across a variety of variables, information, data, objective variables to be able to understand which ones were common characteristics or seem to be strong signals. Like, if a team had this type of achievement, then they were in which then, through modeling, allows you to start to understand which of these variables may be deemed more valuable than another one or more influential toward a team’s ability to get into this top four rankings?

James:
You did say that better than me.

Jan-Eric:
All right, so this is interesting because if you start to understand this, it almost becomes then a predictive model to be able to say if this is how they’ve done it for the last six years, you could be gathering that same type of information in a predictive model on say this year’s teams and then you have a better idea maybe. You have the ability to predict who’s going to get in or who’s not going to get in. Is that fair?

James:
Yeah, that’s fair. I think the key thing there too is remember that the committee is made of 13 members that have a three-year term, so it’s ever evolving and there’s such a human element of it that those things may change. Some of them may be constant but the weight of them year to year can change and especially every three years will change.

Jan-Eric:
And that would be just due to the personnel on the selection committee but also-

James:
Due to the teams.

Jan-Eric:
Due to the teams that come in, you’re going to have your model and it’s going to evolve each year. And there’s no actual official document from the selection committee that says, these are our weighted criteria, this is objectively how it’s done. So your attempt is actually to bring a little bit more objectivity to what is maybe somewhat of a subjective analysis.

James:
I would like to think too, also hold them accountable. I mean, if they were to use something like this. But yeah, because I know a lot of my feel is there’s biases involved and things at play. The SEC is the only conference that has two members on the board. So there’s things that play in that I want to kind of tease out.

Jan-Eric:
Right. Okay. So how many different variables were you looking at for these teams as you’re trying to figure out what are the ones that signal strong, about how many variables?

James:
It was about 50 variables I looked at.

Jan-Eric:
Okay. So give me an idea of what some of those variables are and I’m really curious like what are the variables that are really important for a team that aspires to be one of the top four? What do you got to be good at?

James:
Right. So kind of as I said before, I looked at university specific enrollment. I looked at their conference association. I looked at financial, so their total revenue, their ticket sales, all the way down the list, coaching salary and so on. But I also looked at like strength of schedule, how they experienced their worst loss, how they beat teams on average, just a bunch of those actual game variables.

Jan-Eric:
So what are some of the variables that are most important or have the strongest signal?

James:
When I did the first model, just kind of the order, it was conference championship was number one, strength of schedule, how they experienced the largest loss. And I kind of got that from listening to pundits out there when they were talking about various teams, they’re always comparing how bad their loss was and then their-

Jan-Eric:
Sorry to interrupt you but when you talk about the experience of their loss, is that a combination of what was the margin of the point spread in their loss as well as the quality of the opponent?

James:
That’s just taken and not the quality of the opponent, just what their loss was, like incremental points.

Jan-Eric:
Gotcha. Gotcha. Okay. So they lose by one or they lose by 20 right?

James:
Right, exactly. And then in the first model, the last one, which fell off this time, was rights and licensing. So that would be like, if all things else were equal, they’re likely to pick the team that is more likely to generate more revenue, whether that’s TV or whatever that may be.

Jan-Eric:
That’s interesting. And that I would imagine is not being in the room, not being on the selection committee, I would imagine that might not be a conscious evaluation. It doesn’t seem like that would be relevant. However, it may be an indication of a team that has a strong fan base, is a highly-reputable good team that’s on TV, plays on national TV, frequently sells tons of merchandise because they’re well known and admired by a lot of people. So it’s not necessarily that this is a thing that the selection committee would say, “Hey, they’ve got a lot of merch they’re selling.”

James:
I don’t think they’re going to come out and say that. But so this year it kind of changed. We talked about it being ever evolving, a three-year term and whatnot. Conference champion is still number one, but then it went largest loss is second, then average point differential or how they were beating teams on average. And then the strength of schedule fell all the way to the bottom of that but still highly correlated.

Jan-Eric:
Interesting. Okay. So I guess this is one of the big questions. You’ve done this analysis, right? And you kind of need to wait until they reveal who the top four were and that happened in early December. Right? I guess here’s the big question. Did they get it right? So we had LSU was number one and Ohio State-

James:
The Ohio State.

Jan-Eric:
The Ohio State University was two, and then we had Clemson three and the Oklahoma Sooners were four. So according to your model, looking at the historical look, how’d they do?

James:
Yeah, so, they did okay. I will say they got three of the four teams. So my model read out and I will make a case for Ohio State here if I do anything. I have Ohio State number one, LSU number two, Clemson number three, and Oregon number four. Danny gets more of a national view of college football than I do on the kind of the field. But that’s kind of how they played out. So there was one team different in Oregon, but the ranking is also important.

Danny:
So I’m curious about this because Oklahoma this year, probably more than any other year, there was very little debate about who the top four were, Oklahoma being the last team in, so to speak. But you have Oregon instead of Oklahoma, which actually technically if you look at the bowl games, which you don’t usually do, Oklahoma got trounced and then Oregon looked great.

Danny:
So I do want to hear the pre-bowl case for Oregon at least based on your data.

James:
Sure, so kind of looking at those variables as they fell out in their importance. Oregon actually had a little bit higher strength of schedule at 13, Oklahoma was 18. The largest loss for Oregon was at 6 where Oklahoma was 7. But I think the way people viewed Oklahoma’s loss was that it was a really… In the fourth quarter, it was like 43 to 28, and K State was beating Oklahoma. And I think that’s kind of how they viewed that one at that time. But also Oregon was beating teams by about three points higher on average. So the model picked up on that.

Jan-Eric:
And so again, just to clarify, looking from a historical perspective, what the model would have indicated based on the last six years, Oregon would have been in instead of Oklahoma. But with Oklahoma going in, you then update the model. It would suggest that this year’s selection committee shifted a bit on what their emphasis was and what they favored versus what maybe historically in the past they favored. So what was more important this year than it would have been a year ago, that might’ve been what pushed OU into the playoffs?

James:
No, that’s a really good question. So the largest loss almost doubled in importance from the first time I ran the model. And so did the average point differential or how they were beating teams on average. Those things doubled while the strength of schedule is like a fourth of what it was in importance from the first time I ran that model.

Jan-Eric:
Interesting. So the strength of your schedule became less important. And the margin of victory or margin of loss if you experienced loss, became much more important.

James:
Correct. Correct.

Jan-Eric:
Interesting. So late touchdowns by Oklahoma in a game that they lost against K State where they were getting blown out late in the game, but scored some late points actually helped them.

James:
Correct.

Jan-Eric:
And so this is fascinating because if that plays out like that, what are seemingly meaningless points in a game in late October or early November, end up having huge implications of who goes into the playoff. So I guess every snap does matter.

James:
I think it does. As an Ohio State fan, yes.

Jan-Eric:
Oh, that’s fascinating. Do you anticipate that there will be significant changes like that in rank priorities moving forward? I mean, do you think those were anomalies, at that big of a shift? I mean, do you typically see something being in a fourth of the value that it was the prior year?

James:
Yeah, I don’t typically see shifts that big, but that tells you that this committee was thinking completely differently this year in how they did their selections. So it wasn’t like any other year in the past. But I will say this too, there’s one other thing we did as kind of a bonus. When we got our probabilities from our logit/probit model, we took the average distance between those probabilities. The smaller those distances are, the stronger the matchup, and the 2019 matchup is better than any other matchup in the last six years.
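The "average distance between those probabilities" idea can be sketched as follows. The exact definition is not given in the conversation, so this assumes the mean absolute gap over every pair of the four teams' predicted probabilities; the probabilities themselves and the function name `mean_pairwise_gap` are invented purely for illustration.

```python
from itertools import combinations

def mean_pairwise_gap(probs):
    """Average absolute difference over every pair of predicted probabilities."""
    pairs = list(combinations(probs, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

# Made-up playoff probabilities for two hypothetical four-team fields.
tight_field = [0.90, 0.88, 0.86, 0.84]     # teams rated close together
lopsided_field = [0.95, 0.90, 0.60, 0.35]  # one or two teams well behind

tight = mean_pairwise_gap(tight_field)
lopsided = mean_pairwise_gap(lopsided_field)
print(tight < lopsided)  # True: the smaller average gap marks the more even field
```

Under this reading, a season's score is small when the four teams are rated almost equally, which is the sense in which one year's field can be called stronger than another's.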

Jan-Eric:
So that indicates that the four teams are at parity.

James:
Yes.

Jan-Eric:
Gotcha.

James:
Besides Oklahoma.

Jan-Eric:
Yes. So that’s interesting because at this point as we’re recording this, we are in between the semifinal games and the national championship game. And the two semifinal games, one was a very competitive game, very, very entertaining game. If you don’t have a dog in the hunt, which I don’t, I realize that you’re decked out. We’re not recording this with video but you’re decked out in Ohio State Buckeye gear. They fought a great fight. It was a very good game.

James:
I appreciate you saying that.

Jan-Eric:
So that was a highly-competitive game. What’s interesting is that the other game wasn’t close at all. In fact, it was not as close as the score indicated. And how much did they lose by? What was the final score in that game? Do you guys remember? What was the spread?

James:
Let me see. I think it was a… I do have the numbers here. It’s 63 to 28.

Jan-Eric:
Yeah, and it wasn’t that close. So what’s interesting is that Oklahoma, the team that got crushed is a team that your model spit out, maybe shouldn’t have been there.

James:
Model’s cracked.

Jan-Eric:
And Oregon, who wasn’t in, but the model suggests they should have been, in the meantime was winning the Rose Bowl against a good team from Wisconsin. So that’s interesting.

Jan-Eric:
So again, what you were saying was that the tighter the differences in the overall scores of those top four, the more it would suggest that you have parity and we’re in for some entertaining football. And that certainly played out in one of the semifinal games, the game that didn’t include the team that the model would suggest maybe didn’t belong there.

James:
But Ohio State should have been one. So we shouldn’t even have been playing Clemson. We should’ve played Oklahoma. Let the record reflect.

Jan-Eric:
So when we talk about how the selection committee did, they got three out of four teams right. And the team that they had ranked… Was Oregon ranked fifth overall in the standings?

James:
They were ranked sixth.

Jan-Eric:
Sixth. So they were knocking on the door. So they got three of the four right. But the standings weren’t necessarily right, I think, according to the model.

James:
That makes all the difference too.

Jan-Eric:
So the model would have suggested Ohio State one followed by-

James:
LSU.

Jan-Eric:
LSU, followed by Clemson-

James:
Followed by Oregon.

Jan-Eric:
Followed by Oregon. Interesting.

James:
It would have been a happier day for me tomorrow.

Jan-Eric:
I know, I know.

Danny:
I’m looking back because you did this model for the past, what, six years or so that they’ve had the playoffs scenario?

James:
Yeah.

Danny:
So I’m seeing that the Pac-12 is kind of universally, or at least in the years where they’re relevant, underrepresented, Oregon obviously being overlooked by the committee and then Washington a little bit in the past too. So it looks like there’s possibly an overlooking of the Pac-12. Is that a trend that you’ve seen? Is there some variable about the Pac-12 that seems to be represented stronger in the data than it is the minds of the committee?

James:
I didn’t notice anything necessarily in the data. I think more anecdotally, I know you and I have talked about when people are actually viewing the Pac-12, they’re on the West Coast when most people in these areas and on the East Coast are in bed. So there’s a good chance they may not always be evaluated the same as the other teams.

Jan-Eric:
And that could be some of that kind of blind bias. So as we’re wrapping up, I’m curious, this model doesn’t predict the winners necessarily. That would be a separate analysis and difficult to do without having more champions from a historical data standpoint to strengthen it. But just curious as we wrap up, you guys have any predictions for what’s going to happen in the national championship game?

James:
I’m going to go with the fighting Joe Burrows, old OSU quarterback. LSU 48, Clemson 21.

Danny:
Yeah. I don’t think it’s going to be that strong of a victory. I do think LSU’s going to win but Clemson’s too good.

James:
Yeah. Well, we’ll see.

Jan-Eric:
I’ll tell you, I believe that the heart of the champion is something you got to take into account. And I have a hard time betting against Clemson. LSU’s got a heck of a team but I think, until somebody dethrones them, you got to stick with Clemson. So we’ve got a split jury here in the booth and that’s fine. It’s all going to play out on the field and this is fascinating. It’s great to take a break from some of the marketing analytics stuff that we talk about. This is a fascinating kind of look at the college football playoff.

Jan-Eric:
Thank you very much for coming and sharing with us.

James:
Thank you.

You’ve been listening to the Uncovering Aha! podcast. Callahan provides data-savvy strategy and inspired creativity for national consumer brands. Visit us at callahan.agency to learn more.