Podcast

The Aha Bowl: The hidden factors behind College Football Playoff rankings

College football is a fun topic to talk about this time of year, and it’s a great example of the role data can play in informed decision-making and planning, and how data analysis can determine day-to-day decisions, as well as solve more complex problems.

In this episode, we talk with our senior business analyst (also our number-one college football fan) about the statistical model he built in his spare time to investigate the college football ranking system. Learn how his approach – and his findings – could be applied to a business’s own marketing and analytics efforts.

Listen here (Subscribe on iTunes, Stitcher, Google Play, Google Podcasts, Pocket Casts or your favorite podcast service. You can also ask Alexa or Siri to “play the Uncovering Aha! podcast.”):

Welcome to Callahan’s Uncovering Aha! podcast. We talk about a range of topics for marketing decision-makers, with a special focus on how to uncover insights in data to drive brand strategy and inspire creativity. Featuring Jan-Eric Anderson and James Meyerhoffer-Kubalik.

Jan-Eric:
Hi, this is Jan-Eric Anderson, head of strategy at Callahan.

James:
And I’m James Meyerhoffer-Kubalik, senior business analyst at Callahan.

Jan-Eric:
So we are blessed on the podcast today to have James joining us. James works in our analytics group, and he’s a first-time guest in the podcast booth, so we’re thrilled to have him. James is here for a specific reason, ’cause we have a very timely topic we’re gonna talk about. We are here to talk about the Division I FBS college football playoffs and the data that goes into figuring out those rankings. James, you fired up for the playoff system?

James:
Well, considering my team’s not in this year, not as fired up as I could be.

Jan-Eric:
Yeah, so in full disclosure we have to let you know that James is a die-hard fan of the Ohio State University Buckeyes.

James:
Yes, the Ohio, you got it right.

Jan-Eric:
Okay. So you need to know that as a listener, full disclosure. He’s an unbiased fan, a biased fan but an unbiased statistician.

James:
Yes, yes.

Jan-Eric:
So what we’re here to talk about today is, James has done some very interesting work in data analytics to understand what exactly goes into the ranking system. So we should probably just explain really quickly, for listeners that may not be quite as familiar with the system, the FBS Division 1 college football system is based on rankings of teams. Which is a bit of a black box ranking system, where teams are ranked one through whatever, and the top four in the rankings at the end of the regular season go into the playoff, the two-round playoff, to figure out who is the national champion for the season.

James:
Correct.

Jan-Eric:
So James has done some work to try to understand what exactly goes into the methodology to figure out who gets ranked number one versus two versus three versus four. And really, most importantly, what distinguishes someone who didn’t get in versus the people that got in? Because it’s a really big deal for teams that get in versus the teams that didn’t. So other than your fandom for college football, what sparked your interest in wanting to do this?

James:
I think it’s just that. I think I had a couple years, last year as well, my team, the Ohio State University, didn’t get in. We saw an Alabama who didn’t go to a conference championship game just jump and get in over the Big Ten champion Ohio State, and it really got my wheels turning in regards to, what is exactly going on behind the scenes? As you talked about, that black box. What are they truly evaluating? Is it the guidelines that they set out that they have published, or is it something different? So it just kind of sparked a lot of curiosity, which kind of got me going into what models were out there. Looking at the models that were out there, they’re kind of inefficient. They were kind of a simulation, if you will, like here’s all your teams, 129, we’re gonna simulate every game based on pre-season information all the way to the end. So they’re very inaccurate, the margin of error continues to increase each week. And so I was gonna try to shore up those limitations of past research and just develop my own to see exactly what’s going on in that black box.

Jan-Eric:
And so the black box is managed by a selection committee of human beings.

James:
Correct.

Jan-Eric:
Who are brought in to, and their job is to serve as a board, a review committee, in essence, that publishes these rankings each week. And the one that matters the most is the one at the end of the regular season. So your mission was essentially to deconstruct and build back up what’s in their black box. You’re trying to guess, or predict, what’s in their black box based on the rankings. So as I understand it, your methodology, how you did this, was you looked at the final rankings for the past four seasons.

James:
Correct.

Jan-Eric:
And you then took a laundry list of variables that could possibly be influencing that, and tried to find matches. So talk a little bit about that process.

James:
Sure. So kind of as you said, what I started with was the base of individual teams: how they performed throughout the season, did they play top five, top 10, top 25, how they did in those matches, their margin of points if they won or they lost, all these taken into consideration. But I also took financial information into account, just to see if schools that pay their coaches more, on average, or have a higher recruiting budget, if this factored into how a team did, directly or indirectly.

Jan-Eric:
So were you assuming that, or wondering if those non-football stats like recruiting budgets and things like that, were you wondering if that factored into where they ended in the rankings?

James:
I just wanted to know if it influenced, like you can just kind of take an example of an Alabama compared to a University of Kansas, or KU, as they say. They’re gonna have a higher recruiting budget, so just to kind of understand if that factors into the whole equation. Not necessarily in regards to what the committee … but if that’s also correlated as well.

Jan-Eric:
Gotcha, okay.

James:
Just to kind of see if there’s some side note things that I can kind of see along the way of the journey. ‘Cause I’m already collecting all this data as is, I might as well just throw it in there to see if it factors in or not.

Jan-Eric:
So the variables that you were looking at, things like wins and losses, that seems like a fairly obvious thing. Probably quality of the opponent or the quality of a win.

James:
Right.

Jan-Eric:
The quality of a loss, the margin of victory or the margin of loss.

James:
Right, the strength of schedule.

Jan-Eric:
Whether it was played at home or on the road.

James:
Yep.

Jan-Eric:
Those types of things. So how many different factors were you looking at?

James:
Twenty-nine different variables.

Jan-Eric:
Oh, my gosh.

James:
So I won’t try to get into how I made the sandwich, but you can only include certain ones that are not correlated with each other, because it will blow up the model. So I knew that using some finance, I would essentially have to bring it in and back it out, just to see how different ones interact and the magnitude of each. And that’s kind of where I fell on one of them, which was rights and licensing, was to identify if that had a factor in this process.
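To illustrate the point about correlated variables, here is a minimal sketch of one common way to screen candidate predictors before fitting a model: compute pairwise Pearson correlations and flag pairs that move together too closely. The conversation does not say which check was actually used, so the function names, the 0.8 threshold, the column names, and the toy numbers are all assumptions made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    The threshold is an arbitrary illustrative cutoff, not a value
    taken from the analysis discussed in the episode.
    """
    flagged = []
    for (name_a, xs), (name_b, ys) in combinations(columns.items(), 2):
        r = pearson(xs, ys)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Made-up values for five hypothetical teams; not real data.
toy_columns = {
    "coach_salary": [1.0, 2.0, 3.0, 4.0, 5.0],
    "recruiting_budget": [1.1, 2.1, 2.9, 4.2, 5.0],
    "largest_loss": [3.0, 29.0, 7.0, 14.0, 10.0],
}

flagged = correlated_pairs(toy_columns)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} are highly correlated (r = {r})")
```

In this toy data the two financial columns track each other closely, so only one of them would typically be kept in the model at a time, which matches the idea of bringing a variable in and backing it out.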

Jan-Eric:
Interesting.

James:
Yep.

Jan-Eric:
So you bring in these factors, and you’re trying to match them back to the final rankings, and in essence, your goal was to build a model …

James:
Correct.

Jan-Eric:
Of the variables that stuck. So of those 29 variables, did they all correlate to the findings, or did some fall off?

James:
No, there was actually five that correlated. So I’ll just kind of tell you what they were and if the college football playoff included those in their guidelines as well. So conference championship was the greatest one, meaning the highest magnitude, the most important.

Jan-Eric:
So if you won your conference championship game, you were very, very likely to be in the playoff.

James:
Right.

Jan-Eric:
Okay.

James:
And just to kind of, since we’re in Lawrence, Kansas, to put that in basketball terms, if you’re playing in March Madness, winning your conference championship is comparable to winning the West bracket, so that’s kind of how I …

Jan-Eric:
Equivalent to going to the Final Four.

James:
Right, right. The next one was strength of schedule, so the quality of the opponent. I know you were talking about, that they played, so Alabama probably, in the SEC, has a higher strength of schedule than a KU who doesn’t play as … I don’t want to say that and offend … so they’re gonna have a higher strength of schedule that they have to go through at the end. So if they get through with a higher winning percentage, you know that they are more deserving, is kind of how to look at that one. Next I looked at how they beat teams, on average. So did Alabama beat teams by 30 points on average?

Jan-Eric:
So margin of victory.

James:
Right, margin of victory. The next one has been, it wasn’t in the guidelines, so the other two … sorry, I’ll backtrack. Strength of schedule was something that the college football playoff did mention as being important. But the other two, there’s nothing in there about the average points and how they’re beating somebody, and their largest loss. So I know that was highly debated this year, I’ll just say, how Ohio State got beat by Purdue by 29. That was one of the reasons, they called it evaluating the floor. And Ohio State’s floor was much lower than the other teams that had a loss.

Jan-Eric:
Interesting.

James:
So that was one of the reasons. So those were things that were more out in the open in the media that they were talking about, and I thought I’d evaluate those as well.

Jan-Eric:
So conference championships, strength of schedule, margin of victory.

James:
Correct.

Jan-Eric:
What else?

James:
Their largest loss.

Jan-Eric:
Gotcha, okay.

James:
So it’s looking at the incremental point of that loss.

Jan-Eric:
Gotcha.

James:
And then the final one was the rights and licensing.

Jan-Eric:
Interesting. So rights and licensing.

James:
Right.

Jan-Eric:
Which is not a football statistic. It is not an indication of the quality of the football team. That is a variable that correlates to likelihood of being in the college football playoff.

James:
Correct. So that’s kind of the way to look at that, is holding all else equal. So you have a KU versus Missouri, and everything else that I mentioned was equal. But the only thing, difference that they had, was that KU has a higher rights and licensing, on average, than Missouri, they’re gonna take KU.
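The "holding all else equal" idea can be sketched with a simple linear score over the five factors named in the conversation. This assumes a linear model, which the episode does not confirm, and every weight and team value below is invented for illustration. The only detail taken from the discussion is that the conference championship carries the largest magnitude.

```python
# Hypothetical weights; only their ordering (conference championship
# largest) reflects what was said in the episode.
WEIGHTS = {
    "conference_champion": 3.0,     # 1.0 if the team won its conference
    "strength_of_schedule": 1.5,
    "avg_margin_of_victory": 1.0,
    "largest_loss": -1.0,           # a bigger worst loss lowers the score
    "rights_and_licensing": 0.5,
}


def score(team):
    """Weighted sum of a team's (already scaled) factor values."""
    return sum(WEIGHTS[name] * team[name] for name in WEIGHTS)


# Two made-up teams that are identical on every football factor.
team_a = {
    "conference_champion": 1.0,
    "strength_of_schedule": 0.5,
    "avg_margin_of_victory": 0.25,
    "largest_loss": 0.75,
    "rights_and_licensing": 0.0,
}
# Team B differs only in rights and licensing, by one scaled unit.
team_b = dict(team_a, rights_and_licensing=1.0)

difference = score(team_b) - score(team_a)
print(f"Team A: {score(team_a)}, Team B: {score(team_b)}")
print(f"Gap from rights and licensing alone: {difference}")
```

Because every other input is identical, the gap between the two scores equals the rights-and-licensing weight times the one-unit difference, which is what "all else equal" means in a model like this.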

Jan-Eric:
So that’s a hard example, ’cause neither one of them have been in the college football playoff.

James:
Well, I didn’t want-

Jan-Eric:
But let’s talk about something like University of Central Florida.

James:
Sure.

Jan-Eric:
So I think it was last year, they finished the season undefeated.

James:
Correct.

Jan-Eric:
And did not get into the playoff.

James:
Correct.

Jan-Eric:
According to your model, should they have been?

James:
Yes, they should have. They should have got in over Alabama by a considerable amount.

Jan-Eric:
According to data, this is not coming from the Ohio State University fan.

James:
No.

Jan-Eric:
According to the data …

James:
I would have influenced Ohio State.

Jan-Eric:
UCF should have been in.

James:
Correct.

Jan-Eric:
Instead of Alabama.

James:
Correct. And kind of the thing there is that UCF was the conference champion and Alabama wasn’t, so regardless of everything else as it fell out with those other variables, it still wasn’t enough to overcome the fact that they weren’t conference champion. So in the rankings they were actually ranked below UCF.

Jan-Eric:
That’s interesting.

James:
Yeah, and that wasn’t the first time. So my-

Jan-Eric:
Yeah, so the debate, the national debate, though, is this powerhouse Alabama, and no one’s questioning whether or not Alabama’s good. But everybody questioned whether or not UCF was worthy of being in the college football playoff. But the data doesn’t lie.

James:
Data doesn’t lie.

Jan-Eric:
They should have been there.

James:
They should have been there.

Jan-Eric:
Okay. Is that the only exception that you saw in your analysis?

James:
So in 2014, the first year of the college football playoff, my analysis showed that TCU should have got in over Oregon. And to just kind of set the background there, TCU had one loss, but it was to number three-ranked Baylor at the time, or number five-ranked Baylor, by three points, where Oregon-

Jan-Eric:
Good loss.

James:
Yeah, it was a good loss, right? But Oregon lost to an unranked Arizona … I can’t remember, it was quite a bit more, I don’t know if it was 13 or 15. But Oregon made it in over TCU.

Jan-Eric:
And were both conference champions that year?

James:
Yes.

Jan-Eric:
Interesting. That’s interesting. And Oregon clearly would have had the higher merchandising and licensing ceiling.

James:
It’s [inaudible 00:11:46].

Jan-Eric:
Yeah, that’s a no-brainer. That’s really interesting. So have you shared this information with anybody? The folks at UCF?

James:
I have, I have. I haven’t heard back, but I know right now there’s a national debate with the AD from UCF and some SEC commissioner, I don’t know the names. But I went ahead and-

Jan-Eric:
The SEC commissioner from the athletic conference.

James:
Yes.

Jan-Eric:
Yeah, not financial-

James:
Yes, I should have clarified that.

Jan-Eric:
Right. So yeah, I would imagine you sending down analysis like this might be opening old wounds at UCF.

James:
Yeah, I didn’t think about it like that. I just thought it’s a helping hand, not trying to open up past wounds.

Jan-Eric:
Right, that’s interesting. Out of curiosity, I can’t remember, how did Alabama do last year in the playoff?

James:
They won it.

Jan-Eric:
Did they win the national championship?

James:
Yeah.

Jan-Eric:
That’s interesting. What do you make of that as a statistician?

James:
You know …

Jan-Eric:
Were they worthy of being there or not?

James:
Well, the model says they weren’t, but you know, on the field they clearly were.

Jan-Eric:
Yeah, well, you know, it’s an imperfect science. And the reality is, there are many teams capable and worthy of being there, but when you gotta draw a line in the sand, it’s good to have methodology.

James:
Correct.

Jan-Eric:
And I guess kind of bringing this back, I think there are probably some parallels to this, right? So obviously this is a very timely conversation. The college football playoffs are about to start, they’re gonna start on December 29. But back to your day job and how we work together here at Callahan, just thinking about analytics in a marketing world, this is a good example … it’s a fun topic to talk about, college football, but it’s a good example of really the role that data plays. And an analysis like this can play even in our day-to-day jobs. I think that it’s a good example of how data can be used to solve complex problems. It’s not easy to figure out who should be the four that we take. ‘Cause there are a lot of worthy teams. Alabama was worthy last year, but it’s a complex problem to have to try to solve. So data can be used to help solve complex problems.

James:
Absolutely.

Jan-Eric:
Another thing would be that data can be used to ensure accountability. So when you have to have methodology to how you’re gonna make decisions and be able to justify decisions, data can be used to ensure accountability, that we’re following a set process. And I guess that tees up the third thing, is that data can be used to inform decisions. So when we have to make a decision of, go this route or go that route, when data can be used to help inform those decisions, it seems like this is a perfect metaphor for that.

James:
Right.

Jan-Eric:
Solving complex problems, ensuring accountability, and informing decisions.

James:
Absolutely.

Jan-Eric:
Well, do you have any prediction for what’s gonna happen this year? So in this year’s playoff we have Oklahoma going up against …

James:
Alabama.

Jan-Eric:
Alabama, and then we’ve got …

James:
Notre Dame versus …

Jan-Eric:
Notre Dame and Clemson in the other matchup. Any … care to predict?

James:
Yeah.

Jan-Eric:
And you have not run this year’s stats through the model, correct?

James:
Correct. So you could, in turn, use this analysis on this year, but I don’t have financial data, and that’s not published until July for each university.

Jan-Eric:
Gotcha. So it’d be a lagging analysis.

James:
Right.

Jan-Eric:
So with not having the benefit of the statistics, do you have any predictions for us before we wrap this up?

James:
I think it’s gonna be another Clemson versus Alabama, unfortunately.

Jan-Eric:
And?

James:
Ohio State.

Jan-Eric:
Oh, the ultimate politician. He won’t pick somebody. Well, we’ll have to let it play out, I guess, on the field. James, thanks for joining us today. Hopefully everyone’s enjoyed this timely topic around college football, and again, a great metaphor for how data can be used to help solve complex problems, ensure accountability, and help inform decisions. Thanks for joining us.

James:
Thank you.