“With as easy as it is to do machine learning today, as long as you have good data, there is almost no reason everyone shouldn’t be offering that level of personalization,” explains Zack Pike, head of data at Callahan. What is machine learning and how can a brand get started using its current data set?
In this podcast, Jan-Eric Anderson, president at Callahan, and Zack Pike discuss some of the basics around machine learning including:
- Definition and examples of how machines learn
- Brands currently utilizing machine learning
- How to get started
Welcome to Callahan’s Uncovering Aha! podcast. We talk about a range of topics for marketing decision-makers, with a special focus on how to uncover insights in data to drive brand strategy and inspire creativity. Featuring Zack Pike and Jan-Eric Anderson.
Jan-Eric Anderson:
Hi, I’m Jan-Eric Anderson, president of Callahan.
Zack Pike:
And I am Zack Pike, head of data at Callahan.
Jan-Eric Anderson:
Welcome back to the podcast, Zack. Today, I’d like to pick your brain on a topic that’s all the rage right now. It’s a very popular term that gets thrown around and personally, I just need to understand it a little bit better. And the topic is machine learning.
Zack Pike:
Yep.
Jan-Eric Anderson:
So there’s been a lot of discussion around machine learning and I think it would be beneficial for me before we get any further into this discussion is just to … If you could help me understand what it is. Can you define for me, what is machine learning?
Zack Pike:
Yeah. So it’s hard, because machine learning is used to describe a lot of different things. And let me start with a little bit of the genesis of why it’s even here. And I think that’ll help explain what it is and kind of how it happens. So if we think back 10 years ago, data started to get so large that it was hard for humans to actually analyze all of it. Right? So if you think about Netflix, there are so many videos being watched on Netflix every day and every hour that it’s too much for a human to even comprehend. So when you put that data in front of an analyst, it’s like, gosh, there’s so much we could do with this, we could never work our way through it.
Zack Pike:
So that’s when people smarter than me started figuring out, okay, well, machines are really good at working through large amounts of data really quickly. Is there a way we can get a machine to help us with this problem? And so that’s where it all started. And really, machine learning’s objective is to make predictions. Okay. So I have a problem. I need to be able to see into the future on how to solve this problem or figure out what’s going to happen if I do X. So we can use machines to do that. So at its core, machine learning’s goal is to predict if this decision happens, what is the likely outcome.
Zack Pike:
And the way you get there is with data. So a machine learning algorithm takes a big set of data. It builds rules and labels around that data. Usually there’s a training dataset that is like, here’s your starter rule-set. And then as time goes on, the machine’s reading that actual data and using that initial rule-set to make further predictions on the data. The coolest thing about machine learning though, is that it reads the results and then it makes everything smarter behind that. So that training dataset just grows over time and makes those predictions smarter on the backend.
Jan-Eric Anderson:
Okay. So getting back to this machine learning, you defined machine learning as driving prediction?
Zack Pike:
Mm-hmm (affirmative).
Jan-Eric Anderson:
Basically it’s a computer or technology, some sort of machine or technology, that’s analyzing data to answer a question or to predict some sort of outcome if a decision were to be made. It’s predicting some sort of decision. So it’s a decision-making technology that can replace humans?
Zack Pike:
Exactly.
Jan-Eric Anderson:
In labor, essentially.
Zack Pike:
Today, it’s still reliant on humans to get started. So when I mentioned that training dataset, that’s basically giving the machine initial rules to make decisions around. But from that point forward, as long as it’s done correctly and you’ve got clean data, the machine is what gets smarter on its own. That’s the learning aspect of machine learning. How about we talk about just a really simple example, I think that’ll make it a little bit more concrete.
Zack Pike:
If I have a bunch of pictures and I want to determine if there’s a cat in the picture or a dog in the picture, and that’s all I want to do. I want the machine to predict, okay, this picture has a cat. This picture has a dog. You start with, let’s say a hundred pictures, and a human has gone in and tagged each of those pictures with cat and dog. So you feed that into the algorithm. The machine now says, “Oh, okay. I can look at all these pictures and figure out the attributes inside the picture that tell me if it’s a cat or a dog.”
Zack Pike:
It could be the shape of the animal. It could be the color. It could be the way the fur comes through in the picture and then it takes that dataset. So then you start feeding it new pictures that it hasn’t seen before. The machine, the algorithm, then says, “Okay, I’m going to look at all my training data,” and say, “Okay, I have an 80% confidence that this picture is a cat.” And then it’ll give that result. And the human can then come in and say, “Oh, okay, machine, you were right,” or “You were wrong.”
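As a rough illustration of the cat-versus-dog example, here is a minimal Python sketch. It assumes each picture has already been reduced to two made-up numeric attributes (`ear_pointiness` and `snout_length`, both hypothetical), and it uses a simple nearest-neighbour vote rather than whatever method a production image system would use. The confidence is just the share of the closest tagged pictures that agree with the winning label.

```python
from collections import Counter
from math import dist

# Hypothetical hand-tagged training set: (ear_pointiness, snout_length) -> label.
# Real systems derive attributes from pixels; these numbers are invented for illustration.
TRAINING = [
    ((0.90, 0.20), "cat"),
    ((0.80, 0.30), "cat"),
    ((0.85, 0.25), "cat"),
    ((0.70, 0.35), "cat"),
    ((0.30, 0.80), "dog"),
    ((0.20, 0.90), "dog"),
    ((0.40, 0.70), "dog"),
    ((0.60, 0.50), "dog"),
]


def predict(features, training=TRAINING, k=5):
    """Return (label, confidence) from a vote among the k closest tagged pictures."""
    nearest = sorted(training, key=lambda example: dist(features, example[0]))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / len(nearest)


def record_feedback(features, true_label, training=TRAINING):
    """Fold a human's right/wrong answer back in, so the training set grows over time."""
    training.append((features, true_label))


label, confidence = predict((0.75, 0.30))
print(f"{label} with {confidence:.0%} confidence")  # cat with 80% confidence
```

Calling `record_feedback` with the human's answer adds that picture to the tagged set, so later predictions draw on more examples.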
Zack Pike:
And then the machine looks at that right or wrong answer from the human, and now it makes itself smarter moving forward. So it’s learning as it’s going. And you run a model like that for a year and it gets really smart in the end. Right? It gets to where-
Jan-Eric Anderson:
It no longer has to be taught, right? So it follows a similar teaching model, education model, the traditional education model as teacher and student, teacher and pupil.
Zack Pike:
Yeah. Yeah.
Jan-Eric Anderson:
The same thing applies here, the pupil is a machine.
Zack Pike:
Yep. Yep, exactly.
Jan-Eric Anderson:
So to your cat and dog example, I don’t remember what age I was when I was taught the difference between a cat and a dog. But I don’t know how I was taught to know the difference between the two, other than the fact that they look different.
Zack Pike:
Right.
Jan-Eric Anderson:
So in this example, you can show a hundred pictures of a cat and a hundred pictures of a dog. Just like in a human mind, I’m looking at the shape of a cat’s face.
Zack Pike:
Mm-hmm (affirmative). Yep.
Jan-Eric Anderson:
And the shape of the ears.
Zack Pike:
Yep.
Jan-Eric Anderson:
Consistency of the shape of the ears and the nose and the space between the eyes and the length of the tail.
Zack Pike:
Yep.
Jan-Eric Anderson:
And the general body. And the body language that you get from a cat.
Zack Pike:
Yeah.
Jan-Eric Anderson:
And the way that they walk different. All of these visual cues, nobody taught me that’s the difference between a cat and a dog.
Zack Pike:
Mm-hmm (affirmative). Right.
Jan-Eric Anderson:
I observed this over time. Machines will make the same observations without having to go … A human doesn’t have to go in and tell the machine, “Look, they walk different” or “Look, the eyes are separated and have different separation than that in a dog.”
Zack Pike:
Yep.
Jan-Eric Anderson:
“The tail shape is consistent.” You don’t have to tell the machine that. The machine will figure that out on its own. And at some point, you can stop teaching the difference between cat and dog because the machine has learned it and probably knows more about the differences between cats and dogs than the average human would.
Zack Pike:
Yeah. Yeah.
Jan-Eric Anderson:
Is that fair?
Zack Pike:
That’s fair. And the real benefit of that is, once you’ve got something that’s smart enough to make those decisions, you can feed it a million pictures in an hour and it will make all those decisions for you. That would never in a million years be feasible for a single human to do. So, if you start thinking about applications of that, Google Photos is a really good example. They’re using machine learning to determine who’s in your pictures, right?
Zack Pike:
So I have all of my family’s pictures sitting in Google Photos, and I can search for my daughter and it will pull back every picture that has my daughter in it. I didn’t go in and tag all those. Right. Google took a few photos that it knew my daughter was in, and then extrapolated that out to say, “Okay, well, she’s also in these other photos because I’ve taken the shape of her face” and all the stuff you mentioned on what we look for in photos.
Jan-Eric Anderson:
So that’s a cool example of machine learning that’s out there. Give me a few more examples that are probably widely known examples that people will be very familiar with of machine learning, but maybe haven’t thought of it as machine learning. So give me an example.
Zack Pike:
There’s a couple that probably everyone has interacted with and that’s Netflix’s recommendation engine and Amazon’s recommendation engine. Both of those are really good examples of machine learning. And everyone is doing this. If there’s a recommendation engine on a website, if it’s like, “Hey, people who are interested in this product also are typically interested in this product.” Those are examples of machine learning. Now, the interesting thing about those is they started out as just a simple set of rules.
Zack Pike:
So before machine learning, it was, “Hey, when someone watches The Office, they’re probably also interested in Shark Tank. So I, as the programmer, am going to write that rule into this little application I have running.” But over time, especially with a dataset like Amazon or Netflix, where there’s, I don’t even know what, the numbers are so large. It’s just crazy. You apply a machine to that, and now the machine can change those recommendations in real time. So if there’s a new show that comes out, that watchers of The Office tend to be really interested in, it no longer takes a person to go in and recode the recommendation engine.
Zack Pike:
The machine can say, “Oh, geez, well, they’re not as interested in Shark Tank anymore. They’re interested in this new series that just came out. So now I’m going to start making that recommendation.” YouTube is another really good example. This is how people get hooked. You start a YouTube video and you’re sitting there and it’s like, “Geez, I’ve been sitting here for an hour, just cycling through different videos.” And it’s because the machine is really good at figuring out what you’re going to watch next, based on what other people have done. If a human was trying to do that, you would need an army of people sitting there all day, every day, rewriting things to make it as smart as it is.
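The shift from a hand-written rule to a recommendation that updates itself can be sketched in a few lines of Python. This is a minimal co-occurrence count under stated assumptions, not how Netflix, Amazon or YouTube actually build their engines: the viewing histories are invented, and "New Series" is a placeholder title.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical viewing histories, one set of titles per viewer.
HISTORIES = [
    {"The Office", "Shark Tank"},
    {"The Office", "Shark Tank"},
    {"The Office", "New Series"},
]


def build_co_watch_counts(histories):
    """Count how often each pair of titles shows up in the same viewer's history."""
    counts = defaultdict(Counter)
    for watched in histories:
        for first, second in permutations(watched, 2):
            counts[first][second] += 1
    return counts


def recommend(title, histories):
    """Return the title most often watched alongside `title`, or None if there is no data."""
    ranked = build_co_watch_counts(histories)[title].most_common(1)
    return ranked[0][0] if ranked else None


print(recommend("The Office", HISTORIES))  # Shark Tank

# As more viewers of The Office pick up the new show, the answer changes
# without anyone rewriting a rule.
newer_histories = HISTORIES + [{"The Office", "New Series"}] * 2
print(recommend("The Office", newer_histories))  # New Series
```

Nothing in `recommend` names a specific show. The recommendation comes entirely from the data, which is why it can follow viewers' tastes as they change.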
Jan-Eric Anderson:
Well, you’re starting to go down a path. We could take this conversation down another path, which is how this contributes to the idea commonly known as the echo chamber. And feeding you more of what you want to hear and how then there’s a lack of diversity of ideas and content that you might be exposed to.
Zack Pike:
Yeah. Mm-hmm (affirmative). Yep.
Jan-Eric Anderson:
Let’s not go down that path right now, although let’s reserve the right to come back to it at some point.
Zack Pike:
Yeah.
Jan-Eric Anderson:
Because I think that’s an interesting outcome.
Zack Pike:
Yeah.
Jan-Eric Anderson:
Or ramification of machine learning, but let’s get back to why machine learning is such a big deal right now. And you’ve started to talk about it, but let’s just drive right through it. The amount of data that’s being produced, hey, this is not new news, right?
Zack Pike:
Right.
Jan-Eric Anderson:
Gobs of data being created, but captured and analyzed is one thing that feeds this beast.
Zack Pike:
Yep.
Jan-Eric Anderson:
The other thing is the opportunity for personalization of experiences. So examples you were just giving of Amazon and Netflix, in my household, we have Netflix, but we have five different profiles on Netflix.
Zack Pike:
Right, yeah. Right.
Jan-Eric Anderson:
Right? And so my 14 year old son has a different profile setup than I do. And so his feed is different.
Zack Pike:
Yep.
Jan-Eric Anderson:
So in order for Netflix to be able to customize things and make that customized experience.
Zack Pike:
Mm-hmm (affirmative).
Jan-Eric Anderson:
And then you think about the millions of users that they have, and then however many profiles there are within the accounts that are established. There’s no way to scale personalization.
Zack Pike:
Right.
Jan-Eric Anderson:
So machine learning is a requirement if Netflix wanted to capitalize on a consumer expectation for personalization.
Zack Pike:
Yep.
Jan-Eric Anderson:
And the data is certainly there, right?
Zack Pike:
Yep.
Jan-Eric Anderson:
So really, if you’re wanting to move toward and wanting to create any sort of personalized experience, machine learning is … and you want to do it at scale, machine learning is really your only option.
Zack Pike:
Mm-hmm (affirmative).
Jan-Eric Anderson:
Am I right?
Zack Pike:
Yeah. And there’s some, well, let me say this. To create personalization the way it should be done, right. So I’m sure there’s a lot of people out there that have personalized website experiences that are really just like five different personas they bucket people into. A place like Netflix, your son may be getting the only version of that experience. There may not be anyone else on Netflix that’s getting his recommendations. Which is crazy to think about, right?
Jan-Eric Anderson:
Yeah.
Zack Pike:
The other side of that is, and this is actually, people have been saying this for a while. With as easy as it is to do machine learning today, as long as you have good data, there is almost no reason everyone shouldn’t be offering that level of personalization. If you have a large website and you have your customers log in and you know some stuff about those customers or users, everyone should be getting their own personalized experience. Again, Amazon is another example. When I log into Amazon, I see something totally different than what you’re seeing.
Jan-Eric Anderson:
No two Amazon screens look the same.
Zack Pike:
Right.
Jan-Eric Anderson:
And you’re right. And you add in this extra layer of complexity on an ad supported platform.
Zack Pike:
Mm-hmm (affirmative).
Jan-Eric Anderson:
Because Netflix is not ad supported, but Facebook with its billions of users worldwide, no two feeds are the same.
Zack Pike:
Yep. Right.
Jan-Eric Anderson:
And it’s not just because of different friends.
Zack Pike:
Yeah.
Jan-Eric Anderson:
But because of different sponsors coming in and different content being targeted at someone.
Zack Pike:
Yeah.
Jan-Eric Anderson:
There is no uniform broadcast method of doing things.
Zack Pike:
Mm-hmm (affirmative).
Jan-Eric Anderson:
Everything is personalized.
Zack Pike:
Right.
Jan-Eric Anderson:
So this is fascinating. So, I’m a marketer and I’m trying to figure out how do I apply this? Can you tell me … Or I see application for this in my line of business, for sure. What are the requirements for me to get started on this? What does it take? I mean, I don’t know how to make an algorithm.
Zack Pike:
Right.
Jan-Eric Anderson:
Where do I start?
Zack Pike:
Right. So first, let me back up just a little bit. What I see a lot is people coming in saying, “I want to do machine learning.” And that shouldn’t be the place we start. Right? It’s what you said, which is “I see application for this in my business.” So we need to find problems that can benefit from having better predictions around them. Recommendations to customers, churn modeling, forecasting. Stuff like that where if I was smarter about the result, I could make this whole problem either go away or better or drive a better customer experience. So that’s the first step.
Zack Pike:
After that, the question is, do I have the data to make an accurate prediction? Usually it requires quite a bit. If you’re running a really small website with not very many users, machine learning probably isn’t necessary. It’s when you have large amounts of data, that it’s hard for a human to really comprehend where this stuff gets valuable. So having the data, but then the next step is having clean data. So what often happens is we get data on whatever we’re trying to do. So let’s talk about an example.
Zack Pike:
Let’s say I’m trying to predict my customers who are going to have the highest lifetime value. Because I want to nurture those customers differently than I nurture other customers. If you’ve got a website with millions of customers, it’s hard for an analyst to build something around that. So that’s a really good use of machine learning. You will have the data you need to build a machine learning algorithm against. What will probably be a problem is it won’t be clean. And what I mean by clean is you’ll have a lot of just messiness in that data of user behavior that is never going to inform if they’re going to be a high lifetime value customer.
Zack Pike:
So that’s where good data science people come into play, good data engineering people. This is not something where you’re just taking a big dataset and pushing it into a machine and it’s doing everything. If you’re going to do this right, you’re going to need at least a really smart data scientist who has some experience in machine learning. And then probably some form of data engineer who can help get that data cleaned, prepped, and build a pipeline for it to flow automatically. Because you don’t want to just do this once; you want to do it ongoing and feed it into that model.
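To make the idea of "clean" data concrete, here is a small Python sketch of a cleaning step that might sit in front of a lifetime-value model. Everything in it is an assumption for illustration: the field names (`order_id`, `customer_id`, `event`, `amount`), the sample rows, and the choice to treat non-purchase events as noise. A real pipeline would make those decisions with a data scientist and run on a schedule rather than once.

```python
# Hypothetical raw event log; field names and values are invented for illustration.
RAW_EVENTS = [
    {"order_id": "o1", "customer_id": "c1", "event": "purchase", "amount": "40.00"},
    {"order_id": "o1", "customer_id": "c1", "event": "purchase", "amount": "40.00"},  # duplicate row
    {"order_id": "o2", "customer_id": "c2", "event": "purchase", "amount": "n/a"},    # unparseable amount
    {"order_id": "o3", "customer_id": None, "event": "purchase", "amount": "15.00"},  # missing customer
    {"order_id": None, "customer_id": "c2", "event": "scroll", "amount": None},       # behavior noise
    {"order_id": "o4", "customer_id": "c2", "event": "purchase", "amount": "25.50"},
]


def clean(events):
    """Keep one well-formed purchase record per order and drop everything else."""
    seen_orders = set()
    cleaned = []
    for event in events:
        # Assumption: only purchases inform lifetime value in this sketch.
        if event["event"] != "purchase" or not event["customer_id"]:
            continue
        try:
            amount = float(event["amount"])
        except (TypeError, ValueError):
            continue  # amount is missing or not a number
        if event["order_id"] in seen_orders:
            continue  # the same order was logged more than once
        seen_orders.add(event["order_id"])
        cleaned.append({"customer_id": event["customer_id"], "amount": amount})
    return cleaned


def total_spend_per_customer(cleaned):
    """Aggregate cleaned purchases into one spend figure per customer."""
    totals = {}
    for row in cleaned:
        totals[row["customer_id"]] = totals.get(row["customer_id"], 0.0) + row["amount"]
    return totals


print(total_spend_per_customer(clean(RAW_EVENTS)))  # {'c1': 40.0, 'c2': 25.5}
```

Of the six raw rows, only two survive. A per-customer figure like this is the kind of prepared input a model would then be trained on.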
Zack Pike:
So those are kind of the table stakes to really start getting into this. Now, a place like Facebook, Netflix, they have probably hundreds of data scientists there that are working on this type of stuff. So once you solve those problems, oddly enough, the actual act of running the machine learning model is getting easier every day. It used to be the data scientists would need to write their own model. They’d have a lot of work in figuring out all the pitfalls inside of it and everything. Today, places like Google are offering prebuilt models that you can feed your data into. It’s already a model Google has proven out.
Zack Pike:
Their data scientists have tested it and found all the problems inside of it. And you know what it’s going to do. You can feed your data into there, let it run, and it will spit back your predictions for you. So, we’ve done some work on some machine learning projects here. And 90% of our work was in getting the data ready. It wasn’t actually in running the model. Because we’re a Google Cloud customer, we use a lot of their prebuilt algorithms. So that’s kind of the way to go about this problem.
Jan-Eric Anderson:
I really appreciate all your perspective on this as we’re wrapping up. But it seems like machine learning is not going anywhere.
Zack Pike:
Yeah.
Jan-Eric Anderson:
And if anything, it’s going to continue to evolve and become more sophisticated. We’ve had conversations about machine learning being kind of single tasked and single focused on making one set of predictions. Likely what’s to come is the advancement of machine learning to be able to make multiple types of decisions or take in a lot of other types of variables.
Zack Pike:
Yeah.
Jan-Eric Anderson:
And it becomes more complex and more applicable then, more broadly speaking. So it’s not going anywhere because the creation of data isn’t going anywhere. The need for and opportunity around personalization and customization for different experiences is not going anywhere. This is something that’s here to stay. And so I really appreciate your perspective on this because it’s a topic that a lot of people probably need to figure out if they haven’t already gotten going on this.
Zack Pike:
Yeah.
Jan-Eric Anderson:
And it’s applicable, probably in most corners of the world, it’s applicable.
Zack Pike:
Yeah.
Jan-Eric Anderson:
It’s just a question of how do we get going on it. So thanks for your input on this. And tell you what, let’s make a date to come back and talk about some of the other implications that come along with machine learning.
Zack Pike:
Yeah.
Jan-Eric Anderson:
And creating things like the echo chamber that we were talking about. I think that’d be another fascinating conversation to have. So any other parting thoughts before we wrap up?
Zack Pike:
The only thing I’ll say is that machine learning often is like this unicorn thing. People talk about it like it’s solving all of the problems and it’s not. So we have to take it with a grain of salt. Machine learning is used everywhere, not just in marketing and business. It’s used in engineering and construction. And one interesting aspect is the financial markets. So the financial markets are a really good example of where machine learning models aren’t so exciting. It’s not yet to a point where machine learning makes better decisions than humans, as far as picking stocks that are going to grow.
Zack Pike:
So if you take all the machine learning models that have been written in the stock market and all of the financial, the people who have made decisions, looking at hedge funds, they perform the same. So there’s no benefit one way or the other yet. And there’s a lot of randomness in there that drives that. But just keep that in mind. It’s not a total silver bullet yet, depending on the application that we’re going after.
Jan-Eric Anderson:
That’s a great point. Hey, thanks for your knowledge and perspective on this, it’s very helpful.
You’ve been listening to the Uncovering Aha! podcast. Callahan provides data savvy strategy and inspired creativity for national consumer brands. Visit us at callahan.agency to learn more.