The Numbers Game - How Data Is Changing Football | FourFourTwo Documentary

  • 8 months ago
You've spent a small fortune on the squad, got state-of-the-art facilities and employ a dietician and a psychologist - but today you'll win nothing without an analytics team to crunch the numbers from every aspect of your players' performances. We do the maths so you don't have to...
Transcript
00:00 [Chanting]
00:07 Clubs at every level of the football pyramid are becoming smarter and more efficient.
00:13 How? The use of data.
00:16 To hell with conventional wisdom. The way we've been doing it, it's not been working.
00:21 Analysts are now recording data from thousands of actions, during games and training sessions,
00:26 to help shape pre-match preparation and post-game debriefs,
00:30 pinpoint transfer targets and develop young talent.
00:33 The genie is out of the bottle, I don't think it's going back in.
00:35 We may know more about the opposition than they actually know about themselves.
00:39 The growing use of analytics in football has attracted criticism and cynicism.
00:44 These are athletes, they're not spreadsheets.
00:47 Battle lines have been drawn between the analysts and the traditionalists.
00:53 Can football be translated into numbers by data bots?
00:56 Or does it require special insight from real football men?
01:00 In 2002, one of the most unfashionable teams in Major League Baseball, the Oakland Athletics,
01:17 defied the odds to go on a record-breaking 20-game winning streak.
01:22 Their success was powered by a new approach to player recruitment, sabermetrics.
01:28 Well it started, I'd played for 10 years professionally,
01:31 and so when I stopped playing I entered the front office and I started reading this stuff.
01:35 Again, the baseball academics, it made sense to me.
01:38 And I had my own experience with which to look at both sides.
01:41 I came from a traditional baseball background as a player,
01:44 and I was reading this new stuff that sort of put player performance in order for me.
01:49 It was very rational.
01:52 I could see why a baseball team was good.
01:55 You could look at numbers and explain why they were good,
01:57 instead of sort of looking at things anecdotally
01:59 and trying to use non-quantifiable reasons to apply success.
02:05 We were one of the smallest teams in the league, we were actually losing money.
02:10 But it also created a great platform.
02:13 It meant that if we just did things the same way the New York Yankees,
02:17 aka the Manchester United did,
02:19 we were destined to finish where our player wages said we should.
02:23 If you had the lowest payroll, you were probably going to finish in last.
02:27 So we had the opportunity, because we had nothing to lose,
02:30 to implement something differently.
02:32 The success of the Oakland A's encouraged sports teams around the world
02:35 to replicate the model pioneered by Billy Beane.
02:38 Early adopters believed the Moneyball approach
02:41 could give them an advantage over their competitors.
02:44 We knew it worked on individual players, and we were able to apply it to the whole team.
02:49 We won four division titles, three division titles and a wildcard,
02:53 and averaged almost 96, 97 wins per year.
02:57 So we had immediate success.
03:00 But the biggest thing, the most important thing,
03:02 is we understood why we were successful,
03:05 and we understood where we went wrong.
03:07 I mean, the numbers would show us.
03:09 Billy Beane had the huge luxury of not looking at relegation.
03:13 If you don't have to look at relegation, you can try all kinds of stuff.
03:16 Analytics and big data are driving the strategies of major corporations around the world.
03:21 And these methods are now filtering into football,
03:24 from the boardroom to the boot room.
03:26 Football clubs over the last 10, 15 years
03:29 have had to deal with a technological revolution.
03:32 What that's meant is they've now started to collect,
03:35 through third-party vendors, lots and lots of data on football.
03:39 And those data, primarily, originally were collected for fans and for media outlets to use.
03:44 They've made their way into the clubs themselves,
03:47 and now you have football departments that have to contend
03:50 with what's kind of an avalanche of information.
03:53 Sports data is basically a reconstruction of the match.
03:56 Okay, so why do we collect data?
03:58 It's basically so we can tell a story of how the match is played.
04:01 And so you can look through it in various lenses.
04:04 So you could have just event data, and how many passes and shots.
04:07 But as we know, football, it's not a great reconstruction.
04:10 But if we had the tracking data, so if you can see the dots run around,
04:14 we can basically reconstruct the game in a better way.
04:17 It's like having a scout at every game.
04:22 And not just having a scout at every game,
04:24 because we're collecting data on everything that the player is doing on the field.
04:28 It's like having a scout for every player in every game,
04:31 because everything they do is recorded.
04:33 Now it's not so much about collecting the data,
04:36 it's making sense of that data.
04:38 The stakes are high at the top of the footballing pyramid,
04:41 but lower down, one bad season can have catastrophic repercussions.
04:45 Small clubs with limited budgets can't afford to make a mistake.
04:49 To reduce the risk of acquiring a dud signing,
04:52 they're turning to Beane's sophisticated sabermetric approach.
04:56 So I like to try and get to the training ground as often as I can
04:59 and help out with the guys down there.
05:01 But a lot of the time I'm based here at Ecotricity,
05:04 with a full screen set up, I'm surrounded by a lot of energy traders,
05:07 and at times there's million-pound deals getting made,
05:10 and I'm sat here watching League 2 football and providing analysis,
05:13 so it's a pretty unusual workplace,
05:15 and probably quite different to a lot of analysts in the football league,
05:18 but it's good.
05:19 So here at Ecotricity I'm the Chief Operating Officer.
05:22 In energy trading we buy and sell energy, mostly buy,
05:25 to meet the needs of our customers on a day-to-day basis.
05:28 We're able to take a lot of the skills and the data analysis
05:32 that we undertake in the trading front of energy
05:35 into the world of football.
05:37 So we saw it as an opportunity to be creative in the data and analytical space,
05:42 and see if we can form a competitive advantage at a lower level.
05:48 It wasn't really necessarily about budgets,
05:50 but it was about trying to maximise what we can get
05:53 out of every single player that we recruit,
05:56 trying to bring together a list of players that is the best
06:00 from the manager's eye, and augmenting that with the data.
06:04 And also performance-wise, we just wanted to understand
06:07 all aspects of our performance.
06:09 So it's essentially doing the same thing in business
06:11 and taking that into the world of football.
06:14 The Billy Beane story is really originally a story about player recruitment
06:18 and finding inefficiencies in the market
06:20 on the back of going against conventional wisdom, really.
06:24 They use data to try and scout players,
06:27 try to find players that no one else wanted,
06:30 that were able to do things that would help the team win.
06:34 Manchester United and Burnley are very different clubs
06:36 despite the fact that they play in the same league.
06:38 And as a result, Burnley has to take a very different approach
06:41 to putting together a team than Manchester United.
06:44 There's a lot of money being spent,
06:46 but for the more mid-level clubs, there should be bargains available.
06:50 So if they're smart with the data,
06:52 and if they look through it with a certain lens,
06:54 they could be able to find some gems out there.
06:57 Yeah, this is where all your goals come from.
07:02 So a lot of them are in the six-yard box.
07:04 We get the ball to you in there, that's where you're dangerous.
07:07 The recruitment side for a small club, like you say, is really, really key.
07:11 And it's important that we're different.
07:13 In January, every club will be after the same players.
07:16 And probably we can't compete for those players that everyone's after,
07:20 so we have to find other types of players,
07:22 and we have a different way of playing.
07:24 And we have to find players that can fit into that,
07:26 and we have to use the data for that.
07:28 Yeah, I think the one I would definitely pick out is Christian Doidge.
07:31 He's been our top scorer last year, he's our top scorer this year.
07:36 I think he's second or third highest goal scorer
07:39 in the top six English leagues in 2017.
07:42 Yeah, I think Christian's done better than we envisaged,
07:47 but we knew that the basics were there.
07:49 We knew he could score goals.
07:51 We knew he got in the right positions on the pitch,
07:54 because his data showed that.
07:55 And it was then a case of us trying to work with him
07:58 how to convert those chances from the positions he got into,
08:02 which his data showed.
08:03 So that one is proof that the data works.
08:06 I mean, the value for money on that one,
08:08 to say we paid £30,000 for him,
08:10 he's worth an awful lot more than that now.
08:13 Tom, who looks after the data, I'll give him a list of targets.
08:18 He'll go through them and give us graphs in terms of their value,
08:22 what they're good at, what they're not good at,
08:24 what their metrics are in terms of if it's a striker, goals, expected goals.
08:28 As analytics evolves, new metrics arrive,
08:31 and some are more widely accepted than others.
08:34 Expected goals is one example of such a seemingly divisive tool.
08:38 So what exactly does it mean?
08:41 It's a measuring tool of the probability of that shot
08:44 from that specific location resulting in a goal.
08:47 So we look at thousands of different shots
08:50 that occurred in League 1, League 2 and National League,
08:52 so we make it relevant to our level of football.
08:54 We'll then apply where it was on the pitch, the angle, the distance,
08:59 was it a headed shot, was it a shot with the feet, how was it assisted?
09:03 Put all those things into an algorithm that will then produce a number
09:07 which will tell us how likely that is to result in a goal.
09:10 If the expected goal is 0.15, 15% of the time,
09:13 a shot from that location will result in a goal.
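The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the club's actual model: the logistic form, every coefficient, and the function names `goal_angle` and `expected_goal` are assumptions made up for the example, and the "how was it assisted" input is left out for brevity. A real model would fit its coefficients to thousands of historical shots from the relevant leagues.

```python
import math

# Hypothetical coefficients, invented for illustration only. A real model
# would estimate them from historical shot data at the relevant level.
INTERCEPT = -1.5
COEF_DISTANCE = -0.15  # per metre between the shot and the centre of the goal
COEF_ANGLE = 1.2       # per radian of goal mouth visible from the shot
COEF_HEADER = -0.9     # applied when the shot is a header

GOAL_WIDTH = 7.32  # metres between the posts


def goal_angle(x, y):
    """Angle in radians subtended by the goal mouth from the point (x, y).

    x is the distance from the goal line and y is the sideways offset from
    the centre of the goal, both in metres.
    """
    half = GOAL_WIDTH / 2
    return abs(math.atan2(y + half, x) - math.atan2(y - half, x))


def expected_goal(x, y, header=False):
    """Return a probability between 0 and 1 that the shot is scored."""
    distance = math.hypot(x, y)
    z = INTERCEPT + COEF_DISTANCE * distance + COEF_ANGLE * goal_angle(x, y)
    if header:
        z += COEF_HEADER
    # The logistic function squashes the score z into a probability.
    return 1 / (1 + math.exp(-z))


# A striker's shots as (x, y, header) tuples; the values are made up.
shots = [(5, 0, False), (11, 2, False), (8, -3, True)]
total_xg = sum(expected_goal(x, y, header) for x, y, header in shots)
print(f"Total expected goals from {len(shots)} shots: {total_xg:.2f}")
```

Summing the per-shot values gives a player's total expected goals, which is the figure a striker can compare against the goals actually scored.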
09:16 Well, it makes me feel a lot better about myself
09:19 because my expected goals is a lot less than what I'm achieving at the moment,
09:23 so that's good for me.
09:26 I just think football's changing and you've got to...
09:29 Any little inch you can get, it helps out massively,
09:32 and it might be the difference at the end of the season
09:35 between getting promoted or relegated.
09:38 I had nine games without a goal this season,
09:41 and the manager pulled me and said, "Listen, I don't want you to go...
09:45 "I know we're having bad results at the moment,
09:48 "but I don't want you to try and get involved and do stuff which you're not as good at."
09:52 He said, "You're best when you're in the box and you stay the width of the goals.
09:56 "That's where you score your goals."
09:58 I've done that and I've gone on a little bit of a goal-scoring run,
10:01 so that's where the stats have helped me and the manager.
10:04 It tells me where to run and what positions I should get myself into
10:07 to help my game as much as possible and the team.
10:10 Competing against the Premier League's mega-rich requires creative thinking.
10:14 To punch above their economic weight, Southampton created the Black Box,
10:18 a live database collecting player metrics from every major league.
10:22 This has enabled them to acquire players of undervalued talent
10:25 and sell them on for a profit.
10:27 Sadio Mané, Dejan Lovren, Morgan Schneiderlin, Victor Wanyama - the list goes on.
10:33 A lot of the KPIs that we look for the different positions
10:37 is something else that's been consistent for quite a while.
10:40 A lot of the scouts know the type of players that we're looking for at the football club,
10:44 so they'll already be creating scout reports for any players that they've seen up there,
10:49 so they can recommend them to put on our target list
10:51 and someone that we need to look at as a potential signing for the football club.
10:55 But we'll also use the data on a global scale to highlight any top performers,
11:01 and from that will be an area that we need to provide some more scouting information on,
11:07 so that will be from the eye, from our scouts.
11:10 Yes, there are some players that will have been signed because their stats look good.
11:14 Payet at West Ham is a good example.
11:16 Gabriel at Arsenal was a good example of that kind of an approach.
11:20 But that's really kind of missing the point.
11:22 The point of analytics is doing things differently.
11:25 One of the reasons for these crazy prices that we're paying for players these days
11:29 is that people get really wedded to one player.
11:32 They really get, they think that this is the guy, we need to have him,
11:36 and we're willing to pay over the odds.
11:38 What data can help you do is generate options.
11:41 Maybe find guys that are kind of like that other guy,
11:44 or maybe who would fit into the team in a slightly different way.
11:48 And it allows you to walk away from a bad deal.
11:50 It allows you to walk away from a really expensive deal.
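One way to picture "generating options" is a nearest-neighbour search over player metrics. The sketch below is a hedged illustration, not any club's method: the player names, the three per-90 metrics, all the numbers, and the helper names `standardise` and `most_similar` are invented for the example.

```python
import math
from statistics import mean, pstdev


def standardise(rows):
    """Rescale each metric to zero mean and unit spread across all players."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # Fall back to 1.0 when a metric is identical for every player.
    spreads = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, spreads)]
            for row in rows]


def most_similar(players, target, k=2):
    """Return the k players whose standardised metrics are closest to target."""
    names = list(players)
    scaled = dict(zip(names, standardise([players[n] for n in names])))
    others = [n for n in names if n != target]
    others.sort(key=lambda n: math.dist(scaled[n], scaled[target]))
    return others[:k]


# Invented per-90 figures: (goals, shots, key passes).
players = {
    "Target": (0.60, 3.5, 1.0),
    "Player A": (0.55, 3.3, 1.1),
    "Player B": (0.20, 1.5, 2.5),
    "Player C": (0.58, 3.6, 0.8),
}
alternatives = most_similar(players, "Target", k=2)
print(alternatives)
```

Standardising first stops a metric with large raw values, such as shots, from dominating the distance. The shortlist is only a starting point for the scouts, in line with the speakers' point that the data generates options rather than making the decision.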
11:53 Football has actually been collecting the most data for the longest time.
11:57 But football is the most complex sport.
12:01 So it's low scoring, it's continuous, it's time varying.
12:04 It's very strategic, okay?
12:06 It's very subjective, so just say you and I were analysing a game.
12:09 We could come up with different opinions.
12:11 When you compare it to other sports like basketball, it's high scoring.
12:15 Tennis and American football, they're segmented.
12:18 Baseball, it's segmented.
12:20 You know, it's very easy to do the analysis.
12:22 You have a lot of data points.
12:23 So the key for football is actually to come up with the right language
12:28 and ask the right question for specific things.
12:31 How was our formation?
12:32 How did we press?
12:34 How were we on set pieces?
12:36 Did we attack by the counter-attack?
12:39 All these different things we have to learn directly from data.
12:42 When I played, it was a video recorder.
12:46 And looking at the game back now, we monitor them every day
12:52 in terms of their sleep, their training, everything they do really.
12:57 It's massive.
12:58 We may know more about the opposition than they actually know about themselves.
13:02 I think as a coach, I can see a certain amount.
13:05 What the data does is just back that up.
13:07 We can look at data of the team we're about to play
13:10 and we can break down strengths and weaknesses of the team that we're playing.
13:15 There was a game a few weeks ago, a game that we actually went on to win.
13:18 In my opponent report, I noticed that the team played pretty deep.
13:22 Their average position was quite deep and their pressing metrics weren't very high.
13:25 So they allowed you a lot of time on the ball.
13:27 I suggested that we'd be able to play a lot of football and we did.
13:30 We passed them to death really.
13:32 I'd also highlighted an area where they were weak and conceded a lot of shots.
13:36 I said if we can get our key players in these areas,
13:38 there's a fair chance we can score from here.
13:40 We actually scored our first goal in exactly that area.
13:43 Data in terms of pre-match, a lot of it is video based.
13:48 But in terms of statistical data, it's used to look at trends.
13:53 So it won't be just from one game, we'll look from game to game
13:57 and build up a database to create a performance profile on that team
14:02 and look at any individuals that are maybe performing to a higher level.
14:06 The Black Box also helps Southampton develop home-grown talent
14:12 they can sell for huge profit.
14:14 Data helps to drive player recruitment at academy level
14:17 and to maximise the potential of their scholars.
14:20 I started training when I was eight and then finally signed at nine.
14:24 So I'm quite young.
14:27 I think when we first got here it was just a load of numbers on a sheet
14:29 but now we understand what it actually is, the details of it
14:33 and where we can improve and what we need to look at.
14:36 It's helped me massively.
14:37 I think when I first got here I didn't really know what to do.
14:40 Just watching the game I wasn't really taking notice
14:43 but as I started to learn more, I think I focused on myself more
14:47 and the positioning I'm taking up and all the little details
14:50 you can figure out what you have to do to be better.
14:53 So it's helped me massively develop.
14:56 A founding principle of this organisation is youth development.
15:01 It's everything we stand for, excellence, potential, it's a strapline
15:04 it's everything we work towards.
15:06 And even when you buy a senior player, first team player
15:08 still the principle is the same.
15:10 Can we improve him? Because we may be selling him
15:13 and if we are selling him, we need to be selling him for a profit
15:16 so it's all about improving that individual.
15:19 It was never really the dream to produce a young player
15:23 it was never really the dream to produce a player to sell.
15:26 It became the business model when the first team started sliding
15:30 through the leagues and ultimately into administration
15:33 it was selling of players, Theo Walcott and Alex Oxlade-Chamberlain
15:37 and Gareth Bale.
15:39 We all, as fans and also as a staff member here, we all dream
15:43 of what happens if we kept hold of those players, what would he have done?
15:46 But the reality is if we kept hold of those players we would have gone out of business.
15:50 There's a huge amount of data that's collected around the players
15:54 from match day data to the way they sleep, to the way they're feeling
15:58 in the morning, to training, to their power outputs in the gym.
16:02 The challenge is what do we do with that data and how important
16:06 is it, the analytics around that data.
16:09 So on a daily basis we collect information from players
16:12 from GPS units so we would look at distances covered, the speeds
16:16 at which they're covered and other information such as accelerations
16:20 and decelerations and we would use that in a more individualised approach
16:24 so we can optimally adjust their training programmes to make sure
16:29 that they're fresh and they're in peak condition come match day.
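The GPS measures mentioned above (distance covered, speeds, accelerations and decelerations) can be derived from timestamped positions. The following is a minimal sketch under stated assumptions: the sample format `(seconds, x, y)` in metres, the 2.0 m/s² threshold, and the function name `summarise_session` are all hypothetical, and real tracking units apply smoothing and filtering that is omitted here.

```python
import math


def summarise_session(samples, accel_threshold=2.0):
    """Summarise one player's session from time-ordered GPS samples.

    samples is a list of (t_seconds, x_metres, y_metres) tuples.
    accel_threshold (m/s^2) is an assumed cut-off for a notable change of pace.
    """
    distance = 0.0
    speeds = []  # (timestamp, speed over the preceding interval)
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        distance += step
        speeds.append((t1, step / (t1 - t0)))

    accelerations = decelerations = 0
    for (t0, v0), (t1, v1) in zip(speeds, speeds[1:]):
        rate = (v1 - v0) / (t1 - t0)
        if rate >= accel_threshold:
            accelerations += 1
        elif rate <= -accel_threshold:
            decelerations += 1

    return {
        "distance_m": distance,
        "top_speed_mps": max((v for _, v in speeds), default=0.0),
        "accelerations": accelerations,
        "decelerations": decelerations,
    }


# Four made-up samples taken one second apart along a straight line.
session = summarise_session([(0, 0, 0), (1, 1, 0), (2, 4, 0), (3, 5, 0)])
print(session)
```

This version counts every interval that crosses the threshold, so one long sprint may register several times. Grouping consecutive intervals into single efforts would be a natural refinement before using the totals to adjust an individual's training load.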
16:34 We're now in an amazing position where for the first time we're able
16:38 to turn down those opportunities to sell players and push back
16:41 against the big clubs and turn around and say no, not for sale.
16:46 It's a huge point of the game now. Obviously there's a lot of other sports
16:52 that use data or heavy analytics. Soccer has not yet cracked,
16:56 I don't think, the code yet in terms of what are the key indicators
17:00 of what's going to make a player successful or not.
17:03 I think there's several companies out there that aggregate the data
17:07 and try to make it easier for you to make a decision.
17:09 But at the end of the day, I think soccer people want to still see
17:12 the player and see how that marries up with the data that you're seeing
17:17 because sometimes the data doesn't always match what you're seeing
17:20 on the field because of the free-flowingness of the game
17:24 and the fluidity of the sport.
17:26 I think the mentality of a player, I think that sometimes the soccer IQ
17:31 and you're only going to get that from seeing sometimes live,
17:34 obviously video as well, but also sitting down with that player
17:37 and having a conversation with them about the game itself,
17:40 about his particular skill set, about your own club's philosophy on the game
17:45 and see if there's a match there.
17:47 And you can't get answers from that with data.
17:51 Analytics has come a long way from pass completion rates and heat maps.
17:56 Some of the brightest minds in the game want to find an algorithm
17:59 to calculate the most valuable intangibles, like team chemistry.
18:04 What will this mean for the future of football?
18:07 All goals aren't created equal and the ability to weight the difficulty
18:11 of those goals, the player with the skill set to do those things
18:14 should be rewarded as opposed to a guy who maybe just tapped the one in
18:19 because Suarez drew three defenders on him, penetrating,
18:22 and he flipped it off to him and the other guy just taps it in.
18:25 Well, the goal gets paid for in today's world.
18:28 But shouldn't the guy who created all those things and measuring those things
18:31 is really the challenge? And giving proper credit to player performance
18:34 is what we're all trying to achieve, not just in baseball, but in every sport,
18:37 just like in business.
18:39 So there's lots of cool stuff that people haven't thought about.
18:44 So the idea of ghosting, be able to simulate plays that you haven't seen before.
18:50 So you can have an example of a play and you can say,
18:53 "Well, how does this team defend in that situation?
18:56 What happens if I switch that play with another play?
18:58 How does the outcome change in terms of just body shape?
19:02 Okay, where's the player facing? Are they making the right decisions?
19:05 In terms of injury analytics, player load, fatigue,
19:08 how's their technique changing over time?"
19:10 Now, using deep neural networks, we can actually simulate these things.
19:14 I think in terms of injury prediction, I think you'll find there'll be less injuries.
19:19 So there'll be less soft tissue injuries.
19:21 You're still going to have the edge cases, but soft tissue injuries,
19:25 I think they'll be minimized.
19:28 I think in terms of player valuation, in terms of performance,
19:31 I think that'll be normalized.
19:33 I think you see the volatility now is because we haven't got these good metrics.
19:37 However, what you don't take into consideration is the media.
19:41 The media, the shirt sales, there's all these other things
19:44 that need to be taken into account.
19:46 You're never, I guess, going to have data just making a sole decision, I think, in anything.
19:54 But as data advances and individuals that are part of that process
19:58 and they're creating and maximizing the use of data in clubs and in different sports,
20:04 I think those people are more crucial in the process.
20:08 And I think data becomes more important in what we do from day to day.
20:12 We have to communicate with domain experts.
20:18 And if we can't speak their language, then we're basically not going to be able to identify them.
20:22 It's an exciting area to be in because it's constantly evolving and improving.
20:26 As technology improves.
20:28 The genie is out of the bottle, and I don't think it's going back in.
20:31 When you've got open-minded people, it works really well.
20:35 Hopefully, it can tell us if we're going to win or lose.
20:37 If the data can tell me we're going to get three points on a Saturday,
20:40 it'd save me an awful lot of work.
20:43 [Music]