Experts vs Data: on the science of predicting SuperRugby performances

With SuperRugby beginning again this weekend, we’ve managed to get a true scientist, Dr Ian Durbach, to comment on the ability to predict outcomes. Ian holds joint positions as adjunct senior lecturer in the Department of Statistical Sciences at the University of Cape Town and a researcher at the African Institute for Mathematical Sciences. His research interests are human judgment and decision making, particularly the way we reason when our choices involve risk or uncertainty. Quite simply, he’s a really clever guy.

When we asked Ian if we could apply science to predicting game outcomes, so that we could be as consistently correct as rugby experts such as Nick Mallett, he came up with a plan. Ian explains in his own words:

Picture from supersport.com “Dont miss out, warns Mallett”

The model we’re using is cobbled together from a variety of sources, and uses a few relationships we’ve been able to establish by looking at past performance and past predictions.

First, we looked at the past 4 years of Super 15 rugby and worked out how many points each team won or lost by, on average, over all the games they’ve played. The Crusaders come top – they have an average margin of +9, meaning they win by an average of 9 points. The table below shows the ranking, along with the average log points – you can see there’s a good but not perfect relationship between point margins and log points.

In our model, we’ve chosen to weight recent years more heavily, which also has the effect of sucking the values of all our predictions towards 0. Those values are in the third column of the table. This is actually a desirable feature to have whenever we know that the past is not a perfect predictor of the future – we’ll spare you the details!

[Table: Super 15 teams ranked by average points margin, with average log points and recency-weighted margins]

Average points differences can be used to come up with a quick and dirty prediction. Take this weekend’s game between the Crusaders and the Rebels. We know the Crusaders win by 6 points on average. When they play the bottom teams they’ll tend to win by more, and when they play other top teams they’ll tend to win by less. The Rebels lose by 8 points on average. When they play a top team like the Crusaders, they’ll tend to lose by more than this. You can get a crude prediction by subtracting the two points differences from one another: the Crusaders are predicted to win by 6–(–8) = 14 points. That’s a start.
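For anyone who wants to play along at home, the subtraction above can be sketched in a few lines of Python. The two averages are the ones quoted in the text; the dictionary and the function name `crude_margin` are our own illustrative choices, not part of any real tool.

```python
# Average points margins quoted in the text (positive = wins on average).
average_margin = {"Crusaders": 6, "Rebels": -8}


def crude_margin(team: str, opponent: str) -> int:
    """Predicted winning margin for `team`, ignoring venue and current form."""
    return average_margin[team] - average_margin[opponent]


print(crude_margin("Crusaders", "Rebels"))  # 6 - (-8) = 14
```

A negative result simply means the first team is predicted to lose by that many points.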

Next, we found that over a number of years, playing at home is “worth” about 4 points. That is, if you knew nothing at all about an upcoming game, you should predict the home team to win by 4 points. We know the Crusaders are playing at home, so we up the prediction in their favour, by 4 points. They’re now predicted to win by 18.
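As a rough sketch, the data-driven prediction so far is just the difference in average margins plus the home bonus. The function name and the default argument below are illustrative assumptions; only the numbers (6, –8 and 4) come from the text.

```python
HOME_ADVANTAGE = 4  # points, the figure quoted in the text


def data_driven_margin(home_avg: float, away_avg: float,
                       home_advantage: float = HOME_ADVANTAGE) -> float:
    """Predicted margin from the home team's point of view.

    A positive result means the home team is predicted to win.
    """
    return (home_avg - away_avg) + home_advantage


# Crusaders (+6) at home to the Rebels (-8): 14 + 4 = 18
print(data_driven_margin(6, -8))
# Two teams we know nothing about (both 0): home side by 4
print(data_driven_margin(0, 0))
```

The second call reproduces the "know nothing" case: with no information about either team, the home side is tipped by 4.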

This is a data-driven prediction. But we know that rugby pundits are pretty good at predicting games. We’ve looked at some past predictions made by 80 000 users of the Superbru game. The players predicted the outcome of 73% of Super 15 games correctly, and were on average 10 points from the actual outcome of the game. Not bad! We found the following Superbru predictions for the upcoming week, which we’ve contrasted with the predictions from our data-driven approach. On the whole it’s amazing how similar the predictions are. We predict different outcomes for the Lions-Hurricanes and Blues-Chiefs games, but both sets of predictions are for the games to be close.
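The two figures quoted for Superbru users (share of results called correctly, and average distance from the real margin) can be computed along the following lines. This is a sketch under our own assumptions: we read "on average 10 points from the actual outcome" as a mean absolute error, and the margins in the example are made up purely for illustration, not real Superbru data.

```python
def score_predictions(predicted, actual):
    """Return (share of results called correctly, mean absolute error in points).

    Margins are from the home team's point of view; a result counts as
    correct when prediction and outcome have the same sign.
    """
    def sign(x):
        return (x > 0) - (x < 0)

    n = len(predicted)
    correct = sum(sign(p) == sign(a) for p, a in zip(predicted, actual))
    abs_error = sum(abs(p - a) for p, a in zip(predicted, actual))
    return correct / n, abs_error / n


# Made-up margins for four games, purely for illustration
predicted = [10, -3, 5, 7]
actual = [20, 4, 1, 7]
hit_rate, mean_abs_error = score_predictions(predicted, actual)
print(hit_rate, mean_abs_error)
```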

[Table: Superbru predictions for the upcoming week alongside our data-driven predictions]

An obvious weakness of the data-driven approach is that it ignores a whole bunch of relevant current information – off-season transfers, injuries, form, and so on. Some of this we’ll work on building into later predictions, but for now we can assume that Superbru users are clued up on current rugby events, and use their predictions as proxies for that information. So we’ll make an additional prediction, which is a combination of our naïve data-driven model and the Superbru predictions.

Before we do that, an interesting fact is that Superbru users tend, on average, to slightly underestimate the effect of playing at home. They think it is worth around 2 points, while we know that it is actually worth about 4 points. So, if we know what Superbru users think about a game, we should adjust their predictions by 2 points in favour of the home team before combining them. For now we’ll combine them in the simplest possible way, by taking the average of the two.
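Here is a minimal sketch of that combination step. The 2-point correction and the simple average come from the text; the function name is our own, and the Superbru margin used in the example is a hypothetical number chosen only to show the arithmetic.

```python
SUPERBRU_HOME_BIAS = 2  # points by which users undervalue home advantage (4 - 2)


def combined_margin(model_margin: float, superbru_margin: float) -> float:
    """Average the model with the home-adjusted Superbru prediction.

    Both margins are from the home team's point of view.
    """
    adjusted_superbru = superbru_margin + SUPERBRU_HOME_BIAS
    return (model_margin + adjusted_superbru) / 2


# model_margin=18 is the Crusaders-Rebels figure from the text;
# superbru_margin=15 is a HYPOTHETICAL value, not a real Superbru number.
print(combined_margin(18, 15))  # (18 + 17) / 2 = 17.5
```

Equal weights are the simplest choice; nothing here says they are the best one, and the weights could be tuned once results start coming in.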

That’s all we’re going to do for now. Once we start getting some information in on how the teams are performing, we’ll use this information to adjust our model further. Let’s see how we do!

[Table: combined predictions for this weekend’s games]

Mallett seems to think the Chiefs will win, having referred to the team as being “good again”. Who do you think will be proved correct? The experts (rugby and Superbru/couch) or the scientists? We’d like to know your thoughts before the Blues vs Chiefs match kicks off. Tell us in the poll or feel free to comment!
