Fitba Fancy Stats: The Problem with Group Stage Results

Celtic bombed out of the Europa League group stage last Thursday after a draw away to Fenerbahce. The Hoops finished last in Group A with a dismal 3 points from 6 matches. In the aftermath, manager Ronny Deila has been criticized harshly by fans and pundits alike.

However, as I will show in this post, 6 matches is not a big enough sample to judge a manager's results in any competition. This is because of the inherent randomness of football.

To illustrate this fact, I generated random distributions of points per game (PPG) and other metrics using bootstrap resampling in R with 100,000 replicates. The samples were drawn from Celtic's SPL/SPFL league matches over ten seasons between 2004 and 2014 (N=380).
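The resampling procedure described above can be sketched roughly as follows. The original analysis was done in R; this is a Python/NumPy illustration, not the code behind the figures. The match results are synthetic placeholders (the win/draw/loss probabilities are assumptions picked to land near the PPG quoted in this post), and the function name `bootstrap_ppg` is hypothetical.

```python
import numpy as np

def bootstrap_ppg(points, n_matches, n_reps=100_000, seed=0):
    """Return n_reps bootstrap replicates of points per game over n_matches."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points)
    # Draw match indices with replacement: one row per replicate.
    idx = rng.integers(0, len(points), size=(n_reps, n_matches))
    return points[idx].mean(axis=1)

# Synthetic stand-in for 380 league results (3 = win, 1 = draw, 0 = loss).
# ASSUMPTION: these probabilities are illustrative, not Celtic's actual record.
rng = np.random.default_rng(42)
league_points = rng.choice([3, 1, 0], size=380, p=[0.72, 0.16, 0.12])

# Fewer replicates than the default, just to keep the demo quick.
ppg_36 = bootstrap_ppg(league_points, 36, n_reps=10_000)
ppg_6 = bootstrap_ppg(league_points, 6, n_reps=10_000)

print(f"sample mean PPG: {league_points.mean():.3f}")
print(f"36 matches: mean {ppg_36.mean():.3f}, sd {ppg_36.std():.3f}")
print(f" 6 matches: mean {ppg_6.mean():.3f}, sd {ppg_6.std():.3f}")
```

With real match data in place of `league_points`, a histogram of `ppg_36` and `ppg_6` would correspond to the kind of distributions discussed below.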

Let's begin by looking at the bootstrap distribution of PPG after 36 matches, which is roughly equivalent to a full domestic league season and is 6 times the sample size of the Europa League group stage.

As you can see in Figure 1, the distribution is nearly bell-shaped and smooth. The center of the curve lines up nicely with the 10-season average for Celtic (2.321), as indicated by the vertical red line. Thus, PPG after 36 matches is a good representation of Celtic's long-term results domestically.

Figure 1.

Now let's look at the distribution of PPG after 6 matches (Figure 2). The main thing to note is how different the 6-match and 36-match distributions are from each other. The 6-match distribution does not have a smooth bell shape centered on the true historical average. Instead, it's spiky and wild-looking. What this means is that PPG after 6 matches is NOT a good representation of Celtic's long-term results domestically.
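One likely contributor to that jagged shape is simple arithmetic: with 3 points for a win and 1 for a draw, only a handful of point totals are possible after 6 matches, so PPG can only land on a few discrete values. A minimal sketch that counts them (the function name `possible_point_totals` is hypothetical):

```python
def possible_point_totals(n_matches):
    """All point totals reachable in n_matches with 3/1/0 scoring."""
    totals = set()
    for wins in range(n_matches + 1):
        for draws in range(n_matches - wins + 1):
            totals.add(3 * wins + draws)
    return sorted(totals)

totals_6 = possible_point_totals(6)
totals_36 = possible_point_totals(36)

print(f"distinct totals after 6 matches:  {len(totals_6)}")
print(f"distinct totals after 36 matches: {len(totals_36)}")
print("possible 6-match PPG values:", [round(t / 6, 3) for t in totals_6])
```

The 36-match PPG has six times as many possible values packed into the same 0 to 3 range, which is part of why its histogram looks so much smoother.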

The relatively high volatility of PPG after 6 matches observed in Celtic's domestic league matches is a pattern that would also apply to the Europa League group stage. However, it is important to note that this inherent volatility would be even greater in Europe, because the teams in a Europa League group are closer to one another in quality than the teams in the SPL/SPFL.

Figure 2.

So what alternatives do we have for judging a team's performance after 6 matches? Figures 3 and 4 show two options: goal differential (GD) and total shot differential (TSD).

As you can see in Figure 3, the distribution of GD after 6 matches is much more bell-shaped than the 6-match distribution of PPG, and it is more closely aligned with Celtic's 10-season average (1.416). However, the 6-match GD distribution is still very spiky, rather than smooth like the curve in Figure 1. Thus, 6-match GD is not a very good representation of Celtic's long-term performance in domestic league competition.

Figure 3.

Finally, let's have a look at the bootstrap distribution of TSD after 6 matches. As you can see in Figure 4, the curve is remarkably similar in shape to the 36-match PPG distribution. It is smooth, bell-shaped and centered nicely on the 10-season average (+6). Thus, 6-match TSD is a good representation of Celtic's long-term performance domestically.
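One plausible reason the TSD curve comes out smoother is that shots are far more numerous than goals, so a 6-match shot differential can take many more distinct values than a 6-match goal differential. The sketch below illustrates the idea with synthetic data only: the Poisson rates are assumptions chosen to sit near the averages quoted in this post, not real Celtic numbers, and `bootstrap_totals` is a hypothetical helper.

```python
import numpy as np

def bootstrap_totals(per_match, n_matches=6, n_reps=10_000, seed=0):
    """Bootstrap replicates of a per-match metric summed over n_matches."""
    rng = np.random.default_rng(seed)
    per_match = np.asarray(per_match)
    # Draw match indices with replacement: one row per replicate.
    idx = rng.integers(0, len(per_match), size=(n_reps, n_matches))
    return per_match[idx].sum(axis=1)

# Synthetic per-match differentials for 380 matches.
# ASSUMPTION: the Poisson rates below are illustrative, not measured values.
rng = np.random.default_rng(1)
goal_diff = rng.poisson(2.2, 380) - rng.poisson(0.8, 380)
shot_diff = rng.poisson(16, 380) - rng.poisson(10, 380)

gd_values = np.unique(bootstrap_totals(goal_diff)).size
tsd_values = np.unique(bootstrap_totals(shot_diff)).size

print(f"distinct 6-match GD totals:  {gd_values}")
print(f"distinct 6-match TSD totals: {tsd_values}")
```

The more distinct values a metric can take over 6 matches, the less its histogram is dominated by a few tall bars.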

This is an important point because Celtic's TSD improved significantly this season in the Europa League group stage compared with last season, going from -22 to +2. Given the volatility of short-term results, Celtic's abysmal PPG this season could be written off as a consequence of random variance, while the team's improvement in TSD is more likely the result of genuine and sustainable improvement under manager Ronny Deila.

Figure 4.

Monday, December 14, 2015