Monday, December 28, 2015

How Good is Liam Boyce?

Northern Ireland international Liam Boyce scored a hat-trick for Ross County Saturday in a 5-2 victory over Dundee in the Scottish Premiership. According to Soccerway, Boyce has 13 league goals in 1632 minutes so far this season, or 0.72 goals per 90.
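For anyone who wants to check the arithmetic, goals per 90 is just goals divided by minutes played, scaled up to a full match. Here is a minimal Python sketch; the function name is my own and has nothing to do with Soccerway.

```python
def goals_per_90(goals, minutes):
    """Scale a goal tally to a rate per 90 minutes played."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return goals / minutes * 90

# Boyce's league numbers quoted above: 13 goals in 1632 minutes
boyce_rate = goals_per_90(13, 1632)
print(round(boyce_rate, 2))  # 0.72
```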

In 2010, Boyce was on trial at Celtic but did not receive a contract. This is unfortunate because, as I will argue in this post, Boyce might be as good a goal scorer as any striker on Celtic's current roster.

FIGURE 1

Figure 1 shows the relationship between career goals scored and minutes played for Boyce and Celtic's current strike force (data from soccerway.com). The black line represents the expected number of goals for a given number of minutes played. Most of Boyce's seasons fall on or above the line, indicating a relatively high level of goal scoring productivity compared to Celtic's strikers. In fact, Boyce's 0.84 goals per 90 at Cliftonville FC in 2012-13 is better than any single season performance by any of Celtic's current strikers, with the exception of Leigh Griffiths this season and last season.

Figure 2 shows the individual trend lines for each player's career. As you can see, Boyce's line has the most positive slope, indicating a higher goal scoring rate at 1500+ minutes played than any of Celtic's strikers, even Leigh Griffiths.

FIGURE 2

One could argue that most of Boyce's goals have come playing against part-timers in a "diddy" league, the NIFL Premiership. While this is an important caveat, in my opinion, Boyce is worth another look, even given the limited data presented here.

A related question is: what did Celtic see or not see in 2010 that turned them off? This seems an especially pertinent question given the parade of disappointing strikers that have been given contracts by Celtic since 2010 (Mo Bangura anyone?).

Monday, December 14, 2015

The Problem with Group Stage Results

Celtic bombed out of the Europa League group stage last Thursday after a draw away to Fenerbahce. The Hoops finished last in Group A with a dismal 3 points from 6 matches. In the aftermath, manager Ronny Deila has been criticized harshly by fans and pundits alike.

However, as I will show in this post, 6 matches is not a big enough sample to judge a manager's results in any competition. This is because of the inherent randomness of football.

To illustrate this fact, I generated random distributions of points per game (PPG) and other metrics using bootstrap resampling in R with 100,000 replicates. The samples were drawn from Celtic's SPL/SPFL league matches over ten seasons between 2004 and 2014 (N=380).
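I did the resampling in R, but the idea is easy to sketch in Python. The version below is only an illustration under stated assumptions: the match history is made up (it is NOT Celtic's real record, just a stand-in of the same size), the function name is hypothetical, and I use 2,000 replicates rather than 100,000 to keep it quick.

```python
import random
import statistics

def bootstrap_means(values, sample_size, replicates, seed=0):
    """Draw `replicates` resamples (with replacement) of `sample_size`
    matches and return the mean of each resample."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(values, k=sample_size))
        for _ in range(replicates)
    ]

# Made-up points-per-match history (3 = win, 1 = draw, 0 = loss), N = 380.
# Assumption for illustration only, not the real data set.
history = [3] * 270 + [1] * 60 + [0] * 50

ppg_36 = bootstrap_means(history, sample_size=36, replicates=2000)
ppg_6 = bootstrap_means(history, sample_size=6, replicates=2000)

print("long-run PPG:", round(statistics.fmean(history), 3))
print("spread after 36 matches:", round(statistics.pstdev(ppg_36), 3))
print("spread after 6 matches:", round(statistics.pstdev(ppg_6), 3))
```

The same function works for goal differential or shot differential: pass in the per-match values for that metric instead of points.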

Let's begin by looking at the bootstrap distribution of PPG after 36 matches, which is roughly equivalent to a full domestic league season, or 6 times the sample size of the Europa League group stage.

As you can see in Figure 1, the distribution is nearly bell-shaped and smooth. The center of the curve lines up nicely with the 10-season average for Celtic (2.321), as indicated by the vertical red line. Thus, PPG after 36 matches is a good representation of Celtic's long term results domestically.


Figure 1.


Now let's look at the distribution of PPG after 6 matches (Figure 2). The main thing to note is how different the 6 match and 36 match distributions are from each other. The 6 match distribution does not have a smooth, bell shape centered on the true historical average. Instead, it's spiky and wild looking. What this means is that PPG after 6 matches is NOT a good representation of Celtic's long term results domestically.
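One likely contributor to the spiky look is simple arithmetic: with only 6 matches there are very few points totals a team can finish on, so PPG can only land on a handful of values. A small Python sketch (the function name is mine) counts them under the usual 3-1-0 points system.

```python
def possible_ppg(matches):
    """All distinct points-per-game values reachable in `matches` games
    when a win is worth 3 points, a draw 1 and a loss 0."""
    totals = set()
    for wins in range(matches + 1):
        for draws in range(matches - wins + 1):
            totals.add(3 * wins + draws)
    return sorted(total / matches for total in totals)

print(len(possible_ppg(6)))   # 18
print(len(possible_ppg(36)))  # 108
```

Eighteen possible values spread between 0 and 3 leave big gaps, whereas 108 values over the same range look close to continuous.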

The relatively high volatility of PPG after 6 matches observed in Celtic's domestic league matches is a pattern that would also apply to the Europa League group stage. However, it is important to note that this inherent volatility would be even greater in Europe, because the teams in a group are more evenly matched than those in the SPL/SPFL.

Figure 2.


So what alternative do we have to judge a team's performance after 6 matches? Figures 3 and 4 show two options, goal differential (GD) and total shot differential (TSD). 

As you can see in Figure 3, the distribution of GD after 6 matches is much more bell shaped than the 6 match distribution of PPG, and the former is more closely aligned with Celtic's 10-season average (1.416). However, the 6 match GD distribution is still very spiky, rather than smooth like the curve in Figure 1. Thus, 6 match GD is not a very good representation of Celtic's long term performance in domestic league competition.

Figure 3.


Finally, let's have a look at the bootstrap distribution of TSD after six matches. As you can see in Figure 4, the curve is remarkably similar in shape to the 36 match PPG distribution. It is smooth, bell shaped and centered nicely on the 10-season average (+6). Thus, 6 match TSD is a good representation of Celtic's long term performance domestically.

This is an important point because Celtic's TSD improved significantly this season in the Europa League group stage compared with last season, going from -22 to +2. Given the volatility of short term results, Celtic's abysmal PPG this season could be written off as a consequence of random variance, while the team's improvement in TSD is more likely a result of genuine and sustainable improvement under manager Ronny Deila.

Figure 4.




Thursday, August 27, 2015

A New Team Rating for Scottish Football

Introduction

Everybody knows that the league table is bollocks at the start of the season. This is because the sample size is too small to draw any confident conclusions about team strength.

While the outcome of any given football match can be heavily influenced by luck, we assume that over time the best teams will drop the fewest points, and rise to the top of the league table, as luck evens out.

The problem is that a 38-game football season is not long enough for luck to completely even out. Mark Taylor has recently estimated that 152 matches would result in a league table that better reflects the true differences in quality between teams.

Obviously that is never going to happen. So, over the course of a season, if we really want to know how good/bad a team has been, we need a way to estimate team strength without relying on the league table.

Team Strength is a Latent Variable

Team strength is an example of what statisticians call a latent variable, because it is not measured directly; it is inferred from other variables that are measured, such as shots or expected goals (xG).

In my own work, I have mostly used shots on target ratio (SoTR) to infer team strength, but given that team strength is a latent variable, a more appropriate approach might be principal component analysis (PCA).

In a nutshell, PCA takes a set of variables and transforms them into an equal number of weighted linear combinations (components), such that the first principal component (PC1) summarizes most of the variance in the original data set.

For an example of PCA applied to football data, see Martin Eastwood's excellent article here.

Using PCA to Estimate Team Strength

Here is how I developed my new PCA-based Team Rating.

First, I downloaded 10 seasons of Scottish top-flight match data (2004-2014) from football-data.co.uk. Next, I aggregated the data by season and team, resulting in a sample size of 120 team-seasons. Then, I performed a PCA in R (prcomp function) using the following four variables (scaled to unit variance).

  • Total Shots Ratio = Shots For/(Shots For + Shots Against)
  • Shots on Target Ratio = SoT For/(SoT For + SoT Against)
  • Score Rate = Goals For/SoT For
  • Save Rate = 1-(Goals Against/SoT Against)
(Note: Score Rate + Save Rate = PDO)
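To make the definitions concrete, here is a short Python sketch of the four inputs plus PDO. The function name and the season totals are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def shot_metrics(shots_for, shots_against, sot_for, sot_against,
                 goals_for, goals_against):
    """Compute the four PCA inputs, plus PDO, from season totals."""
    score_rate = goals_for / sot_for
    save_rate = 1 - goals_against / sot_against
    return {
        "TSR": shots_for / (shots_for + shots_against),
        "SoTR": sot_for / (sot_for + sot_against),
        "ScoreRate": score_rate,
        "SaveRate": save_rate,
        "PDO": score_rate + save_rate,
    }

# Hypothetical season totals, not taken from the data set
metrics = shot_metrics(shots_for=400, shots_against=300,
                       sot_for=200, sot_against=150,
                       goals_for=60, goals_against=45)
for name, value in metrics.items():
    print(name, round(value, 3))
```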

PC1 accounted for 56% of the variance (see Appendix), and was strongly associated with the two most reliable metrics in the data set, TSR and SoTR (loadings > 0.60). Thus, PC1 appears to represent the latent variable team strength.

Interestingly, Score Rate and Save Rate were also positively associated with PC1, but with relatively weak loadings. This is consistent with the view that, although relatively volatile, PDO does contain some information about team strength, just not as much as TSR or SoTR.

Assuming that PC1 represents team strength, a team's PC1 score can be used as a composite rating. Thus, my Team Rating is essentially a weighted linear combination of a team's TSR, SoTR, Score Rate, and Save Rate, with Score Rate and Save Rate having roughly 1/3 the weight of the shot ratios. (James Grayson has developed a similar composite metric, which you can read about here.)
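For readers who don't use R, the sketch below shows roughly what a PC1 score is, in plain Python. It is not my actual code: prcomp uses a singular value decomposition, and I have swapped in power iteration on the correlation matrix purely to keep the example free of dependencies. The function names and the six team-seasons are made up for illustration.

```python
import math
import statistics

def standardize(rows):
    """Scale every column to mean 0 and unit sample variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    sds = [statistics.stdev(c) for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def first_principal_component(rows, iterations=500):
    """Return (loadings, share of variance, scores) for PC1.

    Power iteration is a stand-in for the SVD used by prcomp.
    """
    z = standardize(rows)
    n, p = len(z), len(z[0])
    # Correlation matrix of the standardized variables
    corr = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]
    # Starting guess; adequate when the variables are positively correlated
    v = [1 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * corr[i][j] * v[j]
                     for i in range(p) for j in range(p))
    # Each team's rating is the weighted sum of its standardized metrics
    scores = [sum(a * b for a, b in zip(row, v)) for row in z]
    return v, eigenvalue / p, scores

# Hypothetical team-seasons: (TSR, SoTR, Score Rate, Save Rate)
team_seasons = [
    (0.62, 0.64, 0.33, 0.74),
    (0.55, 0.56, 0.30, 0.71),
    (0.50, 0.51, 0.31, 0.69),
    (0.48, 0.47, 0.28, 0.70),
    (0.44, 0.43, 0.29, 0.66),
    (0.41, 0.39, 0.27, 0.68),
]
loadings, share, ratings = first_principal_component(team_seasons)
print("loadings:", [round(x, 2) for x in loadings])
print("variance explained by PC1:", round(share, 2))
print("ratings:", [round(x, 2) for x in ratings])
```

The sign of a principal component is arbitrary, so with real data it is worth checking that higher scores correspond to better shot ratios before calling the result a rating.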

And there you have it: my new Team Rating.

Explanatory Power and Reliability




The scatter plot above depicts the relationship between my new Team Rating, goal difference, and points per game at the end of the season for the 2004-2014 data set.

As you can see, my new Team Rating has excellent explanatory power, with r^2 values of 94% and 89% for goal difference and points respectively. In contrast, SoTR only explains 80% and 73% of the variance in goal difference and points respectively.

In addition, the line of best fit relating Team Rating to goal difference goes through zero on both axes. This means a team with a rating of 0 at the end of the season is expected to have a goal difference of 0.

Also, the Old Firm seasons in the upper right quadrant of the plot fall out around the same regression line as the rest of the league, which is nice.



The second scatter plot depicts the same relationships as the previous plot, this time using four seasons that were not included in the PCA (I used the predict function in R to calculate the PC1 scores for the new data). Again, we see the same excellent explanatory power, even when new data were used.

To assess within-season reliability, or "repeatability" of team differences, I examined the correlation between Team Rating in the first half of the season and Team Rating in the second half of the season in the four new data sets.
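The split-half check boils down to a Pearson correlation between two lists of ratings, one per half season. Here is a minimal Python sketch; the ratings are invented, and the helper is written out by hand rather than taken from any library.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ratings for six teams, one value per team per half season
first_half = [1.8, 0.9, 0.2, -0.1, -0.6, -1.1]
second_half = [1.5, 0.4, 0.5, -0.3, -0.2, -1.4]

r = pearson_r(first_half, second_half)
print(round(r, 2))
```

A value close to 1 means the teams kept roughly the same order and spacing from one half of the season to the next.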



As you can see from the bar chart, my new Team Rating exhibits a similar level of within-season reliability as TSR or SoTR, and is more reliable than either goal difference or points per game.

Team Rating in Action




The bar chart above depicts the current Team Ratings for the SPFL Premiership. As you can see, Hearts are rated as the 3rd best team, despite currently sitting at the top of the table.

The reason for this discrepancy is that the Jam Tarts' results so far have been heavily influenced by PDO, specifically an unsustainable Score Rate of 0.67. Basically, they've been lucky compared to the other teams in the league, but Hearts are still very good.

Thanks for reading. If you have any questions about my Team Rating, please post a comment below, or you can find me on Twitter @226blog.

Appendix



Tuesday, July 21, 2015

Whose headers are better?

After watching Nadir Ciftci blaze several close-range headers off target during the first leg of Celtic's Champions League qualifier versus Stjarnan FC, I thought it would be interesting to take a closer look at headed shot attempts by strikers in the SPFL Premiership last season to answer the question, whose headers are better? (long story short, it ain't Ciftci)

The scatterplot below shows the relationships between total headed shots, headed shots on target and headed goals (all per 90 minutes played) for 41 strikers who played at least 500 minutes last season. All the data come from either BBC match reports (for shots) or Soccerway (for minutes played).


The first thing to notice about the plot is that goals p90 (dot size) is only loosely correlated with total shots and shots on target. This is due to the fact that differences in shot conversion rate are mostly random in the short-term.

For example, if we just look at goals, we might conclude that 5'8" Leigh Griffiths (Celtic) is one of the best in the league at headed shot attempts. I don't know any Celtic fan who would agree with that statement. A better explanation is that he got lucky with the relatively few attempts he took last season. Thus, goals p90 is not a very useful metric for judging whose headers are better.

Turning our attention to the more useful metrics, total shots and shots on target, we can see from the plot that three strikers really stand out: Josh Magennis (Kilmarnock), Brian Graham (formerly of St Johnstone) and Edward Ofere (formerly of Inverness Caledonian Thistle).

Ofere played relatively few minutes for ICT, so I don't consider his data to be very reliable. That leaves us with Graham and Magennis. Of the two, I would regard Magennis as the better header since his accuracy was much higher than Graham's. The latter got roughly as many headed shots on target as you would expect from the number of shot attempts (hence his dot is close to the regression line). Magennis, on the other hand, got many more headers on target than expected (hence his dot falls way above the line). I think it's safe to say the Northern Irishman was probably unlucky not to score more headed goals than he did last season.

So whose headers are better? In my opinion, the answer is Kilmarnock's Josh Magennis. Don't believe me? Just ask his teammate Jamie Hamill...


Josh Magennis demonstrating his excellent head shot accuracy

Monday, July 6, 2015

How long does it take conversion rate to stabilize?

Conversion rate (goals per shot attempt) is known to be a highly volatile aspect of performance that can fluctuate randomly within and between seasons. Conversion rate also tends to converge to a typical value over time (often called regression to the mean).

This does not mean that conversion rate is solely luck based, however. But the skill component may only be evident over a relatively long period of time. In other words, we may have to observe a lot of shot attempts before the skill component of conversion rate emerges. The question is how many?

In this post I will attempt to discover how many shots are needed before we can be confident that the skill component of conversion rate is evident. I will examine both conversion rate for and against using data from football-data.co.uk for the top division of Scottish football (SPL/SPFL) over ten seasons (2004-2014). My analysis is limited to the seven teams who played in the top division every season during this period: Aberdeen, Celtic, Dundee United, Motherwell, Kilmarnock, Hearts, and Hibs.

My basic approach will be to plot cumulative conversion rate (for/against) vs cumulative shot totals at the match level, and to visually inspect these plots for a point or region where the volatility seems to subside.
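The running totals behind such a plot are easy to build. Here is a minimal Python sketch with four made-up matches; the function name and the numbers are illustrative only.

```python
from itertools import accumulate

def cumulative_conversion(match_shots, match_goals):
    """Return (cumulative shots, cumulative conversion rate) after each match."""
    shots = accumulate(match_shots)
    goals = accumulate(match_goals)
    return [(s, g / s) for s, g in zip(shots, goals) if s > 0]

# Hypothetical shots and goals in four consecutive matches
shots_per_match = [12, 8, 15, 10]
goals_per_match = [2, 0, 1, 2]

curve = cumulative_conversion(shots_per_match, goals_per_match)
for total_shots, rate in curve:
    print(total_shots, round(rate, 3))
```

Plotting the second value against the first, match by match, gives one line per team.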





The take home message from these charts is that it takes a looooooooong time for conversion rate to stabilize to a point where we can reasonably argue for team differences in skill. We are talking about shot numbers in the thousands, not hundreds (2000-3000, depending on whether the shot is for or against).

The exception to this rule is Celtic's conversion rate for. Because they are the biggest club in Scotland with the biggest budget, they generally have much higher quality players. As such, their conversion rate for is distinctly high, even at relatively low shot numbers. This is clear evidence of a skill-based difference in conversion rate.

Celtic aside, these charts highlight the huge importance of random variance (luck) in short term football results. Given that the typical Scottish top flight team will attempt/concede 300-400 shots per season, it is possible that a team could be lucky/unlucky for several seasons in a row, despite common wisdom to the contrary. Also, we should not necessarily expect teams to fully regress to the mean within a single season. Regression to the mean may take many seasons.
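A back-of-the-envelope calculation points the same way. If we treat every shot as an independent attempt with the same chance of going in (a big simplification), the uncertainty in an observed conversion rate is the binomial standard error. The 10% rate below is an assumed round number for illustration, not a figure from my data.

```python
import math

def conversion_standard_error(rate, shots):
    """Binomial standard error of a conversion rate after `shots` attempts,
    assuming independent shots with equal scoring probability."""
    return math.sqrt(rate * (1 - rate) / shots)

assumed_rate = 0.10  # hypothetical goals per shot, NOT taken from the data
for shots in (350, 1000, 3000):
    se = conversion_standard_error(assumed_rate, shots)
    print(shots, round(se, 4))
# 350 0.016
# 1000 0.0095
# 3000 0.0055
```

Under those assumptions, a single season of roughly 350 shots leaves a margin of about plus or minus 3 percentage points (two standard errors) around a 10% rate, which is easily enough to hide real differences between mid-table teams.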

Lastly, conversion rate against appears to take much longer to stabilize than conversion rate for, which may explain why goalkeeper performance is so hard to predict.

Friday, May 1, 2015

How Good are Aberdeen?

Aberdeen have the opportunity to extend the SPFL Premiership title race tomorrow if they can beat Dundee United at Tannadice. The fact that Celtic have not yet won the league is a testament to how good Aberdeen have been this season.

But how good are Aberdeen really? How does the current season compare to previous Aberdeen seasons? How do the Dons compare to other non-Old Firm teams of the recent past?

In this post, I will look at the past 10 years of the Scottish top division to answer these questions.

First, let's see how the current Aberdeen team compares to the previous ten Aberdeen seasons with regard to basic stats versus non-Old Firm teams.

Table 1. Basic Stats



As you can see from Table 1, Aberdeen this season are better than any previous Aberdeen team going back to 2004-05 in every basic attacking and defensive stat, except shots against. This exception is likely a consequence of "score effects," because Aberdeen have taken early leads on several occasions this season, and teams playing from behind shoot a lot.

Table 2. Advanced Stats


Table 2 combines the data from Table 1 into metrics that provide a more advanced view of team performance (click here to see variable definitions).

The take home message from these stats is that Aberdeen this season are more dominant than any previous Aberdeen team since 2004-05. This is indicated by Total Shots Ratio (TSR) and Shots on Target Ratio (SoTR) values of 0.60 and 0.62 respectively.

To put these exceptional numbers into a broader context, consider these facts:

  • Of 102 non-Old Firm top division seasons between 2004-2014...
  • Only 2 achieved a TSR ≥ 0.60 against non-Old Firm teams
    • Hearts, 2012-13, 0.61
    • Hibernian, 2006-07, 0.61
  • Only 3 achieved a SoTR ≥ 0.60 against non-Old Firm teams
    • Hearts, 2005-06, 0.61
    • Hearts, 2011-12, 0.63
    • Hibernian, 2006-07, 0.61
Shot ratios are important because they are the most repeatable aspect of team performance, i.e., least influenced by random variation or luck, both within and between seasons. The fact that Aberdeen's shot ratios are so high this season bodes well for next season.

So how good are the Dons this season? Really fucking good.

ICT Analytics: The Terry Butcher Years

It was announced yesterday that former England captain and more recently Hibernian manager Terry Butcher will become the new manager at League Two side Newport County AFC in England.

Because I have the data and because some Newport County supporters might be interested (I know at least one), I thought I would post some stats for Butcher's top flight years at Inverness Caledonian Thistle (ICT).

As you can see from the tables and figure below, Butcher enjoyed consistent success at ICT, with the Jags showing steady improvement in Shots on Target Ratio under the Englishman.

Newport County fans can expect moderate productivity in terms of shots and goals. So not the most attractive approach, but effective nonetheless in terms of points per game; roughly one win every two games is not a bad return.


Analytics of the Terry Butcher years at ICT
Excluding matches against the Old Firm


Wednesday, April 29, 2015

An Analytical Perspective on Referee Bias

Celtic supporters have a reputation for being conspiracy theorists when it comes to referees.

I'll admit, it's hard not to think referees are biased when an opposing defender deliberately uses his hand to stop a ball in the box and two referees looking directly at the incident fail to award a penalty, which contributes to Celtic losing the game and a chance at the treble!

There are countless other incidents and decisions that have gone against Celtic over the years. According to some supporters, these examples constitute a body of evidence in support of the view that referees in Scotland are biased against Celtic.

But anecdotes such as these are hardly objective evidence, and there is always the potential for confirmation bias in the use of such anecdotes.

Fortunately, there is ample publicly available data on Scottish referee decisions at sites like football-data.co.uk. So we don't have to rely just on anecdotes; we can assess the patterns rigorously and quantitatively with analytics.

I will focus in this post on fouls because awarding fouls is the most frequent referee action. As such, fouls give us the best opportunity to detect bias (signal), as opposed to random variation (noise). Bottom line: if we can't find evidence of referee bias in fouls, we are unlikely to find it in cards or penalties, which have much smaller sample sizes.

The metric I will focus on is Fouls Ratio, or

Fouls Conceded/(Fouls Conceded + Fouls Awarded).

Teams that tend to concede more fouls than their opponents will have higher Fouls Ratios than teams that tend to concede fewer fouls. Note, there is no single expected value for Fouls Ratio, such as 0.50. The value will depend on the strategies, tactics, and discipline level of each team.
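In code, the metric is a one-liner. The Python sketch below uses a hypothetical function name and a made-up match in which a team concedes 11 fouls and is awarded 14.

```python
def fouls_ratio(fouls_conceded, fouls_awarded):
    """Share of all fouls in a match that were given against the team."""
    total = fouls_conceded + fouls_awarded
    if total == 0:
        raise ValueError("no fouls recorded in this match")
    return fouls_conceded / total

# Hypothetical match: 11 fouls conceded, 14 fouls awarded
print(round(fouls_ratio(11, 14), 2))  # 0.44
```

A value below 0.50 means the team was whistled less often than its opponent in that match.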

Now for the obligatory colorful charts.


Figure 1

Figure 2


Figures 1 and 2 plot the Fouls Ratio for each league match for both Celtic and Rangers going back to the 2000-01 season. As you can see, whether the match is home or away, the pattern looks completely random, with no discernible trends or predictable oscillations.

Also notice that the lines overlap substantially between Celtic and Rangers. The mean Fouls Ratios for Celtic, home and away, are 0.44 and 0.47 respectively, compared to 0.45 and 0.49 for Rangers. These relatively minor differences are nonetheless statistically significant in the case of away matches.

Thus, if anything, referees appear to be biased against Rangers, not Celtic. However, I suspect these differences reflect real differences in discipline between the two clubs, rather than referee bias. It's important to note also that both Celtic and Rangers have below average Fouls Ratios compared to the league as a whole (pro Old Firm bias anyone?).

So there doesn't appear to be a detectable bias against Celtic in fouls overall, but what about individual referees?

The next two charts plot Fouls Ratios for Celtic and Rangers by referee while taking into consideration the number of games each referee has worked. The number of games worked is essential information because random variation in Fouls Ratio is expected to decrease as the number of games increases. Indeed, this appears to be the case.
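The grouping behind a chart like this can be sketched in a few lines of Python. Everything here is illustrative: the referee labels and foul counts are invented, and averaging the per-match ratios is one reasonable choice that I am assuming for the example (pooling the fouls across games would be another).

```python
from collections import defaultdict

def fouls_ratio_by_referee(matches):
    """matches: iterable of (referee, fouls_conceded, fouls_awarded).

    Returns {referee: (games_worked, mean per-match Fouls Ratio)}.
    """
    ratios = defaultdict(list)
    for referee, conceded, awarded in matches:
        if conceded + awarded == 0:
            continue  # skip matches with no recorded fouls
        ratios[referee].append(conceded / (conceded + awarded))
    return {ref: (len(r), sum(r) / len(r)) for ref, r in ratios.items()}

# Hypothetical matches for one club
sample = [
    ("Referee A", 10, 10),
    ("Referee A", 9, 15),
    ("Referee B", 12, 8),
]
summary = fouls_ratio_by_referee(sample)
for referee, (games, mean_ratio) in summary.items():
    print(referee, games, round(mean_ratio, 3))
```

Plotting the mean ratio against games worked gives one point per referee, and inconsistent spellings of the same name in the source file will show up as separate points.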


Figure 3



Figure 4


As you can see from the charts above, Fouls Ratios tend to converge to a similar value for both clubs as the number of games a referee has worked increases. This is an example of regression to the mean. No individual referee biases can be detected in these plots for referees that have worked 10 or more games. All of the referees tend to converge on a single value, which is the mean Fouls Ratio for the team, be it Celtic or Rangers.

It's interesting to note that because of typos in the source files, some of the data points represent the same referee (i.e., Hugh Dallas, H Dallas). Because of these data entry errors, we can clearly see how Fouls Ratios regress to the mean as the number of games worked increases.

Take-home message: there is no detectable evidence of bias against Celtic by referees in Scotland, at least with regard to the proportion of fouls conceded in league matches between 2000 and 2014.

While there may be some referees who hold personal biases against Celtic, or other clubs, these biases do not appear to influence their behavior on the pitch in any reliable way.

Moreover, given the randomness of variation in Fouls Ratio over time, it appears that refereeing decisions do even out in the long run.

Friday, April 24, 2015

The Most Underrated Team in Scotland

Scottish Cup finalists Inverness Caledonian Thistle (ICT) play former SPFL Premiership "title challengers" Aberdeen tomorrow in the Highlands.

A few of us have been talking up ICT on Twitter for a while now. I was even ridiculed by one Twitter pundit for daring to suggest that ICT are just as good as Aberdeen, if not better. 

When ICT subsequently drew and then won against Celtic in the Premiership and Cup respectively, I admit I did feel more than a bit vindicated (obviously my ego trumps my club allegiance).

Looking at the numbers behind the Premiership season so far, I have concluded that ICT are the most underrated team in Scotland, and in this post I'll explain why.

The table below presents performance data for both sides during Aberdeen's "title challenge," from the start of the New Year until their 4-0 defeat to Celtic. As usual, I will focus on the two main drivers of goal dominance, Shots on Target Ratio (SoTR) and PDO, as well as their components (click here to see how these variables are defined).




As you can see from the table, with regard to shots on target (SoT), ICT were actually better than Aberdeen overall during this period of the season; the Jags had a higher SoTR driven by a superior defense (lower SoT conceded per game). Interestingly, despite Aberdeen's much vaunted attack, the number of goals scored per SoT (score rate) was virtually identical for both teams.

The only aspect of performance in which Aberdeen were significantly better than ICT was save rate (the share of SoT conceded that did not result in a goal). The Dons' keeper Scott Brown was as good as Craig Gordon (Celtic) or Alan Mannus (St. Johnstone) during this period. However, unlike Gordon and Mannus, Brown was not able to sustain such a high level of performance, and was subsequently dropped for Jamie Langfield. Ouch.

In summary, during Aberdeen's two-month title challenge, ICT performed equal to or better than the Dons in every important performance metric, except save rate. So why were/are the Jags so underrated by pundits and supporters alike?

Here are three possible reasons that I'm throwing out there as food for thought rather than definitive answers.

  1. The Table-Never-Lies Fallacy: This is the mistaken idea that results are a true measure of a team's quality. Aberdeen are in 2nd place and ICT in 3rd, so the Dons must be better than the Jags, right? Wrong. Football doesn't work that way. There is too much luck involved. If you don't believe me, then I suggest you read The Numbers Game by Anderson and Sally.
  2. The Wee-Club Bias: ICT are a small, provincial club, one of only two Highland clubs in the SPFL Premiership. Aberdeen, in comparison, are a much larger club, part of the so-called New Firm (along with Dundee United). Bigger is better, right? Not necessarily.
  3. Wengeritis: This is a mental disease that afflicts many football supporters. It leads to the dogmatic view that "attractive" football is superior to "winning ugly." ICT have been described as "dire" by some afflicted with Wengeritis. Although, to be fair, they do play a form of attacking football, just not the one preferred by some proponents of the Beautiful Game apparently. Aberdeen, in contrast, have been playing some very attractive, free flowing football. Thus, in the minds of some purists, ICT can never be as good as Aberdeen, because the Dons play the "right" way and ICT play the "wrong" way.
Again, these cheeky explanations for ICT's underrated status are meant only as food for thought. 

Here's another morsel for your mind: Who do you think will do better in Europe next season, ICT or Aberdeen? I know who I'm picking.




Friday, March 13, 2015

Dundee United's improbable scoring rate

The League Cup Final match-up this weekend pits the two most potent offenses in the Scottish Premiership against each other.

Celtic have scored 56 goals, the most in the top-flight, while Dundee United are second best at 49 goals.

The attacking effectiveness chart below illustrates the very different paths taken by Celtic and United in achieving their current goal tally (dashed lines are current league medians).


As you can see from the chart, United have above average shot accuracy (SoT per Shot) and scoring rate (Goals per SoT), but average total shots.

In contrast, Celtic have below average accuracy, above average scoring rate, and above average total shots (in fact, no Celtic team since 2000 has taken as many shots at the same stage of the season).

While both teams have above average scoring rates, United's rate of 0.398 goals per shot on target is remarkably high.
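The three quantities on the chart fit together neatly: goals equal total shots times accuracy times scoring rate. The Python sketch below shows the decomposition with made-up totals (they are not Celtic's or United's numbers), and the function name is my own.

```python
def attack_profile(shots, shots_on_target, goals):
    """Split a goal tally into shot volume, accuracy and scoring rate."""
    accuracy = shots_on_target / shots        # SoT per shot
    scoring_rate = goals / shots_on_target    # goals per SoT
    return accuracy, scoring_rate

# Hypothetical season-to-date totals
shots, shots_on_target, goals = 320, 120, 42
accuracy, scoring_rate = attack_profile(shots, shots_on_target, goals)

print(round(accuracy, 3))      # 0.375
print(round(scoring_rate, 3))  # 0.35
# Multiplying the pieces back together recovers the goal tally
print(round(shots * accuracy * scoring_rate))  # 42
```

This is why two teams can reach similar goal totals by different routes: one through sheer shot volume, the other through accuracy and finishing.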

The chart below shows the distribution of scoring rates through 26 games in the Scottish top-flight between 2004-2014. United's previous rates are also plotted (black dots), along with their current rate (red dot).



As you can see, United's current scoring rate is at the upper extreme of the distribution. It is higher than any United team since 2004. In fact, only two (out of 120) teams since 2004 have had higher scoring rates through 26 games, Celtic 2005-06 and Kilmarnock 2005-06. Thus, 98.4% of the distribution is less than the current United scoring rate.

Most analysts would regard this exceptional level of finishing as unsustainable, and perhaps even lucky. Scoring rate is after all one of the two components of PDO, which is known to be highly volatile and unpredictable.

Hopefully, if you're a Celtic fan, United's luck will run out on Sunday.


Tuesday, February 10, 2015

Score effects in European football

It has been documented previously that the score of a match, or game state, influences shot differentials/ratios. Specifically, when a match is close (1 goal difference or less), the team trailing tends to out-shoot the opposition, and the team in the lead tends to get out-shot.

Such "score effects" have been attributed to tactics; teams with a narrow lead tend to go into a defensive shell to protect what little they have, while teams trailing by a narrow margin desperately attempt to secure an equalizer by becoming more attacking.

In this post, I examine score effects in 6 top-flight European football leagues this season: English Premier League, La Liga Primera Division, Bundesliga, Serie A, Ligue 1, and of course the Scottish Premiership.

Specifically, I will look at how half-time (HT) score influences full-time (FT) shot differentials in matches that end close. All of the 2014-15 data used in this post were downloaded from football-data.co.uk and analyzed with R.
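I did the analysis in R, but the grouping step can be sketched in Python as follows. Treat it as an illustration rather than my exact method: I am assuming the football-data.co.uk column names (HTHG/HTAG for half-time goals, FTHG/FTAG for full-time goals, HS/AS for shots), counting each match once from each team's point of view, and using four invented matches.

```python
from collections import defaultdict
from statistics import median

def shot_diff_by_ht_state(matches):
    """Median full-time shot differential for each half-time goal difference,
    using only matches that finish within one goal."""
    by_state = defaultdict(list)
    for m in matches:
        if abs(m["FTHG"] - m["FTAG"]) > 1:
            continue  # keep only matches that end close
        ht_diff = m["HTHG"] - m["HTAG"]
        shot_diff = m["HS"] - m["AS"]
        by_state[ht_diff].append(shot_diff)    # home team's view
        by_state[-ht_diff].append(-shot_diff)  # away team's view
    return {state: median(diffs) for state, diffs in sorted(by_state.items())}

# Hypothetical matches, not real results
matches = [
    {"HTHG": 1, "HTAG": 0, "FTHG": 1, "FTAG": 1, "HS": 8, "AS": 14},
    {"HTHG": 0, "HTAG": 0, "FTHG": 1, "FTAG": 0, "HS": 12, "AS": 10},
    {"HTHG": 0, "HTAG": 1, "FTHG": 0, "FTAG": 3, "HS": 11, "AS": 13},
    {"HTHG": 0, "HTAG": 1, "FTHG": 1, "FTAG": 2, "HS": 15, "AS": 9},
]
result = shot_diff_by_ht_state(matches)
print(result)
```

In this toy data the third match is dropped because it ends 0-3, and the teams trailing by one at HT finish with a positive median shot differential, which is the classic score-effects pattern.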

Let's start with the English Premier League. Note: the thick line in the middle of each box is the median shot differential for teams in that game state at HT.


As you can see, the data follow the expected pattern: teams down 1 goal at HT tend to out-shoot the opposition by FT, and teams in the lead at HT usually end up with negative shot differentials.

This pattern matches the results of previous studies based on more detailed game state data for the Premier League.

From what I can tell, it seems like very little work has been published on score effects outside of the Premier League. So I thought it would be interesting to see if the same pattern holds in other top-flight European leagues.

Below are box plots for La Liga, Bundesliga, Serie A, and Ligue 1.






Remarkably, these charts all show the same pattern as the Premier League; in close games, teams in the lead at HT tend to play more defensively, resulting in negative shot differentials at FT. Conversely, teams down by 1 at HT tend to go on the attack, and end up out-shooting the opposition by FT.

Given this evidence, score effects would appear to be ubiquitous in European football. However, when I examined the data for the Scottish Premiership, I discovered an unexpected pattern.


Unlike the other European leagues, teams up 1 goal at HT in the SPFL typically end up with positive shot differentials at FT. Conversely, SPFL teams down 1 goal at HT typically end up with negative shot differentials.

This pattern holds even if you remove the small number of close matches Celtic have played, and if you analyze home and away teams separately.

So what is going on in Scotland?

It would appear that SPFL teams use different tactics than teams in other European leagues, especially when leading by 1 goal at HT. Instead of going into a defensive shell to hold onto a narrow lead, teams in Scotland typically continue to attack, to try to get a second or third goal.

It wasn't this way last year, as the box plot below illustrates.


Last year, the SPFL followed the typical pattern seen in other European leagues, with teams down 1 goal at HT going on the attack more than teams up by 1 at HT.

Many in Scottish football regard the current season as one of the most competitive and entertaining in recent memory, certainly in the post-Rangers era. Part of that perception may be due to an attacking style of play that is unusual these days in European football.