Over the past couple of weeks, Fantasy Football Scout (or at least, the geeky corner of it in which I reside) has been busy discussing the relative merits of chance conversion rate, and other predictive measures of goalscoring performance.
The first article I wrote on this topic highlighted a number of teams who had underperformed last season in terms of goals scored based on their number of chances, in particular Southampton and Manchester United.
These teams generated chance-conversion (CC) rates that were more than one standard deviation below the league mean – yet both had performed at or above the league average in previous seasons.
In looking at how repeatable chance conversion rate is from year to year, I found that CC rate in 2015/16 was wholly uncorrelated (R²=0.002) with CC rate in 2016/17.
But this was a small sample size from which to work.
As noted by Deulofail (the first of many members I would like to thank for their input in advancing this discussion), using a running three-year tally of team chance conversion rate should, in theory, yield better results.
This hypothesis held up: over longer periods of time, team-level chance conversion rate is somewhat more predictive of future performance. For example, cumulative chance conversion rate from 2013/14 to 2015/16 (a three-season sample) had a correlation coefficient of 0.37 with data from the 2016/17 season. Whilst far from a strongly predictive signal, this number is worth bearing in mind in future discussions on this topic.
The impact of chance quality
One side-conversation that grew from these datasets (thanks here to Green Windmill, Giggs Boson, and Doosra, among others, for their input) related to the types of shots teams were taking. For example, were some sides tactically disposed to take longer-range shots and, if so, did this affect their chance conversion rate on a team level?
To test this, I ran a series of linear regressions and Spearman’s rho calculations to assess whether teams who took the most shots from outside of the box also had lower chance conversion rates as a result. To guard against single-season biases, I collected data across the past four seasons, presented below:
Season | Coefficient of determination (R²) | Spearman’s Rho (ρ) |
2016/17 | 0.1 | 0.34 (two-tailed p=0.18) |
2015/16 | 0.27 | 0.63 (p=0.01) |
2014/15 | 0.17 | 0.25 (p=0.38) |
2013/14 | 0.12 | 0.23 (p=0.42) |
As demonstrated, shot distance on the team level appears to have a consistent but relatively small negative relationship with chance conversion rate: as a greater percentage of shots are taken from long range, chance conversion rate declines, and vice versa.
However, this effect is considerably smaller than I had initially expected to find: in only one season (2015/16) was the relationship between distance-shooting percentage and chance conversion rate statistically significant, even at a lenient p<0.1 threshold. There’s clearly more to this equation than shot location alone.
The ‘Big Six’ Effect
A second conversation related to overall team quality. Specifically, did top Premier League sides convert chances at a better rate than others – due either to their generating better-quality chances, or possessing strikers who were more clinical finishers?
To test this, I divided my four-year PL sample into two groups: the “Big Six” (Arsenal, Chelsea, Liverpool, Manchester City, Manchester United, Tottenham) and the eight other sides who have featured in the PL across each of the last 4 years (Crystal Palace, Everton, Southampton, Stoke, Sunderland, Swansea, West Brom, West Ham).
Here are their relative chance conversion rates over the past four seasons:
Season | “Big Six” | Other sides | “Big Six” effect |
2016/17 | 12.5% | 10.7% | +1.8% |
2015/16 | 11% | 10.4% | +0.6% |
2014/15 | 11.5% | 9.7% | +1.8% |
2013/14 | 9.9% | 9.5% | +0.4% |
In each of the past four seasons, the so-called “Big Six” sides have converted chances at a higher rate than the eight other ever-present sides. However, the size of this gap has varied: in 2013/14 and 2015/16, the two groups were much closer in terms of performance than last season.
This demonstrates that we should recalibrate our expectations for a side’s chance conversion rate, to some degree, based on the quality of the team’s players.
To me, this finding provides yet further evidence in favour of a Manchester United bounceback heading into next season – their 9.1% chance conversion rate looks especially poor in light of these numbers.
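The arithmetic behind the “Big Six” effect is simple enough to reproduce. The short Python sketch below uses only figures quoted in this article (the four-season table and Manchester United’s 9.1%); the variable names are my own. Note that the gaps are differences between two percentages, so strictly speaking they are percentage points.

```python
from statistics import mean

# Conversion rates (%) from the table above: season -> (Big Six, other sides)
rates = {
    "2016/17": (12.5, 10.7),
    "2015/16": (11.0, 10.4),
    "2014/15": (11.5, 9.7),
    "2013/14": (9.9, 9.5),
}

# Gap in percentage points, rounded to one decimal place like the table
effects = {
    season: round(big_six - others, 1)
    for season, (big_six, others) in rates.items()
}
average_effect = mean(effects.values())

# Manchester United's 2016/17 rate (9.1%) against the Big Six figure for that season
united_gap = round(rates["2016/17"][0] - 9.1, 1)

for season, effect in effects.items():
    print(f"{season}: +{effect} points")
print(f"Four-season average: +{average_effect:.2f} points")
print(f"United shortfall vs Big Six, 2016/17: {united_gap} points")
```

Across the four seasons the gap averages about 1.15 percentage points, while United’s 2016/17 rate sat 3.4 points below the “Big Six” figure for that season.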
Chance conversion rate on the individual level
One significant problem with using team-level data to assess chance conversion rate is the ratio of signal to noise (thanks to Twisted Saltergater in particular for insightful comments on this issue, and interesting data on baseball analytics).
So many potentially obfuscating factors impact upon the ability of a team to create, and eventually to convert, goalscoring chances. Moreover, year-to-year changes are more significant on the level of the team than on the level of an individual player – making future performance more difficult to predict from past results.
There is less statistical noise – though obviously still a significant degree of inexactitude – when analysing the performance of single players over multiple seasons.
First and foremost, how predictable is single-player chance conversion rate from one season to the next? The good news would seem to be that the relationship is stronger than on the team level. Considering all FPL players who attempted at least 50 shots in the Premier League last season, the correlation coefficient between their 2016/17 and 2015/16 performance levels is 0.21 (much more convincing than the 0.002 on the team level).
If we also include data from the 2014/15 season – i.e. using the average of a player’s preceding two seasons to predict his chance conversion rate in the third – that coefficient increases to 0.31.
In practical terms, this dataset can also help us to identify outliers in terms of player performance. For example, let’s use these numbers to highlight some of the Premier League’s most proficient finishers over the past three seasons. Listed below are the five players whose three-year average plotted more than one standard deviation above the mean performance of the 35 most prolific players over this timeframe.
Player | Average shots | Average goals | Mean CC rate (%) |
Diego Costa | 85 | 17 | 20.6 |
Harry Kane | 127 | 25 | 20.3 |
Jamie Vardy | 72 | 14 | 18.6 |
Sadio Mané | 68 | 11 | 17.4 |
Sergio Aguero | 135 | 23 | 17.4 |
This dataset seems to do a pretty good job of identifying players whom the eye-test would also mark out as being clinical finishers.
The presence of Sadio Mané on that list is very interesting – with a three-year chance conversion rate to match Sergio Aguero, the Senegalese international has shown both consistency and quality in front of goal over an extended period of time. That could be tempting for fantasy managers looking for a goalscoring midfield option heading into next season.
And what about the players who shoot a lot, but whose individual CC rates are more than one standard deviation below the mean? Note: Dimitri Payet initially qualified for this list (8.3%), but as he is no longer in the Premier League I have removed him and inserted the next-worst player in his place.
Player | Average shots | Average goals | Mean CC rate (%) |
Ross Barkley | 78 | 5 | 5 |
Christian Eriksen | 110 | 8 | 7.4 |
Philippe Coutinho | 107 | 9 | 8.1 |
Dusan Tadic | 58 | 5 | 8.2 |
Marko Arnautovic | 58 | 6 | 9.3 |
Again, I would argue, these numbers do a good job of highlighting players whose style of play is centred on low-percentage, long-range efforts on goal.
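The “more than one standard deviation from the mean” filter used for both tables can be sketched as follows. The player labels and rates here are hypothetical stand-ins, since the full 35-player sample is not reproduced in this article, and the use of the sample standard deviation is an assumption on my part. The function name `flag_outliers` is likewise my own.

```python
from statistics import mean, stdev


def flag_outliers(rates: dict[str, float], threshold: float = 1.0):
    """Split players into those more than `threshold` standard deviations
    above the mean rate and those more than `threshold` below it."""
    values = list(rates.values())
    mu = mean(values)
    sigma = stdev(values)  # ASSUMPTION: sample (n - 1) standard deviation
    above = sorted(p for p, r in rates.items() if (r - mu) / sigma > threshold)
    below = sorted(p for p, r in rates.items() if (r - mu) / sigma < -threshold)
    return above, below


# HYPOTHETICAL three-year mean CC rates (%)
sample = {
    "Player A": 20.0,
    "Player B": 12.0,
    "Player C": 11.0,
    "Player D": 13.0,
    "Player E": 12.5,
    "Player F": 5.0,
    "Player G": 11.5,
}

clinical, wasteful = flag_outliers(sample)
print("Above +1 SD:", clinical)
print("Below -1 SD:", wasteful)
```

With a sample this small, a single extreme player shifts both the mean and the standard deviation, so the one-standard-deviation cut-off is best read as a rough screen rather than a formal test.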
In fact, one player who was invoked time and again during discussions on this matter was Christian Eriksen, whose chance conversion rate was among the worst of all high-usage players last season. In his enlightening article, Spreadsheet made the point that Eriksen’s chance conversion rate was due to regress towards the mean this time around. While that is distinctly possible, I would argue that the numbers suggest otherwise: we now have three successive seasons of data which demonstrate that Eriksen is a low-percentage chance converter. In other words, a low chance conversion rate is already his baseline projected performance.
What Does It All Mean?
One elusive point which I think often gets lost within conversations on statistical projection – whether here or elsewhere – is the practical application of these numbers in the process of decision making. I will be the first to admit that statistical projections are a loose, vague, and imprecise measure of a player’s footballing (and more importantly, to us at least, fantasy) ability. There will always be players who come from left field to impress; just as there will always be players whose sexy underlying numbers will forever flatter to deceive (here’s to you, Dusan Tadic – I’ll see you in my team come Gameweek 1).
Yet there are equally cases where statistical underperformance has enabled fantasy managers to jump early upon some emerging stars: who could forget our statistical crushes on Riyad Mahrez and Sadio Mané heading into 2015/16?
Scepticism of anybody spouting off endless reams of statistical analysis to support or defend a player is healthy.
As Individual and I have often joked, numbers are a very flexible concept. To illustrate this point, and to round out this article, I’d like to offer you a choice:
Player A had an incredible year last season: finishing in the top-10 midfielders in terms of assists and BPS. Furthermore, he almost doubled his goal output from the previous year, a product of generating more goal attempts and big chances than any other season in his career. Still only 21 years of age, and having risen in price by a full 1.0 this time around, this prospect could prove to be a generational cornerstone for both club and country.
For Player B, meanwhile, things aren’t looking so bright. Despite playing in an advanced midfield role, his number of assists and total chances created both fell relative to 2015/16. He averaged fewer touches per minute than any previous year. Worse still, other midfielders around him seem to have picked up the slack: two of his teammates finished in the top 10 midfielders for fantasy points last year. His potential for explosive returns, particularly early in the season, may be muted: despite a high opening price of 8.5 last season, he produced only 1 double-digit return across the first 17 gameweeks. Nor did he end the season well – producing three blanks and only a single bonus point across the last six games.
Player A, as you might have guessed, is rising superstar Dele Alli.
As for Player B? Also Dele Alli.
Numbers can be misleading.
How much you choose to believe in the power of statistical measures as a means of projecting player performance is ultimately each manager’s personal choice.
There will be many, no doubt, whose minds jump upon reading such analyses to the words popularised by Mark Twain: “there are lies, damned lies, and statistics”. In many cases, such people are probably right. Yet I prefer a different mantra, courtesy of statistician George Box: “all models are wrong, but some are useful.”
Thank you once again to everybody who has engaged with these discussions over the past couple of weeks. Though I’ve tried to pull together the questions and conversation points that have been most prominently discussed of late, this list is by no means exhaustive or definitive. I’ve enjoyed the discussions immensely – and hope to find myself sifting through tables of data for many months to come!
Thanks for this- another fascinating insight.
Also we are delighted to say that Prokoptas is joining our team of writers, to pen more detailed statistical analysis - starting with a number crunching review of the Community Shield.