Expected goals part two: the pros and cons for FPL

Following on from “Expected Goals, Part One”, I would like to touch on the pros and cons of xG and its use in FPL. Hopefully it prompts some discussion within the community and helps give a better perspective in some areas.

xG as a predictive model

Taking all of the information from Part One, what we get is a method of showing the sum total of shots and their quality in a given period (one match, series of matches, etc). For example, if a team had an xG rating of 1.89 from all of their shots in a particular football match but only scored one goal, we can conclude that they underperformed in terms of their xG.

If the xG showed 1.02, then we can conclude it was a somewhat fair representation of the chances they were presented with. If they scored one goal from a value of 0.25 then we can conclude that they overperformed in relation to their chance(s). So can we take these findings from the xG calculator and use them to try to predict future outcomes? In essence, no. We cannot use an xG calculator as a predictive tool on its own, and here is why.

As a predictive model, many have tried to show correlations between Expected Goals per 90 minutes (xG/90) and Actual Goals per 90 minutes (aG/90), often using very few data points in their findings. They typically fit a linear regression model and quantify how well it fits using a statistic called R-squared, which runs on a scale from 0 to 1 (1 being a perfect fit and 0 meaning xG explains none of the variation in goals).
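To make the R-squared idea concrete, here is a minimal Python sketch of a least-squares fit of aG/90 on xG/90. The six data points and the function name `fit_and_r_squared` are invented purely for illustration; they are not real player figures.

```python
def fit_and_r_squared(xs, ys):
    """Fit y = slope * x + intercept by least squares; return (slope, intercept, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (unexplained variation / total variation)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical xG/90 and aG/90 for six made-up players (assumption, not real data).
xg_per_90 = [0.20, 0.35, 0.40, 0.55, 0.60, 0.75]
ag_per_90 = [0.15, 0.45, 0.30, 0.50, 0.70, 0.65]

slope, intercept, r2 = fit_and_r_squared(xg_per_90, ag_per_90)
print(f"aG/90 = {slope:.2f} * xG/90 + {intercept:.2f}, R-squared = {r2:.2f}")
```

The closer the printed R-squared is to 1, the more of the variation in actual goals the fitted line accounts for, at least for the handful of points it was given.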

The number of data points used in many xG regression models is so small (maybe 10-60) that, once again, the results can be highly inaccurate, from a statistical point of view anyway. So what does this mean? Can we use xG as a predictive model at all? Well, not without first looking at factors outside the realms of xG.
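A rough way to see why so few data points are a problem is to simulate it. The sketch below assumes a toy world in which aG/90 is simply xG/90 plus random finishing noise, then measures how much R-squared swings between repeated samples of different sizes. The noise level, the xG range, the seed and the function names are all assumptions chosen for illustration.

```python
import random

def r_squared(xs, ys):
    """Squared Pearson correlation, which equals R-squared for a one-variable linear fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)

def r_squared_range(sample_size, trials, rng):
    """Return the lowest and highest R-squared seen across repeated random samples."""
    results = []
    for _ in range(trials):
        # Assumed toy world: aG/90 is xG/90 plus random finishing noise.
        xg = [rng.uniform(0.1, 0.8) for _ in range(sample_size)]
        ag = [x + rng.gauss(0, 0.15) for x in xg]
        results.append(r_squared(xg, ag))
    return min(results), max(results)

rng = random.Random(2019)  # fixed seed so the run is repeatable
for size in (10, 60, 500):
    low, high = r_squared_range(size, trials=200, rng=rng)
    print(f"{size:>3} data points: R-squared between {low:.2f} and {high:.2f}")
```

Even though the underlying relationship never changes in this toy set-up, the smallest samples should show a far wider spread of R-squared values than the largest ones, which is the sense in which a result from 10-60 points can mislead.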

The xG calculator can only portray what is happening. In isolation, it is not a predictive tool. The calculator is not saying “Liverpool scored one goal from an xG of 4.5, hence they underperformed and will correct this next week by scoring four goals”. These kinds of assumptions are made by FPL managers, not by xG calculators. If we really wish to try to use xG as a predictive tool, then we need to first abandon the xG calculator and look at a wider range of factors in play.

One particular factor to examine is the nature of the shots. How many shots were there in a 1-0 game with a rating of 0.90 xG? Perhaps there were 10 mediocre shots, each with an xG of 0.09, and one happened to go in. Perhaps there were two close-range chances of 0.45 each and one went in. Context and clarity are essential when trying to work out a predictive model, otherwise it becomes very misleading and problematic for those of us who wish to try to apply it to FPL (particularly when picking players).
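The two 0.90 xG scenarios can be compared directly. The sketch below treats every shot as an independent event, which is a simplifying assumption, and the function name `prob_at_least_one_goal` is made up for this example.

```python
from math import prod

def prob_at_least_one_goal(shot_xgs):
    """Chance of scoring at least once, treating each shot as an independent event."""
    return 1 - prod(1 - xg for xg in shot_xgs)

ten_speculative = [0.09] * 10   # ten mediocre shots
two_close_range = [0.45] * 2    # two close-range chances

for label, shots in (("10 x 0.09", ten_speculative), ("2 x 0.45", two_close_range)):
    print(f"{label}: total xG {sum(shots):.2f}, "
          f"P(at least one goal) {prob_at_least_one_goal(shots):.2f}")
```

Both sets of shots add up to the same 0.90 xG, yet under this assumption the ten speculative efforts give roughly a 61% chance of at least one goal while the two close-range chances give roughly 70%. The same headline number can hide quite different matches.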

There are many other factors at play if we perform some meta-analysis of an xG graph. A player may be over-performing in terms of xG. For example, Harry Kane may have 30 goals in a season when his xG says he should have 25 goals based on the quality of his shots. Is this good or bad? Well, it may be a case of a high conversion rate that didn’t regress to an expected mean. This is where the metric of ‘shots on target’ comes into play to help us get a better understanding of individual players’ performance and conversion rates. We can look at historical examples of a particular player’s conversion rates and whether he tends to overperform, underperform or return close to the mean. In doing so, we can gain a better idea of sustainability, using that data in conjunction with what the xG calculator is telling us.

Think of it like this. We mentioned earlier that a free-kick has a conversion rate between 0.05 and 0.06 (5-6%) on an xG calculator. Let us assume a 30-yard free-kick yields a 5% conversion rate, or a 1 in 20 chance. Yet, over the long term, I would expect Lionel Messi to perform better on 100 of these free-kicks than “John” from Sunday morning five-a-side. So player quality has a huge impact, both for the person taking the shot and the person between the posts attempting to save it. The xG calculator can only do its job and give us the figures based on the information it receives. If this computer cannot be used as a predictive model, then we must consult another computer to do so: the human brain.

Again, going beyond the realms of xG, other factors need to be taken into account, including injuries, changes in personnel, additional competitions, changes in morale and rotation. An xG calculator tells us what has happened in a match in terms of shot quality and the chance of converting those shots. If Man City produce an xG of 4.2 in a given game, and a key player like Kevin De Bruyne picks up an injury midweek, history suggests it will likely have a huge impact on the number of chances created in the next game. If we wish to attempt to use xG as part of a predictive model in FPL, we need to go beyond isolated xG and look at external variables that the calculator cannot (and should not have to) take into account.

For example, if we look at a team over their last six games and notice they had a high xG but low conversion rate during a tough run of fixtures, can we expect more goals in the future when their schedule improves? This is not a question that the xG calculator can answer, hence we cannot use it to predict what might happen. The xG calculator tells us what was happening without the context of 1) the playing style during this period and 2) the opposition. What we need to do as FPL managers is to go beyond this and look at the reasons why something happened or did not happen as depicted by the xG graph.

Perhaps the team played a counter-attacking game against bigger sides and suffered when playing more openly against teams of a similar stature, e.g. Wolverhampton last season. Or vice versa. As mentioned in the first paragraph of this section, we need to know a variety of other factors: the context of the game. Did a defender slip and allow a huge chance that was saved? Is it likely an error of this calibre happens again? Was there a goal that was wrongly given due to an undetected offside? Were there countless shots from outside the box due to a stubborn, deep-lying defence? Perhaps the opposition were winning 3-0 and sat back for the end of the game, allowing the trailing team to create chances for a period? Context is vital.

The problem with rebound goals

Somewhat linked to the point above, a fellow FPL enthusiast (Joe Greenwood, aka Scoredlario) raised an issue that is a stumbling block for the xG calculator: rebound goals.

As a side note, I discussed this next section with Will Timbers, a.k.a. TopMarx, and had a great conversation that revealed a lot about xG in general to me. What I would like to say is that this next part depends on what you define xG as. If you view “Expected Goals” as a means of telling the tale of a football match and how many goals could have realistically been scored, then you might agree with me. If, however, you view it as a means of mapping the shot quality of all of the shots in a football match, then you will probably side with TopMarx.

So, let us take a scenario where a player is one-on-one with the goalkeeper. The attacker shoots from a few yards out and the shot is saved by the keeper and parried away. Another attacker runs onto the rebound and scores. In essence, the calculator is measuring the xG from the first attempt (that was saved) and is combining it with the xG of the second attempt (that resulted in the goal). It treats both actions of that phase as separate shots, but does not take into account that the second shot would not exist if the first attempt had been converted.

Basically, the problem here is that we have a combined xG of greater than 1.0 from a single phase of play. The one-on-one attempt might be, for example, 0.6, and the rebound might be 0.8 (sum 1.4 xG). This is impossible in reality, because we can’t score more than one goal from a single phase of play. The rebounded shot never exists if the first attempt is converted. In this situation, perhaps calculating the odds of missing both chances would be a better way of looking at it, rather than adding the probabilities of scoring each chance onto an xG graph.
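The "odds of missing both chances" idea can be sketched in a few lines of Python. The function name `phase_xg` is hypothetical, and the sketch assumes each later shot in a phase only happens if every earlier shot was missed, and that each shot's xG is the chance of scoring it once it is taken.

```python
def phase_xg(shot_xgs):
    """Probability that a single phase of play ends in a goal.

    Each later shot is assumed to happen only if every earlier shot
    in the phase was missed, so the result can never exceed 1.0.
    """
    prob_all_missed = 1.0
    for xg in shot_xgs:
        prob_all_missed *= 1 - xg
    return 1 - prob_all_missed

one_on_one, rebound = 0.6, 0.8
print(f"Summed xG:   {one_on_one + rebound:.2f}")
print(f"Phase-level: {phase_xg([one_on_one, rebound]):.2f}")
```

With these figures the chance of missing both is 0.4 x 0.2 = 0.08, so the phase is worth 0.92 rather than 1.4. The same function accepts any number of shots, so a save followed by two rebounds can be handled in the same way.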

Let us expand a little more on the previous example. Imagine a scenario where a big chance is saved by a goalkeeper, and two rebounded saves occur. Or a defender manages to make a goal-line clearance after the initial save. On an xG chart, from that single phase, a team might be represented as having an xG of 2.0 or higher from those chances when the latter was dependent on the former. If the first shot is scored, the second attempt doesn’t exist, ergo the third attempt doesn’t exist. From the xG graph, the team’s expected goals is misrepresented by the data in this instance. It follows that we can’t say more than one goal should have been scored from that phase of play, only that the defending team were extremely lucky not to concede. Measuring rebounds as part of a metric like shots in the box is perfectly fine, but as part of an xG graph it is extremely problematic in my opinion. This is why we should always scrutinise and analyse the data further.

Conclusion

As we can see, xG is a tool by which we can see the quality of shots being taken in a match, and their probability of being scored. But as mentioned, we must first have a basic understanding of it and how it is calculated if we are to use it effectively. A very detailed picture of a car depicts the car, but it doesn’t explain to us how a car works. Similarly, xG shows us the detailed picture of what is happening in terms of shots and shot quality, but not necessarily the context of the match.

We as FPL managers have to go beyond this if we are to ever try and use xG as part of a predictive model. By incorporating the external variables into what we find from the xG tool, we might stand a chance of doing this.

This ‘meta-analysis’ ultimately allows us to make assumptions that we cannot otherwise make from using the xG tool in isolation. In doing so, it may give us an edge in FPL over the long term.

Bøwstring The Carp. Active since 2011 on FFS. Occasional poster and community article writer. Twitter: @MattKearney92

15 Comments
  1. FPL Virgin
    • Fantasy Football Scout Member
    • 7 Years
    4 years, 9 months ago

    Thanks for this - an excellent, incredibly detailed article.

    Have you listened to this at all?

    http://fmlfpl.libsyn.com/ep-161-fireside-chat-with-michael-caley

  2. Bøwstring The Carp
    • 12 Years
    4 years, 9 months ago

    Cheers, will give it a listen

    1. FPL Virgin
      • Fantasy Football Scout Member
      • 7 Years
      4 years, 9 months ago

      Well worth a listen.

  3. RedLightning
    • Fantasy Football Scout Member
    • Has Moderation Rights
    • 13 Years
    4 years, 9 months ago

    Interesting that adding the xGs from two shots during the same phase of play does not give a realistic combined xG for that phase of play.
    Combining probabilities of single events to get a meaningful probability of a combination of events is not that simple!
    Having multiple scoring chances during the same phase of play only happens when all but possibly the last of these chances have been missed.

    If the original shot had an xG of 0.6 and the rebound shot an xG of 0.8, but the rebound opportunity would not have happened if the first shot had gone in, then the combined xG for this pair of shots could perhaps be calculated as 0.6 + (1.0-0.6)*0.8 = 0.92.

    But what actually constitutes a phase of play?
    I suggest that a new phase begins whenever a team gains undisputed possession of the ball, and continues until the other team gains undisputed possession or till the end of the half. Until then, anything that happens during that phase is at least partially dependent on what happened earlier.

    1. Flaming Flamingo
      • 7 Years
      4 years, 9 months ago

      That's a really interesting question. With regards to VAR at least, I believe a new phase of play is when the defensive team has been given the opportunity to reorganise (or something to that effect). Though I think the definition serves a different purpose for VAR than it does for xG.

  4. Piggs Boson
    • 12 Years
    4 years, 9 months ago

    Thanks Bow, phenomenal pair of articles 🙂

    I will now treat xG as a way to highlight players to further research. Context is important.

    I can see both sides to the rebound argument. The problem is that actions in a football match are all interwoven and connected, and the rebound problem is a result of treating each shot as a separate event. I believe the initial shot's xG should take into account the chance of a rebound happening, and the chance of that rebound going in, all in one event, rather than two separate shots. Complex problem.

  5. Markus
    • 14 Years
    4 years, 9 months ago

    This is a brilliant article, thank you for posting, and helps to demystify the black box of xG! Just one question/challenge though - a lot of the cons in the article could be leveled at actual goals scored, or even the more detailed stats like shots on target - are they replicable, what else changes, is it contextual just to that opposition/match? The question for me is which is the best (albeit inadequate) predictor of future returns? I might be wrong, but I felt some analysis had been done that suggested that expected goals was a better predictor of future results than actual goals, but I might have imagined it. For me, I've found a combination of actual and expected goals gives a decent ballpark in a game that is highly unpredictable anyway (e.g. Man Utd overachieving on clean sheets in 2017/18). At an individual level, much more detailed understanding is clearly needed to explain under/over performance as you say, as Mr Benteke's xG figures will attest!

  6. POLSKA GOLA
    • 10 Years
    4 years, 9 months ago

    My take on xG is that it may not be the best way to assess the whole team but I find it very informative for assessing players. For example, the rebound problem doesn’t concern me at all in terms of assessing individual players. If a player happens to be in the right place at the right time to anticipate the rebound, to me it is equal to a clear-cut chance from normal play.

  7. Prøphet
    • 8 Years
    4 years, 9 months ago

    Pffff Scored. I see what you are doing Bow. Scored is an Edison to my Tesla. Stealing my work.

  8. VFORVANDETA
    • 7 Years
    4 years, 9 months ago

    Thanks

  9. mo 10 years on FFS? Join my…
    • Fantasy Football Scout Member
    • 14 Years
    4 years, 9 months ago

    Loved reading this. Has anyone done the analysis between the different predictors that are available? Do we know sample sizes (for example) of the different xG predictors?