One Stat to Rule them All? A look into the relevancy of Expected Goals, Big Chances and Shots Inside the Box

The Fantasy Football community has grown manifold in the last few seasons. FPL is no longer a casual game, with so many managers playing it actively. A lot of information is available to people playing the game, and managers use the core numbers every week, but are they using the numbers that yield accurate results?

I looked into three key stats, namely Big Chances, Shots Inside the Box and Expected Goals (which has recently become very popular in the Fantasy Football community), and correlated each of them with the ultimate output, that is, Goals Scored.

I’ve used data from last season (2018/19) and from the current season so far (up to Gameweek 22 of 2019/20), and prepared a standalone table for each season to find the correlation with Goals Scored.

Correlation is a statistic that measures the degree to which two variables move in relation to each other. The value falls between -1.0 and +1.0. A coefficient of exactly +1.0 is a perfect positive correlation, -1.0 is a perfect negative correlation, and 0 means there is no linear relationship.

Using Pearson’s correlation (for more information visit here), I’ve compared the three key stats with Goals Scored to find out which metric is the most relevant for predicting future outcomes.
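For readers who want to reproduce this kind of comparison, here is a minimal sketch of Pearson's r in plain Python. It is not the spreadsheet used for this article: the function name `pearson_r` is my own, and the five-team numbers are made-up placeholders rather than Opta data, so the printed values mean nothing beyond showing the mechanics.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variance numerators).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# ASSUMPTION: placeholder numbers for five hypothetical teams, NOT real data.
goals = [30, 45, 22, 60, 38]
metrics = {
    "xG": [28.5, 41.0, 25.2, 55.3, 36.8],
    "Big Chances": [30, 48, 24, 70, 41],
    "Shots Inside the Box": [150, 190, 140, 260, 175],
}

# One coefficient per metric, each measured against goals scored.
results = {name: pearson_r(values, goals) for name, values in metrics.items()}
for name, r in results.items():
    print(f"{name}: r = {r:+.3f}")
```

With real data, each list would hold one value per team (20 for a Premier League season), in the same team order as the goals list.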

2018/19

The correlation coefficient (r) is very high for all three metrics over the whole season. Big Chances created by the team correlates most strongly with goals (+0.981), although the margin over the other two metrics is very small (SiTB at +0.973 and xG at +0.965).

2019/20

The correlation coefficient (r) has dropped a little from last season for xG (+0.910) and BC (+0.905), while the coefficient for Shots Inside the Box has dropped by a fair bit more (+0.856).

Since these are the whole league’s stats combined, this is a general outlook and does not look into every team’s stats individually. The best teams tend to score more from lower-quality chances, while lower-table teams score fewer from high-quality chances.
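One simple way to look at teams individually is to compare each team's goals with its xG. The sketch below is only an illustration of that idea: the helper `finishing_delta`, the team names and all the numbers are hypothetical and are not taken from the data behind this article.

```python
def finishing_delta(goals, xg):
    """Goals scored minus expected goals; positive means over-performance."""
    return goals - xg


# ASSUMPTION: hypothetical teams and numbers, for illustration only.
teams = {
    "Team A": {"goals": 50, "xg": 42.0},
    "Team B": {"goals": 30, "xg": 31.5},
    "Team C": {"goals": 20, "xg": 26.0},
}

# Rank teams from biggest over-performer to biggest under-performer.
ranked = sorted(
    teams.items(),
    key=lambda item: finishing_delta(**item[1]),
    reverse=True,
)

for name, stats in ranked:
    print(f"{name}: {finishing_delta(**stats):+.1f} goals vs xG")
```

A team near the top of such a list is scoring more than its chances suggest, and one near the bottom is scoring less, which is the team-level detail a single league-wide correlation cannot show.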

Conclusion

It looks like the fancy new toy named Expected Goals (xG) does come pretty close to predicting goals accurately, along with Big Chances (BC), which shows just as strong a correlation with the final output.

As someone who looks at Shots Inside the Box as the main statistic for judging future viability, I am a bit surprised that Big Chances beats it in both seasons and that xG overtakes it in 2019/20. Or maybe Shots Inside the Box is a bit underrated this season, and its correlation will rise by the end of the season to match last season’s? We’ll have to wait to find out.

Note: The xG numbers used in the article are derived from Opta. There are various other models online, which may give different numbers.

23 Comments
  1. Rotation's Alter Ego
    • Fantasy Football Scout Member
    • Has Moderation Rights
    • 8 Years
    1 month, 3 days ago

    Great research AK, thank you.

    I was having a play with this myself over the summer but couldn't get past the fact that minutes played and position played are massively influential underlying statistics. Simply looking at the data per team instead of by player is a much better way of doing it!

    1. AK ⭐
      • 7 Years
      1 month, 3 days ago

      My initial thought was also to do it player-wise, but the data only made sense when a lot of other factors were put on par, like, as you said, minutes played. Teams have played the same minutes, apart from Liverpool, who have played 90 fewer this season, but I don't think it affects the data much!

      Also, I found Spurs to be the "noise" of the data this season.

      Their stats are nowhere near as good as the output. So it's one to watch for the season. They're overperforming big time in terms of attack.

  2. AK ⭐
    • 7 Years
    1 month, 3 days ago

    For those interested, here are the workings which led to this article.

    https://docs.google.com/spreadsheets/d/1v1AJdFOQfsa2qF7L3oe4_ZUQmXkcsEAStfy_dk07MzE/edit?usp=sharing

  3. CMB
    • Fantasy Football Scout Member
    • 4 Years
    1 month, 3 days ago

    Interesting approach AK!
    Though I have to say that correlation is not the correct statistical tool in this case, since your three predictors might have confounding variables. I would rather conduct an ANCOVA.

    1. AK ⭐
      • 7 Years
      1 month, 3 days ago

      Cheers CMB!

      Do you think ANCOVA can help with the players predictor? That would be a really good breakthrough, if we can get a predictor for players instead.

      1. CMB
        • Fantasy Football Scout Member
        • 4 Years
        1 month, 3 days ago

        Yes, for example when you want to put minutes on par, then you just put minutes played as a covariate into the model. I think it is especially helpful for player stats.

        1. AK ⭐
          • 7 Years
          1 month, 3 days ago

          Perfect, I'll look into it.

          Are you on Twitter? Can hit you up when I go on to do a bit more research into this.

          1. AuFeld
            • Fantasy Football Scout Member
            • 3 Years
            1 month, 3 days ago

            Good stuff. Are you on github? I'd love to add on to this.

            1. AK ⭐
              • 7 Years
              1 month, 1 day ago

              I'm not on Github.

              And sure, you're free to add in whatever you want to. It was you who sought the access for the doc right?

  4. Lateriser 12
    • Fantasy Football Scout Member
    • 8 Years
    1 month, 3 days ago

    Very nice read this. Hoping to read some player specific research on the same matter.

    1. AK ⭐
      • 7 Years
      1 month, 3 days ago

      That's something I am looking into for the next part of this article, although it would be very difficult since there are so many attributes which differ between players. For example, minutes played. These factors are always constant for teams, so the approach is more relevant to teams.

      I'll still look into it though, to see if there's something that can be done for the players.

      1. Lateriser 12
        • Fantasy Football Scout Member
        • 8 Years
        1 month, 1 day ago

        Cheers and thanks again 🙂

  5. Fring
    • Fantasy Football Scout Member
    • 8 Years
    1 month, 3 days ago

    Nice read. Thanks AK.

    1. AK ⭐
      • 7 Years
      1 month, 3 days ago

      Cheers Fring!

  6. Stejson
    • Fantasy Football Scout Member
    1 month, 3 days ago

    If you like it then you wanna stick a GLiM on it (Generalised Linear Model).
    Does require the assumption that past performance is a good guide to the future, but most methods do

    1. AK ⭐
      • 7 Years
      1 month, 3 days ago

      I think stats of past years for teams are relevant as a whole. For a macro study on these things, you have to assume past performances are good indicators imo.

      I'll look into GLiM. Sounds good for now, will read up about it. Cheers, Stejson!

  7. Pep Pig
    • Fantasy Football Scout Member
    • 3 Years
    1 month, 3 days ago

    Interesting read AK thank you

    1. AK ⭐
      • 7 Years
      1 month, 1 day ago

      Cheers Pep Pig!

  8. Yank Revolution
    • 8 Years
    1 month, 3 days ago

    This is the cutting edge of FPL stat analysis, I love it....Keep it up, cats!

    1. AK ⭐
      • 7 Years
      1 month, 1 day ago

      Thanks Yank! I'll be following it up very soon hopefully. Keep tabs on it! 😉

  9. TopMarx - H2H L4 D5
    • Fantasy Football Scout Member
    • 7 Years
    1 month, 2 days ago

    Great stuff AK,

    Couple of thoughts, first on the subjectivity of Big Chances, quote from Michael Caley:

    "I think Opta coders do amazing work, and I know that their decisions are always double-checked as well. But it's hard for me to believe that there would be zero outcome bias, that a shot being scored wouldn't make the preceding chance look bigger."

    On Big Chances being incorporated into Opta xG models, quote from Mike Goodman:

    "Recognizing that nothing in the data set will distinguish particularly good chances from similar looking but mediocre ones, data collection leaned on creating a label that let everybody know “HEY! LOOK OVER HERE! THIS CHANCE WAS REALLY GOOD!” Tautologically, knowing that a chance was good helps xG models determine if a chance was good."

    So essentially Big Chances have an element of subjectivity and are used to improve xG models. So you'd expect them to correlate reasonably well to goals: "knowing that a chance was good helps xG models determine if a chance was good."

    1. AK ⭐
      • 7 Years
      1 month, 1 day ago

      So xG and Big Chances are more or less gonna have a very close relationship, yeah? It's interesting.

      I do remember very vaguely when xG was not into play as much yet, I looked at the season stats after 2016/17 and was very surprised by how much City were ahead of other teams in terms of Big Chances and Shots. It was something I took a mental note of and next season, City never stopped scoring!

      So Big Chances, even though they may have bias in them, are great indicators.