Community Submissions
21 January 2016 48 comments
rakkhi

Earlier this season I took a look at the predictive value of different statistics in Fantasy Football. My first article on the subject can be found here. But so far a lot of the discussion, such as Balders' flagging up of 11tengen11’s analysis on the predictive value of stats, has focused on teams rather than specific players. For this latest article I thought I would look at how statistics can be used to predict returns for players, specifically forwards and midfielders who have played a minimum of 270 minutes over the past four complete seasons (2011/12 to 2014/15).

Total stats

Firstly we are usually interested in predicting goals for the next six Gameweeks using the data we have at the current point in time:

pic 1

Most of the statistics explain 40-50% of the variation in player goals over the next six Gameweeks (the R squared of a linear regression). But this is bad news for stat-heads and good news for casuals, because the best indicator of goalscoring remains goals scored already, rather than underlying statistics such as shots on target.

Last 4 Gameweeks data

How about the more recent data?  Are the last four Gameweeks a better indicator of form and therefore a better predictor?

pic 2

Actually this is where stat-heads may be able to steal a march on casuals, as the last four Gameweeks’ statistics, such as shots on target and big chances, are surprisingly good predictors of player goals for the next six Gameweeks. But they are still no better than goals scored.
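
A minimal sketch of how the two kinds of predictor compared so far (season totals and the last four Gameweeks) could be paired with goals over the next six Gameweeks. The function name `build_pairs`, the per-Gameweek list layout and the sample numbers are all assumptions made for illustration; the article does not show how the underlying data was arranged.

```python
def build_pairs(goals_by_gw, stat_by_gw, lookback=4, horizon=6):
    """Return (predictor, target) pairs for one player.

    predictor: sum of the stat over the previous `lookback` Gameweeks
               (lookback=None means the season total so far).
    target:    goals scored over the following `horizon` Gameweeks.
    Early Gameweeks use a shorter window when fewer than `lookback`
    Gameweeks have been played.
    """
    pairs = []
    n = len(goals_by_gw)
    # gw = number of Gameweeks already played at the point of prediction
    for gw in range(1, n - horizon + 1):
        start = 0 if lookback is None else max(0, gw - lookback)
        predictor = sum(stat_by_gw[start:gw])
        target = sum(goals_by_gw[gw:gw + horizon])
        pairs.append((predictor, target))
    return pairs


# Made-up numbers for one hypothetical player over 12 Gameweeks
shots_on_target = [2, 0, 1, 3, 1, 0, 2, 2, 1, 0, 4, 1]
goals = [1, 0, 0, 2, 0, 0, 1, 1, 0, 0, 2, 0]

last_four = build_pairs(goals, shots_on_target, lookback=4)
season_total = build_pairs(goals, shots_on_target, lookback=None)
```

Pooling these pairs across every qualifying player would give the two columns a regression needs: the statistic known at the time, and the goals that followed.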

Combining stats

How about combining the stats? Can using a variety of statistics together prove a better predictor?

pic 3

Building a complex model with high predictive power seems quite difficult. I used a technique called multiple linear regression (https://en.wikipedia.org/wiki/General_linear_model), which enables you to relate multiple independent variables to a single dependent variable. In this case I used shots on target, shots, shots in the box, big chances and penalty area touches to attempt to predict goals scored in the next six Gameweeks and also over the rest of the season.

As you can see, considering multiple factors may unfortunately lead to a worse prediction than simply using a single statistic.
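
For readers who want to try a multiple linear regression themselves, here is a rough sketch in plain Python that fits the coefficients by solving the normal equations. It is not the tool or code used for the charts; the helper names (`solve`, `fit_ols`, `r_squared_multi`) and the sample figures are invented for illustration, and only two predictors are used to keep it short.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] from the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend a column of ones for the intercept
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def r_squared_multi(rows, y, coef):
    """Share of the variation in y explained by the fitted model."""
    pred = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up rows: (shots on target, big chances) per player, and goals in the next six Gameweeks
rows = [(2, 1), (5, 3), (1, 0), (7, 4), (3, 1), (4, 2)]
next_six_goals = [1, 3, 0, 4, 1, 2]

coef = fit_ols(rows, next_six_goals)
print(coef, r_squared_multi(rows, next_six_goals, coef))
```

With correlated inputs such as shots and shots on target, the individual coefficients can become unstable, which is one reason a bigger model is not automatically a better predictor.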

Fixtures

How about using tried and true fixtures? Intuitively it seems like they have not worked so well this season, but experienced players will tell you that form usually follows fixtures. In Gameweeks 23 and 24, Everton striker Romelu Lukaku is predicted to score at home to Swansea and Newcastle, even though he goes into those fixtures on the back of a poor run of form. But are Lukaku backers right to anticipate a points haul?

pic 4

It seems not. In fact it is quite frankly amazing how bad fixtures are at predicting goals. I used end of season shots in the box conceded as a proxy for fixture difficulty here. The results are clear: as a general rule fixtures are terrible at predicting goals, much worse than simply using a player's past goals scored. However, the players shown on the right are worth bringing in, keeping or captaining for a particularly easy set of fixtures.
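
One hedged way to picture the per-player check is to correlate each opponent's shots in the box conceded with the goals a player scored against them. The team names, totals and match results below are invented, and `pearson` and `fixture_correlation` are illustrative helpers rather than anything taken from the original analysis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def fixture_correlation(player_matches, difficulty):
    """Correlate opponent 'easiness' with goals scored in each match."""
    xs = [difficulty[opponent] for opponent, _ in player_matches]
    ys = [goals for _, goals in player_matches]
    return pearson(xs, ys)


# Hypothetical end of season shots in the box conceded (higher = easier fixture)
sib_conceded = {"Team A": 310, "Team B": 255, "Team C": 190, "Team D": 140}

# Hypothetical (opponent, goals scored) records for two made-up players
matches = {
    "Flat track bully": [("Team A", 2), ("Team B", 1), ("Team C", 0), ("Team D", 0)],
    "Fixture proof": [("Team A", 1), ("Team B", 0), ("Team C", 1), ("Team D", 0)],
}

for name, record in matches.items():
    print(name, round(fixture_correlation(record, sib_conceded), 2))
```

A player whose correlation sits well above the rest would be the kind of exception for whom fixtures are worth acting on.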

Assists

How about assists? It feels like Mesut Ozil and Dimitri Payet have encouraged a return to old school midfielders who are picked for their assists rather than their goals.

pic 5

Assists are harder to predict than goals, but here at least the manager who looks at chances created has an edge over one who simply looks at assists already registered this season.

Putting the stats to the test in Gameweek 23

A good test, perhaps, is the next set of fixtures. Taking goals scored over the last four Gameweeks, Jermain Defoe with five goals, Wayne Rooney on four, and Sergio Aguero and Patrick van Aanholt with three apiece are the form players. If goals scored is the best predictor of more goals, then we should expect one or more of them to do well in Gameweek 23.

Taking big chances created over the last four Gameweeks as an indicator, Rooney (4), Aguero (3) and Defoe (3) also do well. However, this form of data also adds West Ham’s Michail Antonio (4) and Norwich’s Dieumerci Mbokani (4) to the list. Harry Kane, Olivier Giroud and Georginio Wijnaldum, on three big chances each, are also predicted to do well according to this form of data.

If Antonio, Mbokani, Kane, Wijnaldum and Giroud do well in Gameweek 23 then perhaps big chances should be given more credence as a predictor.

Finally, let’s take a look at the form players in terms of shots on target over the last four Gameweeks. Once again Defoe (8) and Rooney (7) do well. But also riding high in this form of data are Newcastle’s Aleksandar Mitrovic (8) and Wijnaldum (7), as well as Kane (7). If the Magpies duo and Kane prosper this Gameweek then perhaps it is time to consider shots on target as a bigger predictor of points.

Conclusion

Hopefully this doesn’t discourage you from looking at the statistics. In fact the aim really is to consider more complex models than the simple stats tables and to test those models for their predictive value against historical data. If you want to save some time, though, don’t feel guilty about simply relying on the top goalscorers for the season. All eyes will be on Rooney, Defoe, Van Aanholt and Aguero in particular in Gameweek 23.

Extra graphs

Some graphs for those who are interested:

Total stats predictive value for the rest of season at a given Gameweek: graph

Per 90 stats for predicting player goals for the next 6 Gameweeks: graph

Per 90 stats for predicting player goals for the rest of the season: graph

rakkhi Love my football, love my stats, hoping to improve each year. Go the gunners! @rakkhis on Twitter

48 Comments
  1. J0E
    • Fantasy Football Scout Member
    • Has Moderation Rights
    • 16 Years
    9 years, 11 months ago

    thanks for this. I usually use big chances and shots on target as a guide and looking at recent goal scoring form there are similarities between those and actual goals scored as predictors.

    Think I may place a bit more weight on goals scored in combination with shots on target and big chances though and see how that goes.

    From the above that would give Rooney, Defoe as the only ones to be riding high in chances, shots on target and scoring lots over the previous four game weeks.

    Makes me wonder about Lukaku as a captaincy option.

    1. fancy111
      • 10 Years
      9 years, 11 months ago

      I saw a stats chart earlier that has changed me to Aguero... West Ham's defense has conceded many many chances and Aguero is the player to take advantage of chances.

    2. rakkhi
      • Fantasy Football Scout Member
      • 15 Years
      9 years, 11 months ago

      Thanks Jonty for posting and adding the game. If you don't mind, could you add a small caveat that you can't really judge the longer term stats (4 years) on one game week for the test? Also, you should really look at the next 6 game weeks' goals based on the last 4 game weeks' goals scored if you want to test whether this overall predictive value holds for this specific period of time.

    3. TRIPPIER'S DAD
      • Fantasy Football Scout Member
      • 13 Years
      9 years, 11 months ago

      If Funes Mori doesn't steal Lukaku's goal last week though all of a sudden Lukaku has 2 in four and that doesn't seem like bad form to me

    4. rakkhi
      • Fantasy Football Scout Member
      • 15 Years
      9 years, 11 months ago

      Jonty, I also looked at "goals imminent" for you. I tried to calculate a regression to the mean type stat, using cumulative (total) shots on target minus cumulative goals scored as a predictor of next 6 GW goals. Unfortunately that was not more predictive than simply using total shots on target, but it is worth seeing if an effective model could be developed.

      https://drive.google.com/file/d/0BwgiFO7urk_gOFUzcm15ZjRSWXc/view?usp=sharing
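
The "goals imminent" figure described in this comment (cumulative shots on target minus cumulative goals) can be sketched in a few lines. The function name and the sample numbers are assumptions for illustration only.

```python
from itertools import accumulate


def goals_imminent(shots_on_target_by_gw, goals_by_gw):
    """Cumulative shots on target minus cumulative goals after each Gameweek."""
    cum_sot = accumulate(shots_on_target_by_gw)
    cum_goals = accumulate(goals_by_gw)
    return [s - g for s, g in zip(cum_sot, cum_goals)]


# Made-up player: per-Gameweek shots on target and goals
print(goals_imminent([2, 0, 1, 3], [1, 0, 0, 2]))  # [1, 1, 2, 3]
```

A rising value means the player keeps hitting the target without scoring, which is the gap the stat is meant to capture.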

  2. Debauchy
    • 12 Years
    9 years, 11 months ago

    Just landed on this moon ? Time to launch out

    1. Zasa
      • 11 Years
      9 years, 11 months ago

      Eh?

  3. Eden Hazardous
    • 11 Years
    9 years, 11 months ago

    nicely done! 😀

  4. Danno - Emre Canada
    • 10 Years
    9 years, 11 months ago

    Beautiful. I live near NPL and am popping this in for one of their boffins to confirm. Top work.

  5. Doolittle
    • 13 Years
    9 years, 11 months ago

    Doosra will hate this article 😉

    1. fancy111
      • 10 Years
      9 years, 11 months ago

      lol, too true

    • 12 Years
    9 years, 11 months ago

    Huge step up, great stuff Rakkhi!

    also @Jonty, can you sort the misspelling of my name in the article please?

    1. Eden Hazardous
      • 11 Years
      9 years, 11 months ago

      Ha! 😆

    2. GreenWindmill
      • Fantasy Football Scout Member
      • 14 Years
      9 years, 11 months ago

      Vanity thy name is baulders 😉

      My name is always mistyped in articles, either with a space, no capital W or both. It's GreenWindmill damnit!

    3. J0E
      • Fantasy Football Scout Member
      • Has Moderation Rights
      • 16 Years
      9 years, 11 months ago

      Sorted. Actually thought it may have been someone called Baulders so didn't correct him at the time. 🙂

        • 12 Years
        9 years, 11 months ago

        Thanks Jonty 🙂

      1. rakkhi
        • Fantasy Football Scout Member
        • 15 Years
        9 years, 11 months ago

        Sorry Balders 🙂

  6. GreenWindmill
    • Fantasy Football Scout Member
    • 14 Years
    9 years, 11 months ago

    I'm not entirely surprised by the fixtures result because if the sample pool of players is simply "forwards and midfielders that have played a minimum 270 minutes for the past four complete seasons (2011/12 to 14/15)" then I assume we're including a lot more Tiotes and Barrys than Agueros and Lukakus.

    1. rakkhi
      • Fantasy Football Scout Member
      • 15 Years
      9 years, 11 months ago

      Yes but I have also previously looked at players of fantasy interest only and the overall correlation is still low except for specific players: http://www.fantasyfootballscout.co.uk/2015/12/18/fixture-proof-v-flat-track-bullies/

      1. rakkhi
        • Fantasy Football Scout Member
        • 15 Years
        9 years, 11 months ago

        Also ran it filtering just by high scorers, defined as players who scored 10 or more goals in a season. The sample size is a lot smaller so the data is a lot more variable, but fixtures, at least as defined by teams who have conceded more shots in the box, do not seem to show a strong overall correlation. I'll keep working with fixtures data, as it has been an axiom for so long in Fantasy Football, to see if there is a way of measuring fixtures so that they are a good predictor, but so far I have not found it.

        https://drive.google.com/file/d/0BwgiFO7urk_gaklKUjZkYVA4T00/view?usp=sharing

        1. genesolv
          • Fantasy Football Scout Member
          • 10 Years
          9 years, 11 months ago

          Awesome stuff. I model fixture difficulty using a home away measure factor and a strength of opposition factor. Both are based on points scored. Home Away looks at how many points each team scores home vs away over the last 38 game weeks. I generate one factor per position type. Strength of opposition looks at the ratio of points scored against a team compared with all other teams. Again done over 38 game weeks and for each position.

          So my predicted player score is average player points over 10 weeks * home away factor * strength of opposition factor.

          My team with that algorithm is currently ranked about 350k. Not too bad.

          Anyway that's a long way of saying fixtures by themselves are not good predictors of points. But it seems they can be good modifiers of other player performance measures.
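
The scoring formula in this comment (average player points over 10 weeks, times a home/away factor, times a strength of opposition factor) might look something like the sketch below. The comment does not spell out exactly how the factors are computed, so `strength_of_opposition` is one assumed reading (points conceded by the opponent divided by the league average), and every name and number here is hypothetical.

```python
def predicted_points(recent_points, home_away_factor, opposition_factor, window=10):
    """Average points over the last `window` Gameweeks, scaled by the two fixture factors."""
    recent = recent_points[-window:]
    return sum(recent) / len(recent) * home_away_factor * opposition_factor


def strength_of_opposition(points_conceded_by_team, opponent):
    """Assumed definition: opponent's points conceded relative to the league average."""
    league_avg = sum(points_conceded_by_team.values()) / len(points_conceded_by_team)
    return points_conceded_by_team[opponent] / league_avg


# Made-up numbers: fantasy points conceded to forwards over the last 38 game weeks
conceded = {"Team A": 120, "Team B": 80, "Team C": 100}
factor = strength_of_opposition(conceded, "Team A")  # 1.2, a generous opponent

# Made-up player averaging 6 points over his last 10 game weeks, playing at home
print(predicted_points([2] * 5 + [6] * 10, home_away_factor=1.1, opposition_factor=factor))
```

Used this way, fixtures only scale a player's recent output up or down, which matches the point that they work better as modifiers than as predictors in their own right.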

        2. Kessler
          • Fantasy Football Scout Member
          • 12 Years
          9 years, 11 months ago

          Rakhi, excellent article! Although had the armband on Luka all week but u made me switch to Aguero!! Hope the stats don't let me down!! 🙂

          1. rakkhi
            • Fantasy Football Scout Member
            • 15 Years
            9 years, 11 months ago

            Haha hope so too!

  7. St. Joseph
    • 10 Years
    9 years, 11 months ago

    Nice article
    Armband set on kun(c)

    1. J0E
      • Fantasy Football Scout Member
      • Has Moderation Rights
      • 16 Years
      9 years, 11 months ago

      I'm thinking that too. Main thing I got from this article was how Rooney or Defoe look great, albeit unlikely, bets for the captaincy this week.

      1. St. Joseph
        • 10 Years
        9 years, 11 months ago

        Have been thinking lukaku to rooney though..
        I wish i have had more FTs...

  8. Woy of the Wovers
    • 15 Years
    9 years, 11 months ago

    I will enjoy looking into this, depressing as it is.

    1. Woy of the Wovers
      • 15 Years
      9 years, 11 months ago

      I wonder if the problem here is that you're trying to predict Goals in next 6 GW. As we know, small numbers are always a problem in stats so if you're looking at correlations between two sets, it will always be quite weak if one of the numbers is small.

  9. Holy See
    • 15 Years
    9 years, 11 months ago

    Always feel a bit thick, takes me ages to go through an article like this. Like the 'goals scored is the best predictor of future goalscorer' bit, or that's how I understand it.. In the land of the casual, last week is king! Better get all casual I guess. You might be familiar with this chap, used to do some stat articles on ffs. Or statricles, if you will http://www.plfantasy.com/

  10. Diva
    • 11 Years
    9 years, 11 months ago

    Hi Rakkhi, interesting stuff, thanks for sharing.

    Sorry if I've missed something, but please can you explain how you calculate the percentages? For example, say I know that Agüero has scored three goals in the last four gameweeks. What precisely is the prediction arising from that knowledge that has a ~40% chance of being correct? e.g. is it:
    That he will score 1 goal next week?
    That he has a 75% chance of scoring 1 goal next week?
    That he will score at least 1 goal in the next four weeks?
    Etc.

    Thanks!

    1. rakkhi
      • Fantasy Football Scout Member
      • 15 Years
      9 years, 11 months ago

      I'm performing Linear regression on various stats like total past goals scored per player and goals they score over the next 6 game weeks for example. Doing this for a large sample of over 1000 midfielders and forwards that have played over 270 mins (3 games) for the past 4 seasons. The R squared is basically how much the change in the first variable eg goals scored previously explains the other variable goals scored in next 6 game weeks. The 40% for example means it explained 40% of the change. So it is not exactly yes to any of your questions and it is an overall percentage not for an individual player.

      Hope that makes sense if not let me know and I'll try to clarify
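
As a rough illustration of the R squared described in this reply, here is a small single-variable example in plain Python. The function name and the toy numbers are assumptions, not the data behind the charts.

```python
def r_squared(xs, ys):
    """Fit y = intercept + slope * x by least squares and return R squared."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Variation left unexplained by the fitted line vs total variation in y
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Toy data: goals scored so far (x) and goals in the next six game weeks (y)
past_goals = [0, 1, 2, 3]
next_six = [0, 1, 1, 2]
print(round(r_squared(past_goals, next_six), 2))  # 0.9
```

An R squared of 0.9 here says the fitted line accounts for 90% of the variation across these four toy players; it makes no statement about any single player's chance of scoring.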

  11. Zasa
    • 11 Years
    9 years, 11 months ago

    Very intriguing conclusions here Rakkhi, thanks for this!

  12. Triggy
    • Fantasy Football Scout Member
    • 15 Years
    9 years, 11 months ago

    I really appreciate this article. All too often online, but even in the real scientific community, negative (null) findings are not published. However, null results are often just as important, even if only to rule out (to a certain statistical likelihood) certain possibilities. This research doesn't state that stats are useless, although it does state that the most important stat is the king of the casuals, recent goals scored! It does suggest that other stats may be useful but that there isn't an obvious standout beyond big chances created.

    Many thanks again and very useful stuff!!!

  13. Camp No No
    • 12 Years
    9 years, 11 months ago

    Oh the annoyance. might explain why I'm doing so badly this season. Have been going mainly by fixtures. 🙁

    Did I learn anything, though? *Transfers Lukaku in*

    1. rakkhi
      • Fantasy Football Scout Member
      • 15 Years
      9 years, 11 months ago

      Yeah, maybe avoid fixtures except for very specific players. If it is any consolation, at least when I looked at it in GW14, Lukaku was a massive flat track bully this season. Past the Swansea game (who are 8th best/12th worst) you have Newcastle, Stoke and West Brom, who all have fairly poor shots in the box conceded, with Newcastle being the worst of the lot.

      https://drive.google.com/file/d/0BwgiFO7urk_gTGhTeXhscWdHTEU/view?usp=sharing

    2. rakkhi
      • Fantasy Football Scout Member
      • 15 Years
      9 years, 11 months ago

      You got me curious to update it to current. Lukaku still looks good for next few games: https://twitter.com/rakkhis/status/690531324973584384

    3. Camp No No
      • 12 Years
      9 years, 11 months ago

      Hey, Rakkhi, this was actually something I meant to return to sooner, but didn't find time. Since I've observed, although without a conclusive research, that just about every player who scores a lot, does score significantly more against weak opposition than strong one. That is: there are little differences between easy and hard fixtures for those who do not score much, but for those who end up at the top of the goal/assist table, the majority of goals they get involved in will always be in the easy fixtures. So that every top scorer from Aguero to Messi and from Lukaku to Ronaldo is a "flat track bully". This is completely natural: no team and no player consistently scores a lot of goals against the strongest opponents, therefore, to score a lot of goals, a player basically has to get some big hauls against weaker opposition. E.g. Aguero and Suarez tend to score a bit everywhere, you wouldn't define them as flat track bullies, but when you look at it, they tend to get the massive hauls against weakest oppositions. And yes, when I last looked at it, Lukaku was very predictable in this way.

      1. Camp No No
        • 12 Years
        9 years, 11 months ago

        Hmmm... looks like Giroud and Özil with their avoidance of big hauls and consistency of goals/assists are great exceptions to this rule as players who don't score more against weak teams.

        1. rakkhi
          • Fantasy Football Scout Member
          • 15 Years
          9 years, 11 months ago

          Yes exactly they are good examples

      2. rakkhi
        • Fantasy Football Scout Member
        • 15 Years
        9 years, 11 months ago

        That is true for this season but has not held for a larger sample when I looked at players across a few seasons. Some players definitely as you say like Hazard and Costa score a lot more against weaker teams but Kane has scored a lot against everyone

        1. Camp No No
          • 12 Years
          9 years, 11 months ago

          With Kane the sample size is extremely small, with fluky results like the one against Chelsea.

          You look at anybody, who has scored A LOT in a long run, and there should be a bias on goals against the weak teams. It's quite simply the fact that the top teams very rarely (usually only in occasion of having a lot of injured players or taking an early red card, in which case they no longer qualify as top teams, for that match) concede a lot of goals, and particularly the hat tricks and other big hauls are almost exclusively scored against the bottom table clubs.

          Özil or Giroud out before tough fixtures? Seems stupid but I think I'm going to.

          1. rakkhi
            • Fantasy Football Scout Member
            • 15 Years
            9 years, 11 months ago

            Well I looked at 2.5 seasons of data here and there are enough players that have scored well that have a negative or low correlation with easy fixtures that I would say that is not the case.

            https://drive.google.com/file/d/0BwgiFO7urk_gUElyWjVCNmFwQ00/view?usp=sharing

  14. FPL P0ker PlAyer
    • 11 Years
    9 years, 11 months ago

    Awesome post and sobering stuff that leads me questioning the value of my stats based approach.
    I harbour a strong suspicion, however, that the xG model would trump both of those under consideration here. If only I had some good contacts in that field...

    1. rakkhi
      • Fantasy Football Scout Member
      • 15 Years
      9 years, 11 months ago

      I expect it would come a bit above shots on target based on the end of season correlations but I would love 4 years of data to test it also. I have been collecting the data this year since Paul Riley started publishing it so maybe in a few seasons 🙂

  15. Demí
    • 15 Years
    9 years, 11 months ago

    Speechless. Great, great article

  16. Eden Hazardous
    • 11 Years
    9 years, 11 months ago

    Goals scored

    PVA Check
    Aguero Check

    Rooney and defoe fail

    50% success rate , not bad

  17. Gloria Kanchelskis
    • Fantasy Football Scout Member
    • 12 Years
    9 years, 11 months ago

    Wish I'd read this before the weekend - it should have put me off captaining Lukaku