What is xG and how do we measure it?
Expected goals (xG) is a measure of the quality of a shot, based on a variety of factors. These include the type of shot (head or foot), the angle of the shot, the distance from goal and the quality of the delivery. By collecting this data from football matches, an xG calculator can estimate the probability of a shot being converted into a goal. Adding up those probabilities gives us a better representation of how many goals a team could have been expected to score from the shots it took. An xG calculator is quite complex, as it needs to process many thousands of example shots and a large number of variables. So how is it measured?
Let us use penalties as an example. An xG calculator will collect data on every penalty taken over a set period (e.g. the last 5 years) in a particular league, or a series of leagues. Then it will look at what percentage of those penalties were scored and missed to determine an average probability of the outcome. Let us say that 76% of penalties were scored in a sample of 1000 penalties; the xG of a penalty would then be 0.76. In other words, there is a 76% chance of scoring it and a 24% chance of not scoring it (saves and misses). Penalties are generally easy to calculate, but how about something more complex, like a really good chance inside the box during open play?
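The penalty arithmetic can be sketched in a few lines of Python. This is only a minimal illustration of the "goals divided by attempts" idea using the 760-from-1000 sample above; the function name `conversion_rate` is my own, and a real xG calculator does considerably more than this.

```python
def conversion_rate(scored: int, attempts: int) -> float:
    """Share of attempts that ended in a goal, used here as a naive xG value."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return scored / attempts


# The sample from the text: 76% of 1000 penalties were scored.
penalty_xg = conversion_rate(scored=760, attempts=1000)

print(f"penalty xG: {penalty_xg:.2f}")                 # penalty xG: 0.76
print(f"chance of not scoring: {1 - penalty_xg:.2f}")  # chance of not scoring: 0.24
```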
The figures in this example are taken from a data engine like Opta. To calculate the odds of such a chance being scored, we must first take every attempt on goal inside the box from a given period, look at how many were scored and not scored, and then separate the big chances from the rest. A big chance is defined by Opta as “a situation where a player should reasonably be expected to score (usually in a one-on-one scenario or from very close range)”. Penalties are then excluded, since we want to measure the quality of a good chance from open play.
What we are left with are non-penalty big chances, and the share of them that were converted into goals. Opta’s calculations show that roughly 38% of all non-penalty big chances were converted. Therefore, a big chance would be worth 0.38 on an xG chart. As mentioned, this may vary slightly depending on the variables used, such as the sample duration (5 seasons, 10 seasons) and the particular leagues included (Premier League, Bundesliga, La Liga, etc.). Different leagues have different values for variables such as ‘average goals scored per match’, hence the leagues used will affect the calculation.
Going further with this, if we take all of the other non-big chances from inside the box, Opta tells us that there is a 7% chance of a goal, or 0.07 on an xG chart. Taking a step further back, chances from outside the box yield a 3.6% conversion rate, or a 0.036 xG rating. In other words, a goal would be scored roughly once every 28 attempts on average from outside the box. If we isolate free kicks from these chances, we generally see an increase in the conversion rate (roughly a 5-6% chance of conversion, or 0.05 to 0.06). What these calculations tell us is simple in essence, despite the subtle variance in how each shot is quantified: the closer the chance is to the goal, the more likely it is to be scored.
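To make the arithmetic concrete, here is a rough Python sketch that treats the conversion rates quoted above as a simple lookup table. The shot list is a made-up example rather than a real match, and the names `XG_BY_SHOT_TYPE`, `total_xg` and `attempts_per_goal` are my own for illustration. A real xG model weighs far more variables per shot than a four-bucket table.

```python
from typing import Iterable

# Conversion rates quoted in the text, used as a naive per-shot xG lookup.
XG_BY_SHOT_TYPE = {
    "penalty": 0.76,
    "big_chance": 0.38,    # non-penalty big chance
    "inside_box": 0.07,    # other chances inside the box
    "outside_box": 0.036,  # chances from outside the box
}


def total_xg(shots: Iterable[str]) -> float:
    """Sum the xG of every shot to get a team's expected goals."""
    return sum(XG_BY_SHOT_TYPE[shot] for shot in shots)


def attempts_per_goal(xg: float) -> float:
    """Average number of attempts needed for one goal at a given xG."""
    return 1 / xg


# Hypothetical set of shots for one team (illustrative only).
shots = ["big_chance", "inside_box", "inside_box",
         "outside_box", "outside_box", "outside_box"]
match_xg = total_xg(shots)

print(f"team xG: {match_xg:.2f}")  # team xG: 0.63
print(f"attempts per goal from outside the box: "
      f"{attempts_per_goal(XG_BY_SHOT_TYPE['outside_box']):.1f}")  # 27.8
```

Six shots sounds like a busy afternoon, but because half of them came from outside the box, the team would be expected to score well under one goal from them.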
There are a few things that are important to be aware of when looking at an xG calculator. Firstly, it is not a predictive tool. It simply tells us what has happened, not what might happen. We as FPL players tend to use it as a predictive tool at times in order to pick players and anticipate returns. If we are going to try this (assuming it can be done effectively at all), we need to be aware of a few factors that might prove to be stumbling blocks.
The measure of xG
Since xG is a measurement of something, we should expect it to have a “standard unit” or reference point of some sort. Just as distance has the meter and time has the second, we should be able to quantify xG as best we can (in theory anyway). The best way to do this is to express the chance of a shot being converted to a goal as a probability. For example, a six-sided die has a 1 in 6 chance (roughly 0.167, or 16.7%) of landing on a particular side. Likewise, we should expect the probability of a particular shot being converted to a goal to be quantifiable in a similar manner.
Now obviously, rolling a six-sided die and taking a shot from a particular area of a football pitch seem vastly different. However, an xG tool allows us to quantify the odds of a particular shot being converted by drawing on the enormous number of variables and examples that have been fed into the calculator.
If you look at different xG calculators, you may notice subtle differences in the value of a particular shot from one site to another. For example, a shot might be listed as 0.1 xG on one site and 0.12 xG on another. Why is there a different value for the same shot? If two sites give the same value, they are probably using the same variables, i.e. the same time frame and leagues, as the basis of their calculators. If the values differ, however, at least one of them may be carrying certain inaccuracies (from a statistical point of view anyway).
The issue we face is that some sites may be using older data as the basis of their xG calculations and have failed to update it. They may be using a five-year sample from 2010-2015, when the most recent calculators could be using a five-year sample from 2014-2019 or a ten-year sample from 2009-2019. While we shouldn’t expect this to have a huge effect, it may be enough to change a particular shot from 0.1 to 0.12 on an xG chart. A difference of 0.02 doesn’t seem all that important, but as (aspiring) statisticians we should treat it as a big problem in our quest to objectively quantify a particular shot.
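One way to see why a 0.02 gap matters is to let it accumulate over several shots. The sketch below is purely illustrative: the function name `cumulative_gap` and the shot counts are my own assumptions, and only the 0.10 and 0.12 values come from the example above.

```python
def cumulative_gap(xg_a: float, xg_b: float, n_shots: int) -> float:
    """Difference in total expected goals when the same shot is valued
    at xg_a by one calculator and xg_b by another, repeated n_shots times."""
    return abs(xg_a - xg_b) * n_shots


# Assumed shot counts, e.g. a player's shots over part of a season.
for n_shots in (10, 50, 100):
    gap = cumulative_gap(0.10, 0.12, n_shots)
    print(f"{n_shots} shots: the two calculators differ by {gap:.1f} expected goals")
```

Over 50 such shots the two sites would disagree by a full expected goal, which is enough to change how a player looks on paper.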
In summary, if you are using an xG calculator, be aware of the variables being used to value the shots: the sample size, the leagues included, the time frame and so on.
Is xG an exact science? Not from a statistical standpoint, because the situations we seek to quantify are susceptible to change, i.e. they are not fixed values. A fair six-sided die will always have a 16.7% chance of landing on a given side. A royal flush will always have a 0.000154% chance of being dealt in a five-card hand (or roughly 0.0032% of being made from the seven cards available by the river in Texas Hold’em). Will a shot six yards inside the box always yield the same odds of being converted from a particular cross? Not necessarily.
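For contrast, fixed probabilities like these can be computed exactly from counting alone, with no sample of past outcomes required. A minimal sketch using Python's standard `math.comb` follows (the variable names are my own). It shows both the five-card royal flush figure and the seven-card figure that applies by the river in Texas Hold’em.

```python
from math import comb

# One face out of six on a fair die.
die_side = 1 / 6

# Royal flush in a five-card deal: 4 possible hands out of C(52, 5).
royal_flush_5 = 4 / comb(52, 5)

# Royal flush using any 5 of 7 cards (hole cards plus board):
# 4 royal flushes, each combined with any 2 of the remaining 47 cards.
royal_flush_7 = 4 * comb(47, 2) / comb(52, 7)

print(f"die side: {die_side:.1%}")                         # die side: 16.7%
print(f"royal flush, 5 cards: {royal_flush_5:.6%}")        # 0.000154%
print(f"royal flush, 7 cards: {royal_flush_7:.4%}")        # 0.0032%
```

These numbers never change, no matter how many hands or rolls we observe. An xG value has no such formula behind it, which is why it shifts as the underlying sample of shots is updated.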
As xG calculators update, the value will change very slightly depending on how similar scenarios pan out in thousands of football matches in dozens of football leagues. Therefore, from a statistical point of view, is it objective? No. But is it still a good tool? Yes, absolutely, and I hope to explain why I believe so in the next part.