One of the main reasons for designing my own data set for the xGextraliga project was to identify which metrics matter more than others in the game of hockey. This article selects six main metrics and compares them to each other by calculating their correlation to winning (more precisely, to game goal differentials). The metrics are:
– unblocked shot attempts (also known as „fenwick“)
– score-adjusted unblocked shot attempts
– screened unblocked shot attempts (only offensive player screen considered)
– expected goals (based on past NHL research; this will be discussed in a separate article)
– clear path opportunities (= shooter is positioned in the slot area with no block close to him)
– odd man rush opportunities (= 3 on 2, 3 on 1, 2 on 1, 2 on 0, 1 on 0 rushes)
Below is a visualization of the correlations (r2) between the described metrics and goal differentials as the sample size increases. To be completely clear: for each game I took the differential of a metric from the home team's perspective, together with the final score. For example, 38:30 fenwick and a 2:4 final score equal a +8 fenwick differential and a -2 goal differential.
All metrics were expected to have a positive impact on a game. That means that if your team wins the „metric“ duel, it is more likely to win the game. Some metrics were of course expected to have a smaller impact, and that showed in the results. Let's comment on the selected metrics:
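The differential calculation described above can be sketched in a few lines of Python. This is only an illustration: it assumes that r2 means the squared Pearson correlation, the function names are hypothetical, and apart from the 38:30 / 2:4 example from the text, the games are made up.

```python
import numpy as np

def home_differentials(home_metric, away_metric, home_goals, away_goals):
    """Return (metric differential, goal differential) from the home team's view."""
    return home_metric - away_metric, home_goals - away_goals

def r_squared(x, y):
    """Squared Pearson correlation of two equal-length sequences (assumed meaning of r2)."""
    r = np.corrcoef(np.asarray(x, dtype=float), np.asarray(y, dtype=float))[0, 1]
    return r ** 2

# The example from the text: 38:30 fenwick and a 2:4 final score.
example = home_differentials(38, 30, 2, 4)

# Made-up games, for illustration only:
# (home fenwick, away fenwick, home goals, away goals)
games = [(38, 30, 2, 4), (45, 33, 5, 2), (29, 41, 1, 3), (36, 36, 3, 2), (40, 28, 4, 1)]
diffs = [home_differentials(*g) for g in games]
metric_diff = [d[0] for d in diffs]
goal_diff = [d[1] for d in diffs]
r2 = r_squared(metric_diff, goal_diff)
```

Here `example` holds the +8 fenwick differential and -2 goal differential from the text, and `r2` is the value for the five invented games only, so it says nothing about the real sample.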
Screened unblocked shot attempts (r2 = +0.04) were unsurprisingly the least correlated metric. The reason is simply the low frequency of these events: there are only around 5.5 unblocked shot attempts with an offensive screen per game on average. Also, while a screen increases your odds of scoring, its value is still much lower than that of other dangerous opportunities (such as odd man rushes or clear paths).
Unblocked shot attempts (r2 = +0.23) had a low correlation to winning in our 75-game sample. The reasons are discussed in more detail under „score-adjusted shot attempts“ and „expected goals“.
Score-adjusted shot attempts (r2 = +0.43) have a solid but not overly high correlation to winning. Roughly 43% of the variation in game goal differentials can be explained by this metric. Still, adjusting for score (teams with a lead tend to go into a defensive shell) raises the correlation significantly compared with raw unblocked shot attempts (+0.23). So once again it is highly recommended to adjust for score and to check which team spent more time in the lead.
Odd man rushes (r2 = +0.47) are, similarly to screened shots, low-frequency events (only around 3 per game on average in our sample). Still, the combination of their danger (high xG value) and the score effect (teams in the lead get more odd man rush opportunities) makes them a solid metric to pay attention to.
Clear path opportunities (r2 = +0.56) have a significant correlation to winning. Around 13 of them were recorded per game on average, and teams that created more of them than their opponent tended to win more games. This is an important metric to track. If we combine odd man rushes and clear paths into a new variable called „dangerous possessions“, the correlation with goal differential increases to a high value of +0.64! That is only around 16 events per game explaining 64% of the variation in game goal differentials. Impressive!
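As a rough sketch of how such a combined variable could be built, the snippet below simply adds the two per-game differentials together. Treating „dangerous possessions“ as an unweighted sum is an assumption made here for illustration, the function name is hypothetical, and the numbers are invented.

```python
import numpy as np

def dangerous_possession_diff(clear_path_diff, odd_man_rush_diff):
    """Per-game dangerous-possession differential as an unweighted sum (assumption)."""
    return np.asarray(clear_path_diff) + np.asarray(odd_man_rush_diff)

# Made-up home-team differentials for five games, for illustration only.
clear_path = [3, -2, 5, 0, -4]
odd_man_rush = [1, -1, 0, 2, -1]
goal_diff = [2, -1, 3, 1, -2]

dangerous = dangerous_possession_diff(clear_path, odd_man_rush)

# Squared Pearson correlation with goal differential (assumed meaning of r2).
r2 = np.corrcoef(dangerous, goal_diff)[0, 1] ** 2
```

Because the sum of two differentials equals the differential of the summed counts, it does not matter whether the events are combined before or after taking home minus away.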
Expected goals (r2 = +0.66) is the metric with the highest correlation of all those tested. It improves on simple shot metrics significantly by expressing the quality of each shot attempt with a different xG value. 66% might not look that significant, but remember that we are dealing with game-to-game goal differentials. Hockey is still a sport where luck plays its role and the number of goals scored in each game is limited. If we look at team-by-team correlation in 10-game samples, the correlation between goal differential and expected goals is +0.88. That is a huge value. You can't cheat your expected goal output in the long term!
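One possible way to form such 10-game team samples is sketched below. It assumes the samples are consecutive, non-overlapping blocks of one team's games and that an incomplete trailing block is dropped; the article does not spell out the exact procedure, and the function name and numbers are hypothetical.

```python
def block_sums(values, size=10):
    """Sum a team's per-game values over consecutive, non-overlapping blocks.

    An incomplete trailing block is dropped (an assumption of this sketch).
    """
    full_blocks = len(values) // size
    return [sum(values[i * size:(i + 1) * size]) for i in range(full_blocks)]

# Made-up per-game xG differentials for one team over 25 games.
xg_diff = [0.4, -0.2, 1.1, 0.3, -0.8, 0.5, 0.9, -0.1, 0.2, 0.7,
           -0.5, 0.6, 0.0, 1.2, -0.3, 0.4, 0.8, -0.6, 0.1, 0.3,
           0.9, -0.4, 0.2, 0.5, -0.1]

samples = block_sums(xg_diff, size=10)   # two 10-game samples, last 5 games dropped
```

Applying the same grouping to each team's goal differentials and correlating the two sets of block sums would give a team-level figure comparable in spirit to the one quoted above.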
Lastly, I want to comment on the data sample. Comparing the correlations of metrics within the first 10 games is pure chaos, as displayed in the left part of the graph. Precision gets better after 20 games, and the curves smooth out much more after the 45-game mark. This is a nice finding that tells us how much data we need in order to get a reasonably good representation of reality.
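The idea of tracking the correlation as games accumulate can be sketched as follows. The data is randomly generated, so the resulting curve is not the one from the graph; r2 is again assumed to be the squared Pearson correlation, and all names are hypothetical.

```python
import numpy as np

def running_r2(metric_diff, goal_diff, min_games=3):
    """Return (n, r2) pairs, with r2 computed on the first n games only."""
    x = np.asarray(metric_diff, dtype=float)
    y = np.asarray(goal_diff, dtype=float)
    curve = []
    for n in range(min_games, len(x) + 1):
        r = np.corrcoef(x[:n], y[:n])[0, 1]
        curve.append((n, r ** 2))
    return curve

# Synthetic 75-game sample, for illustration only.
rng = np.random.default_rng(0)
metric = rng.normal(0, 8, size=75)
goals = 0.1 * metric + rng.normal(0, 1.5, size=75)

curve = running_r2(metric, goals)
```

Plotting the second element of each pair against the first shows how strongly the early part of such a curve can swing before it settles.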
– Expected goals describe the balance of strength among teams best.
– Dangerous possessions are a very powerful metric to track. Limited time? No xG model? No problem. Track dangerous possessions for teams to reveal their strengths.
– Shot-attempt-based metrics bring more limited value to the discussion. If we adjust for score, we can still get a solid description of a game. As shown in the past, score-adjusted shot attempt metrics still have decent predictive power.
– A sample size of at least 20 games is recommended before looking for metric significance and interpreting results with confidence.