The main job of an NHL scout is to predict the future. A scout uses his experience and scouting knowledge to make the best educated guess as to what the future holds for each player in the draft. It is the same for a draft prognosticator, but instead of live viewings, we use stats and scouting reports to come up with our predictions. In trying to make better predictions, I have taken on a project to see if I could come up with a better way of estimating offensive potential. To do this, I decided to measure the correlation between a junior player’s points per game in his 17-year-old season (the first year he becomes draft eligible) and his points per game in his best NHL season.
First off, I needed a sample of players to choose from, and I decided to start by looking only at CHL players (OHL, WHL, QMJHL). Since most CHL players end up not playing in the NHL, I only wanted players who had played a significant number of NHL games. I chose 250+ games as my cut-off, which is the equivalent of just over 3 full seasons in the NHL. I also wanted to separate power play points from even strength and shorthanded points, and since I could not find any OHL power play stats further back than the 1997-1998 season, I chose that year as my starting point. For my ending season, I chose 2005-2006, as I needed players who have started to hit their prime years. The players drafted in 2006 are 25-26 years old and have now entered their peak years of 24-32 years of age.
Using these criteria, we find that 126 CHL forwards had their first draft-eligible season between 1998 and 2006 and went on to play 250+ games in the NHL. In choosing each player’s best points-per-game season in the NHL, I used only seasons in which he played 40+ games. The players included are:
A. Tanguay, A. Vermette, A. Hemsky, A. Ladd, A. Stewart, B. Little, B. Marchand, B. McGinn, B. McGrattan, B. Pouliot, B. Radivojevic, B. Richardson, B. Ryan, B. Sutherby, B.J. Crombeen, B. Eager, B. Betts, B. Comeau, B. Gordon, B. Boyes, B. Richards, B. Dubinsky, B. Laich, B. Bickell, C. MacArthur, C. McCormick, C. Perry, C. Clutterbuck, C. Kelly, C. Stewart, C. Giroux, C. Armstrong, C. Fraser, D. Legwand, D. MacKenzie, D. Paille, D. Roy, D. Fritsche, D. Carcillo, D. Helm, D. Bolland, D. Boogaard, D. Brassard, D. Dorsett, D. Brown, E. Christensen, E. Fehr, E. Staal, J. Staal, G. Latendresse, G. Brule, G. Campbell, J. Lundmark, J. Lupul, J. McClement, J. Mitchell, J. Neal, J. Pominville, J. Sheppard, J. Spezza, J. Stoll, K. Versteeg, K. Wellwood, K. Janssen, K. Barch, K. Brodziak, K. Chipchura, M. Lombardi, M. Lucic, M. Malhotra, M. Ribeiro, M. Rupp, M. Ryder, M. Stajan, M. Talbot, M. Bell, M. Belesky, M. D’Agostini, M. Lapierre, M. Grabner, M. Richards, N. Horton, N. Foligno, O. Saprykin, P. Mueller, P. O’Sullivan, P.A. Parenteau, P. Bergeron, P. Kaleta, P. Gaustad, P.M. Bouchard, R. Nash, R. Torres, R. Vrbata, R. Callahan, R. Clowe, R. Getzlaf, S. Matthias, S. Ott, S. Upshall, S. Veilleux, S. Weiss, S. Gomez, S. Hartnell, S. Avery, D. Setoguchi, S. Crosby, S. Gagne, S. Bernier, S. Downie, T. Kennedy, T. Moen, T. Pyatt, T. Connolly, T. Hunter, T. Brouwer, V. Lecavalier, V. Fiddler, W. Wolski, Z. Smith, Z. Stortini
When I ran the correlation between junior and NHL non-power play points, and between junior and NHL power play points, the results were stronger than I expected:
Non-Power Play points: 0.584
Power Play points: 0.550
While I was hoping to see a strong correlation between even strength scoring in junior and in the NHL, it was surprising how strong the correlation is for power play scoring. I thought role players who received a good amount of power play time in junior but little of it at the pro level would weaken the correlation, but that does not turn out to be the case.
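For readers who want to run the same kind of check on their own numbers, here is a minimal sketch of a Pearson correlation. The five junior/NHL pairs are made up purely for illustration and are not the 126-player sample used above; `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up Pts/G rates for five imaginary players (NOT the article's data).
junior_non_pp = [0.45, 0.80, 1.10, 0.60, 1.40]
nhl_non_pp = [0.30, 0.55, 0.70, 0.52, 0.75]

r = pearson_r(junior_non_pp, nhl_non_pp)
print(f"r = {r:.3f}")
```

With only five invented players the result says nothing about real prospects; the point is just the arithmetic behind a figure like 0.584.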
So what do correlations of 0.584 and 0.550 mean? In this case, it is better to look at the adjusted r-squared for each correlation, as this gives us an idea of how much of the variation in players’ NHL scoring is explained by their scoring in their 17-year-old junior season. The adjusted r-squared is:
Non-Power Play points: 0.336
Power Play points: 0.298
This tells us that when it comes to forwards, junior stats matter, and they matter a lot. Roughly a third of the variation in their NHL scoring is explained by their junior scoring. A basic rule of thumb is that if they cannot score in junior in their 17-year-old season, it is very unlikely they are going to be scorers at the NHL level. There are always exceptions to the rule (Lucic), but they are few and far between.
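As a rough check on the adjusted r-squared figures, the sketch below applies the standard adjustment formula, assuming each correlation comes from a one-predictor regression on all 126 players (that setup is my assumption; the exact regression is not spelled out here). The function name is hypothetical.

```python
def adjusted_r_squared(r, n, predictors=1):
    """Adjusted R^2 from a correlation r, sample size n, and predictor count."""
    r2 = r * r
    return 1.0 - (1.0 - r2) * (n - 1) / (n - predictors - 1)


N_PLAYERS = 126  # forwards in the sample

adj_non_pp = adjusted_r_squared(0.584, N_PLAYERS)
adj_pp = adjusted_r_squared(0.550, N_PLAYERS)
print(round(adj_non_pp, 3), round(adj_pp, 3))
```

Under those assumptions the non-power play figure reproduces the 0.336 above, and the power play figure lands within about 0.001 of 0.298, a gap that is consistent with the correlations themselves being rounded to three decimals.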
To use Sidney Crosby as an example, let us see what the model predicts for his best season in the NHL. The formula for the model is (0.231 + 0.404 * Non-PP Pts/G) + (0.0479 + 0.355 * PP Pts/G), where the Pts/G inputs are the player’s junior rates. The results are:
Non-Power Play: 0.89 Pts/G
Power Play: 0.41 Pts/G
Total: 1.30 Pts/G
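The formula is simple enough to wrap in a small function. The sketch below uses the coefficients quoted above; the function name and the sample inputs (1.00 non-PP and 0.50 PP Pts/G for an imaginary junior) are hypothetical and chosen only to show the arithmetic.

```python
# Coefficients quoted in the article's formula.
NON_PP_INTERCEPT, NON_PP_SLOPE = 0.231, 0.404
PP_INTERCEPT, PP_SLOPE = 0.0479, 0.355


def predict_best_nhl_season(junior_non_pp_ppg, junior_pp_ppg):
    """Return predicted (non-PP, PP, total) Pts/G for a player's best NHL season."""
    non_pp = NON_PP_INTERCEPT + NON_PP_SLOPE * junior_non_pp_ppg
    pp = PP_INTERCEPT + PP_SLOPE * junior_pp_ppg
    return non_pp, pp, non_pp + pp


# Hypothetical junior forward: 1.00 non-PP Pts/G and 0.50 PP Pts/G.
non_pp, pp, total = predict_best_nhl_season(1.00, 0.50)
print(f"non-PP {non_pp:.2f}, PP {pp:.2f}, total {total:.2f} Pts/G")
```

The same arithmetic produces the 0.89 and 0.41 components listed above for Crosby, though his junior inputs are not given in the text.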
Crosby easily exceeds this expectation, as his best season in the NHL (minimum 40 GP) was 1.61 Pts/G in the 2010-2011 season. While the model will never be close to 100% accurate, what it does do is give us a line in the sand for what is reasonable to expect in terms of production. We can also use the standard error of the regression model: doubling it gives us an approximate 95% confidence interval (spread) for where the player’s best season will fall. Basically, this means that if a player goes on to play 250+ NHL games, we can be about 95% certain his best points-per-game season will fall somewhere within the spread. Twice the standard error for non-PP points and PP points is:
Non-PP: 0.290
PP: 0.253
It is a huge spread of over +/- 25% and shows that the model is not going to be very precise, but it does provide more information than a scout just saying a player will be a 2nd or 3rd liner. Instead, we can give a point range where his career season will likely end up. For Sidney Crosby, with 95% confidence, his best season would fall somewhere between 0.94 and 1.66 Pts/G, so Crosby landed at the high end of the range.
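To make the spread concrete, here is a small sketch that turns a prediction and a half-width into a range. The component half-widths are the doubled standard errors quoted above. The 0.36 half-width for the total is simply read off the quoted 0.94 to 1.66 range ((1.66 - 0.94) / 2); how the two components combine into that figure is not stated, so treat it as an inferred value. The helper name is hypothetical.

```python
def spread(prediction, half_width):
    """Return the (low, high) range around a predicted Pts/G value."""
    return prediction - half_width, prediction + half_width


# Crosby's predicted components, with twice the standard error for each.
non_pp_low, non_pp_high = spread(0.89, 0.290)
pp_low, pp_high = spread(0.41, 0.253)

# Total: half-width inferred from the quoted range, not from a stated formula.
total_low, total_high = spread(1.30, 0.36)

actual_best = 1.61  # Crosby's best season, 2010-2011
inside = total_low <= actual_best <= total_high
print(f"total range {total_low:.2f} to {total_high:.2f}, actual inside: {inside}")
```

Crosby’s 1.61 Pts/G sits inside the range, but only just below its upper end.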
Now that we have a model to help predict NHL offensive potential, let us see if we can improve it by adding age into the equation. However, before we do that, in my next article we will see how well this same model applies to defensemen and maybe find out some reasons why it is so damn difficult to draft a defenseman.
Next Article April 9: Drafting D-Men is Hard