The art of hockey analytics
NHL teams are crunching numbers like never before
This article was published 19/04/2014 (3085 days ago), so information in it may no longer be current.
Blake Wheeler is in full stride with the puck seemingly glued to his stick. And, in that precise moment in which he makes a sharp turn to the Los Angeles Kings goal, it only seems like defenceman Slava Voynov — vainly turning in pursuit — has suddenly morphed into one of those giant orange traffic pylons.
In the next instant Wheeler is in front of Kings netminder Jonathan Quick, before slashing across the crease and depositing the puck in the back of the net.
Watching the Winnipeg Jets’ gifted winger flash the best of all his skills — world-class speed meets a deft scoring touch — is the kind of scene that often has fans sliding to the front of their seats at the MTS Centre or bolting upright from the recliner at home in anticipation.
And to paraphrase Don Cherry, it doesn’t take a “rocket surgeon” for any hockey fan to see that brilliance from the Jets’ leading scorer in 2013-14 with their own two eyes. The basic numbers have confirmed all that, telling us that Wheeler set career highs in both goals and assists this season.
But some of the common analytics now in use around the National Hockey League — the “fancy stats” as many refer to them — can tell us all that and more about Wheeler. He has a 5v5 Corsi rating of 49.9 per cent, a Fenwick of 50.3.
“Huh?” began Wheeler when asked about the Corsi or Fenwick ratings last week. “I don’t even know what that is. I’ve never even heard of it.
“Maybe one day when I’m a GM I’ll look into it.”
Welcome to the NHL of the 21st century, where a player is no longer evaluated primarily on the basics such as goals and assists, his plus-minus rating or goals-against average. In fact, if all the numbers now available to general managers were displayed on the back of a player’s hockey card, the card would be roughly the size of a highway billboard.
This is all part of the NHL’s data revolution — an analytics arms race where the 30 teams are not only waging war on the ice, but looking for every kind of advantage they can find off it by crunching numbers into a variety of metrics.
It’s all a bit cloak and dagger, too, as teams are hardly sharing their findings and best practices with one another. Case in point, Jets’ GM Kevin Cheveldayoff agreed to speak to the Free Press about hockey analytics, but politely refused to deal in specifics as it related to his squad.
“I do have strong beliefs on it,” explained Cheveldayoff. “We do some things that are maybe different than an average team, although I can’t speak for them. What we talk about is not so much following the trends but trying to get out ahead of them. Like anything, you’re always trying to be innovative whether it’s your on-ice training or off-ice training, conditioning, nutrition.
“But with the popularity of Moneyball in baseball I think nowadays everybody in sport is saying, ‘What else can we do?’”
“The problem we’re trying to solve is that there are rich teams and there are poor teams. Then there’s fifty feet of crap, and then there’s us. It’s an unfair game. And now we’ve been gutted. We’re like organ donors for the rich. Boston’s taken our kidneys, Yankees have taken our heart. And you guys just sit around talking the same old ‘good body’ nonsense like we’re selling jeans. Like we’re looking for Fabio.
“We’ve got to think differently. We are the last dog at the bowl. You see what happens to the runt of the litter? He dies.”
— Scene from the movie Moneyball in which Brad Pitt, playing Oakland A’s GM Billy Beane, outlines his new approach to finding players to his old-school scouting staff
* * *
Using analytics for any kind of edge is hardly a new phenomenon in sports, even if the buy-in from professional teams must have seemed glacier-like for early innovators such as Bill James.
An American baseball writer, James self-published his first book, The Bill James Baseball Abstract, in 1977.
Studying boxscores — legend has it while he was doing night shifts as a security guard at a pork and beans cannery in Lawrence, Kansas — James offered up info that revealed, for example, which pitchers and catchers gave up the most stolen bases.
Those findings — he called them “sabermetrics” in reference to the Society for American Baseball Research — were so well read they became the precursor for today’s sports analytics explosion.
James, FYI, is now a senior advisor on baseball operations for the Red Sox and in 2006 was named by Time as one of the 100 most influential people in the world.
James changed the thinking in baseball, and the movie Moneyball — based on the 2003 book of the same title by Michael Lewis — ultimately opened even more eyes across the sporting landscape.
But changing the thinking of old-school baseball scouts, which was one of the dominant themes in the movie, just barely scratches the surface of what analytics are providing for pro sports organizations.
Not only do analytics help teams track things like the buying habits of their customers, they also pay off on the field of play. Dallas Mavericks owner Mark Cuban, as an example, insists the use of advanced stats helped his team win the 2011 NBA championship by telling it which players were best suited to a matchup with the Miami Heat.
The Mavs, as an example, started guard J.J. Barea midway through the series — even though he was shooting 5-of-23 at the time — to utilize his speed against the Heat and a matchup with Miami guard Mike Bibby (two games later Bibby was on the bench). As well, Dallas switched back and forth between zone and man-to-man defences to take Miami out of any kind of offensive rhythm.
So instrumental was their use of advanced stats, Dallas had their director of basketball analytics Roland Beech on the bench with head coach Rick Carlisle and gave him the unofficial title of “first stat geek with a championship ring.”
The growing sophistication of analytics might be best represented by the list of featured speakers at the 2014 MIT Sloan Sports Analytics Conference in Boston, the highly prestigious gathering of sports’ forward thinkers that featured NBA commissioner Adam Silver, Indianapolis Colts quarterback Andrew Luck, San Francisco 49ers president and co-owner Gideon Yu, John Henry, the principal owner of the Fenway Sports Group, and author Malcolm Gladwell.
One of the hockey research papers at this year’s conference, as an example: Tilted ice: How certain National Hockey League Teams are Manipulating the League’s Point System. A description, from the Sloan website:
“We use generalized linear models to find (i) a significantly higher fraction of nonconference overtime games, when compared to conference ones, and (ii) a subset of teams which have most often modified their on-ice behavior, as shown through more non-conference overtime games and lower scoring rates in the third period of tied contests. The varying overtime frequencies and passive on-ice behavior appear to be unintended consequences of the league’s policies, which, under the league’s 2013 realignment, will encourage even grosser manipulation.”
Now, what has made hockey a bit slow to the analytics party is the very nature of the game. Unlike baseball, football or basketball, the fluidity of the action makes it more difficult to track outcomes and interactions between players. That said, since the 2007-08 season the NHL has provided documented play-by-play descriptions of games that detail not only face-off wins and losses, but hits, and who is on the ice at any one particular moment.
That’s the set of data from which many metrics are built; others are using video to break down the game even further and evaluate players and prospects.
The Jets have their video coach Tony Borgford do a lot of their pre-scout work and provide a layer of analytics on a daily basis for offensive and defensive zone starts, giveaways, turnovers, scoring chances for and against and where they come from on the ice.
As for the rest of what they do, well, again, that’s as secret as Cheveldayoff’s PIN code.
“Over the course of a season it’s interesting to look at all these stats because they can show you trends,” said Cheveldayoff. “But there’s a deeper level of statistical and rational analysis of what’s going on in the game that we can take into consideration as well.
“Believe me, lots of people — educated people — are offering their thoughts not just on the Xs and Os of the game but trying to find a different level of understanding. They have found ways to dive behind the scenes of NHL packages and do things that are more creative.”
Old Stats: Plus-Minus
A player gets a ‘plus’ if he is on the ice when his team scores a goal; a ‘minus’ for a goal against.
Background: first used by the Montreal Canadiens dating back to the 1950s.
FYI: The NHL awarded an NHL Plus-Minus Award to the player with the highest plus-minus statistic during the regular season from 1982–83 to 2007–08.
The highest plus-minus total ever recorded in one season was Bobby Orr’s +124 in 1970-71.
Plus-minus is now largely ignored by the analytics community because it is purely a goals-based stat — and goals are relatively rare — that doesn’t reflect puck possession, quality of linemates or a player’s defensive responsibilities.
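For readers who want to see the arithmetic, here is a minimal Python sketch of the plus-minus tally as the sidebar describes it. The goal log, team codes and skater labels are hypothetical placeholders, not real game data, and the sketch follows the sidebar's simplified definition (the official stat also leaves out some special-teams goals).

```python
from collections import defaultdict

def plus_minus(goals):
    """Tally plus-minus from a goal log.

    goals: iterable of (scoring_team, on_ice), where on_ice maps each
    team code to the list of skaters it had on the ice for that goal.
    """
    totals = defaultdict(int)
    for scoring_team, on_ice in goals:
        for team, skaters in on_ice.items():
            # +1 for everyone on the scoring side, -1 for the other side.
            delta = 1 if team == scoring_team else -1
            for skater in skaters:
                totals[skater] += delta
    return dict(totals)

# Hypothetical three-goal log (made-up skaters, not real data).
example_goals = [
    ("WPG", {"WPG": ["A", "B"], "LAK": ["X", "Y"]}),
    ("LAK", {"WPG": ["A", "C"], "LAK": ["X", "Z"]}),
    ("WPG", {"WPG": ["A", "B"], "LAK": ["Y", "Z"]}),
]
ratings = plus_minus(example_goals)
print(ratings)
```

Because every tally moves only when a goal is scored, a handful of lucky or unlucky bounces can swing a player's total, which is the weakness the analytics community points to.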
‘Scientia potentia est’ (Latin for ‘Knowledge is power’)
— English scientist and philosopher, Sir Francis Bacon (1561-1626)
Meet Eric Tulsky, a 38-year-old Philadelphia native who holds a physics and chemistry degree from Harvard, a PhD in chemistry from UC-Berkeley and now works for an energy storage firm in Silicon Valley.
“I’d rather not name the company,” he explains in a telephone interview from San Jose, “because the work we’re doing is really secretive and we are in stealth mode.”
Tulsky’s research, according to his bio, “has helped enable unique nanotechnology solutions to problems in DNA sequencing, solar energy, displays, and energy storage.”
And then there’s this: he’s a huge Philadelphia Flyers fan.
Roughly four years ago Tulsky began writing for Broad Street Hockey, a Flyers’ fan blog, after educating himself on some of the basic metrics being used by hockey analytics, like shot differential and puck possession.
Unearthing trends he found interesting, he began emailing NHL teams with his results and theories. He heard back from one team, the Nashville Predators, who asked him to crunch some numbers for them.
“Basically, they had seen some stuff on video they were interested in and they asked if there was any way I could use stats to validate what they thought was happening, that it wasn’t something that had happened a couple of times that they were overreacting to,” explained Tulsky. “They had a few questions that were along those lines and it turns out most of the things they were talking about I could back up.
“I don’t know if I changed anything because it was mostly in line with what they had seen, but it gave them a little more confidence.”
Over the last few years Tulsky has become one of the most respected thinkers in hockey analytics and has spoken at the last two Sloan conferences, promoting his thoughts on topics like offensive-zone entries, while writing for NHLnumbers.com and starting TZ Quantitative Analytics Group with partner Derek Zona. His finding — that players who carry the puck across the blue line produce twice as much offence as those who dump and chase it — is exactly the kind of hard data all NHL GMs and head coaches crave.
“It’s fun when you go to a sports analytics conference and you are having a conversation and someone from a team casually throws out, ‘Oh yeah, like you said in that article about (New York Rangers’ goaltender Henrik) Lundqvist a couple of months ago…’” said Tulsky. “That’s pretty neat.
“But hockey is not an easy game to analyze. There are reasons it’s been the slowest to come along. And these guys are acknowledged as the world’s best in hockey decision making so I don’t blame them for being skeptical and slow to buy into a bunch of math that isn’t how they grew up with the game. I didn’t play the game at any level and certainly don’t see the game the same way they do. So I can understand their hesitation to jump right in with whatever I have to say.
“But they’re paying attention now and seeing what people have developed. There’s a handful of teams that are really invested, there’s another larger handful that are trying to pay attention to it but aren’t quite sure yet what they believe and what they don’t and are trying to get a handle on what makes sense.
“Then there are teams that still just don’t get it.”
“The New Jersey Devils are looking for someone with the passion, intelligence and experience to lead its hockey analytics group. The position reports directly to the President and General Manager, Lou Lamoriello. Interested applicants should have a deep understanding of both hockey and advanced data-centric techniques to analyze games, players, and rosters. Experience on/around the ice is a plus while a deep passion for NHL hockey is required. Experience in statistics, data science, econometrics, computer science, or other data-driven fields is a must. The position will office in the Prudential Center in Newark.”
—Job posting on NHL.com
Kevin Mongeon and Mike Boyle grew up together in Iroquois Falls, Ont. They played hockey together and kept their passion for the sport. All sports, really. The two started collecting hockey data in grad school, got to know more people in the game and then began doing the consulting thing as principal owners of their company, The Sports Analytics Institute.
And, like Tulsky, these aren’t just a couple of beer-leaguers when it comes to providing analytics information to their clients, who are spread out over a variety of industries.
Boyle is an Assistant Professor of Information Systems in the David Eccles School of Business at the University of Utah and has a Master of Science from DePaul University and a Bachelor of Mathematics from the University of Waterloo.
Mongeon is an Assistant Professor of Sport Management at Brock University, and has a PhD in economics from Washington State University, an MBA from the University of Windsor and a mathematics degree from Lakehead University.
A small taste of what they can provide: not too long ago Mongeon and Boyle created PGS (Trademark) — Predicted Goals Scored — that calculated the predicted number of goals for and against while a player is on the ice and “accounts for shot characteristics including but not limited to shot location and type.”
New Stats: Corsi
Measures the difference between shots directed at the opposition’s net — including goals, saves, blocked shots and missed shots (excluding empty nets) — and shots directed at a team’s own goal. It is tracked both as a team rating and for individual players, and is an effective measurement of puck possession. The more shots directed at the opposition net by a team, or while a player is on the ice, the better the Corsi rating, which is expressed as a plus/minus or as a percentage (example: Blake Wheeler’s 49.9 Corsi reflects the fact he was on the ice for 1,103 shots for by the Jets and 1,107 shots against).
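The Wheeler example can be checked with a few lines of Python. This is only a sketch of the calculation as the sidebar describes it; the function name `corsi` is an illustrative choice, and the two shot totals are the ones quoted above.

```python
def corsi(shots_for: int, shots_against: int) -> tuple[int, float]:
    """Return Corsi as (plus/minus differential, percentage).

    shots_for / shots_against are all shot attempts (goals, saves,
    blocked and missed shots) while the player or team is on the ice.
    """
    total = shots_for + shots_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    differential = shots_for - shots_against
    percentage = 100.0 * shots_for / total
    return differential, percentage

# Blake Wheeler's totals as quoted in the sidebar.
wheeler_diff, wheeler_pct = corsi(1103, 1107)
print(f"{wheeler_diff:+d}, {wheeler_pct:.1f}%")  # -4, 49.9%
```

Expressed as a plus/minus, Wheeler sits at minus-4; as a percentage, 1,103 of the 2,210 total attempts works out to the 49.9 quoted above.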
Corsi can also be broken down further depending on offensive, defensive and neutral-zone starts (where on the ice a face-off takes place) as well as situationally depending on the score and the quality of competition or linemates on the ice.
Worth noting: Corsi does not take into account special teams or goaltending, which can greatly impact wins. But as a possession baseline, teams with strong Corsi ratings are more likely to make the playoffs and more likely to go further into the postseason.
Background: Created by former Quebec Nordiques and Edmonton Oilers goaltender Jim Corsi, who is the long-time goaltending coach of the Buffalo Sabres.
FYI: Corsi can only be tracked back to 2007-08 when the NHL began providing detailed play-by-play of its games, meaning hockey fans will never know the rating for legends like Wayne Gretzky, Orr, Gordie Howe and the like…
The Top 3 Corsi players in the NHL in five-on-five situations this season were:
1. Patrice Bergeron, Boston (61.2%);
2. Jake Muzzin, Los Angeles (61.1%) and
3. Anze Kopitar, Los Angeles (61.0%).
L.A. (56.7%), Chicago (55.2%) and San Jose (54.6%) had the top team Corsi ratings.
“We take all this raw data and put it into a model or ‘black box’ and out comes some derived statistics that when you look at them in relative terms they more accurately describe what’s going on in the game or what a player is doing from a productivity standpoint,” said Mongeon from his office in St. Catharines. “That then allows management to better evaluate the players on the team.
“It’s just a different way to look at existing players and players across the league.”
What Mongeon and Boyle find now is that with so much information out there and more clubs buying in to the power of analytics, they are doing more and more consulting for teams.
Essentially, they are hearing from sports-management types who are asking for help to just weed through all the stuff that is out there, how to determine what it all means and then use it effectively — if at all.
“A lot of teams are stumbling their way to it,” explained Boyle, from his office in Salt Lake City. “Look across the different leagues, look at baseball, at football and basketball… we’re seeing people getting hired in analytics, but he often has other responsibilities. If you look at the head of analytics for any major league baseball team he has engineers working for him. I’m not saying you’ve got to go big, but you can’t take something that is of high importance to you and then stuff it down to the lowest guy on the totem pole and ask him to do five other things at the same time.
“Now, some of the best teams in the best leagues have reached out to us for this help. We had a conversation with a top team in another league, not hockey, and they asked us for help. They said, ‘We’ve hit a plateau. We need to get better.’ But yet we have conversations in the NHL where it’s ‘You need to give me this. I know what I want.’ I’m like, ‘Respectfully, I don’t think so…’
“This is the way most teams have gone about addressing their analytical approach: we got a ton of calls after Moneyball, the movie, came out. Owners were saying ‘What’s your Moneyball approach?’ But, at the end of day, I often heard something like, ‘I just need to have some sort of analytics so that I’m covered’ as opposed to embracing it as an approach. That’s changing now. It’s like a lot of us: our parents taught us a lot of things growing up, but we had to learn a lot of them again for ourselves.
“The NHL started later than Major League Baseball,” Boyle added, “and if they were thinking clearly they would hire people from NFL and MLB to help them and give themselves a shot in the arm and skip through some of the mistakes they would make along the way. Hockey has an advantage in terms of being able to see into the future. Maybe it’s moving faster than we think. But we need a team to go out there and say, ‘It’s good for all of us as a sport if we’re all using analytics, but we’ll just continue to maintain our lead by hiring the best talent and having a better analytic agenda than the other organizations.’
“Right now it’s as if somebody has the Colonel’s secret recipe and it’s about how long can they hold onto it.”
“The 9000 series is the most reliable computer ever made. No 9000 computer has ever made a mistake or distorted information. We are all, by any practical definition of the words, foolproof and incapable of error.”
— HAL 9000, the computer from the movie 2001: A Space Odyssey.
FACT #1: The game stats sheet from the St. Louis Blues’ 3-2 win over the Jets on Oct. 29, 2013 lists both teams as having just one giveaway each.
FACT #2: The 10 games this season in which the Jets were credited with the fewest giveaways all came on the road.
FACT #3: Twice in one month — on Mar. 14 and Mar. 16, both at MTS Centre — the two teams combined for 32 giveaways. (Wpg 18/NYR 14 on the 14th; Wpg 17/Dallas 15 on the 16th).
THEORY #1: The Jets played a near-perfect “road game” in those 10 low-giveaway games (they went 6-3-1).
THEORY #2: There are some inconsistencies from NHL building to building in what constitutes a “giveaway.”
“The NHL pushes out a stats package after every single game that is standard,” said Cheveldayoff. “But the game is watched by humans and the numbers are input by humans. And there’s lots of statistics that are arbitrary. Is that a face-off won or lost? Is that a hit? In some buildings it’s not a hit. The shots on net, what makes up a blocked shot? There’s lots of things left to interpretation that go into that stat package that means you have to have an asterisk beside it.”
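One rough way to probe Theory #2 is to split a team's giveaway counts by venue and compare the averages. The Python sketch below does that using only the three Jets games cited in the facts above (one giveaway in St. Louis, 18 and 17 at MTS Centre). Three games prove nothing on their own, and the function name and data layout are assumptions made for illustration.

```python
from statistics import mean

def giveaways_by_venue(games):
    """Average giveaways per game, split into home and road games.

    games: iterable of (venue, giveaways), venue being 'home' or 'road'.
    """
    buckets = {"home": [], "road": []}
    for venue, count in games:
        buckets[venue].append(count)
    # Skip a venue entirely if no games were recorded there.
    return {venue: mean(counts) for venue, counts in buckets.items() if counts}

# Jets giveaway counts taken from the facts listed in the article.
jets_games = [
    ("road", 1),   # Oct. 29, 2013 at St. Louis
    ("home", 18),  # Mar. 14 vs. the Rangers
    ("home", 17),  # Mar. 16 vs. Dallas
]
averages = giveaways_by_venue(jets_games)
print(averages)
```

A persistent gap over a full schedule, for the visiting teams as well as the Jets, would point to the scorekeeper rather than the players.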
Here’s where two philosophies occasionally collide: those “old schoolers” in the game who trust their eyes more than any numbers and the “fancy stats” crowd who roll their eyes at anybody not open to examining their work.
An example: Jets defenceman Mark Stuart has a Corsi For percentage of 47.6, which ranked him 335th of the 437 players measured by the website extraskater.com. And yet the Jets just signed the veteran defenceman and assistant captain to a contract extension.
“What the numbers don’t tell you is who Mark Stuart is matched up against or playing with or is he with the first line or third line?” said Cheveldayoff. “What are the other teams matching up against you? Are they hard matching, are they checking a certain line hard? There’s an art to how each coach approaches a game in those respects and so a guy like Mark Stuart… he’s not the most prolific puck mover or distributor, but Corsi and Fenwick don’t talk about the hits he has every night or the blocked shots.
“There’s value in all those statistical analyses, but there is also an arbitrary nature to it. It’s like plus-minus… plus-minus tells you a part of the story, but it doesn’t tell you the whole story.
“Look at Andrew Ladd… he played 233 games in a row before he went home to be with his wife to be with his baby. You don’t think there were nights where there was no way he should have played? But he did. Statistically he might have looked horrible that night, but he played. What about a guy that has the flu and he guts it out to play?
“To me that’s part of what makes the sport exciting. You can’t sit there and say, ‘This guy is statistically better than that guy, they’re going to win tonight.’ At the end of the night you’re saying, ‘Huh, I never saw that coming.’”
And therein lies both the beauty and the mystery of sports, particularly hockey, where A+B doesn’t always equal C. The power of analytics is that it provides a layer of information: it can back up one argument and refute another.
“Statistics have value; ignore at your peril,” said Calgary Flames president Brian Burke at this year’s Sloan conference. “But it’s still an eyeballs business.”
Still, those eyeballs might not always see the entire picture. And for those who are willing to study the numbers, they can be revealing.
“We had a player that was supposed to be a great, shut-down defenceman,” Phoenix Coyotes head coach Dave Tippett told The Arizona Republic last April. “He was supposedly the be-all, end-all of defencemen. But when you did a 10-game analysis of him, you found out he was defending all the time because he can’t move the puck.
“Then we had another guy, who supposedly couldn’t defend a lick. Well, he was defending only 20 percent of the time because he’s making good plays out of our end. He may not be the strongest defender, but he’s only doing it 20 percent of the time. So the equation works out better the other way. I ended up trading the other defenceman.”
“It’s never about analytics or a skilled qualitative person,” adds Boyle, of the Sports Analytics Institute. “It’s always about both. We’re never suggesting analytics replaces a scout and I don’t imagine a world where it does. At the end of the day you always need experts.
“It’s just maybe their roles change and they become better at what they do when those analytics are a structured part of their decision making.”
That’s critical for pro sports teams trying to understand where to take all this in the future, because even the most fanatical analytics disciple will admit that all their data and research still has to be weighed against human factors.
Indeed, for as much as players are more and more being defined by their numbers — including those on the back of their jerseys — there is still a heart beating at the front.
“I’ve never really subscribed to the idea that anything in life is absolute,” said Cheveldayoff. “You approach this in the same fashion. For the people that just want to throw it out, they’re missing the boat. For the people that want to use it as gospel, I’d caution buyer beware. It’s somewhere in the middle.
“You can get too immersed in it and fail to see the other things that matter. You can’t be afraid of it. It’s another piece of information. I don’t know if there is anybody in hockey yet that has found the utopia of statistical analysis.
“At least,” Cheveldayoff added, “not yet.”