How To Ruin An Advanced Stat


If you watch sports as much as I do, you’ve no doubt recently been confronted by Some Statistic Powered by AWS. Such insights are everywhere—but why? My knee-jerk reaction to them is often, “Hm! Don’t know about that.” Not because I am an advanced stats skeptic, but because what I see is rarely what I find useful or entertaining while watching broadcasts, which is, and correct me if I’m wrong here, not what an insight is supposed to be.

AWS is Amazon Web Services, a company that offers products like machine learning tools, cloud storage, and servers to customers, and that has partnered with various sports leagues. One stated purpose of those partnerships is to enhance fan understanding and engagement in real time on broadcasts. (AWS also does more on the back end.) The company partnered with Formula One in 2018 and has since provided resources for car development and, on the fan-facing end, television graphics that range from boiling down driver performance into an abstruse percentage of the car’s hypothetical maximum performance to actually compelling corner analysis that uses drivers’ telemetry data to explain which driver was faster around a turn and why. AWS has partnered with a slew of other sports leagues since then, including the Bundesliga, NFL, PGA Tour, and NHL, to much the same end.

Most recently on the NHL front, NHL Edge IQ powered by AWS debuted a face-off probability stat called Face-off Probability. On the Sportsnet broadcast, the graphic looks like this:

My initial reaction was, I will admit, not terribly kind. On the other hand, neither are my current feelings. Where did the numbers come from? Are they accurate? Why are there three significant figures in the percentage—does the .1 percent really matter? What does this actually do to influence how I view the game? Thanks for telling me that the face-off is pretty much 50-50?

A little bit of research, and the answer to the first question—or something of an answer—crops up:

Priya Ponnapalli, senior manager at Amazon Machine Learning Solutions Lab, said Face-off Probability uses more than 70 different data points, from historic and in-game stats, as well as contextual data. Ponnapalli said the artificial intelligence takes 10 years of faceoff results — more than 200,000 draws for all the players in the league today — and uses data that includes a player’s success rate based on faceoff location, home games vs. away games and history against specific opponents. It also factors in personal data such as handedness, height and weight.

ESPN

This helps a bit. A smiley face on a fogged-up windowpane, if you will. But machine learning can be inscrutable by nature; Ben Clemens’s very good article in FanGraphs on the similar issue of probabilities shown on Apple TV baseball broadcasts discusses this as well. In sum, what machine learning does is take a set of sample data, or “training data,” that has various parameters you think influence the result (in the case of Face-off Probability: location, home vs. away, etc.) and, unlike your bog-standard analysis that merely draws a conclusion about the dataset, learn how to predict future results given different values of those same parameters.
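For the curious, here is a toy sketch of that train-then-predict idea in Python. To be loud about the assumptions: this is a plain logistic regression that I picked for illustration, not whatever AWS actually runs (that isn't public); the two inputs, `is_home` and `is_strong_side`, are hypothetical stand-ins for the 70-plus data points in the quote; and the face-off counts in `history` are invented.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1) so it reads as a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Estimated probability that the draw is won, given its features."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(rows, labels, epochs=1000, learning_rate=0.5):
    """Fit a logistic regression with plain batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(rows, labels):
            # How far off the current guess is for this one draw.
            error = predict(weights, bias, features) - label
            grad_b += error
            for i, x in enumerate(features):
                grad_w[i] += error * x
        # Nudge every parameter against the average error.
        bias -= learning_rate * grad_b / n
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


# Invented training data: (is_home, is_strong_side) -> (draws taken, draws won).
history = {(1, 1): (50, 33), (1, 0): (50, 27), (0, 1): (50, 25), (0, 0): (50, 20)}
rows, labels = [], []
for features, (taken, won) in history.items():
    rows += [list(features)] * taken
    labels += [1] * won + [0] * (taken - won)

weights, bias = train(rows, labels)
print(f"home, strong side: {predict(weights, bias, [1, 1]):.3f}")
print(f"away, weak side:   {predict(weights, bias, [0, 0]):.3f}")
```

The model never stores a lookup table of past results. It learns one weight per input, and `predict` turns any new combination of inputs into a probability, which is the part that separates it from simply summarizing the dataset.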

It’s easy for people to make predictions based on only one factor, such as head-to-head match-up, but add in more variables, and it becomes more complicated to implement and evaluate. You can look at the above quote and say that player weight feels like an unimportant factor in face-off percentage, especially since the players themselves are considered, but an imperfect algorithm might not, or vice versa. Unfortunately, the only way to validate a machine learning algorithm without having it in hand is by looking at how well its predictions actually line up with the results.
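What "lining up with the results" could look like in practice is sketched below, assuming you have logged each on-screen probability alongside whether that player won the draw. The function names and the eight sample values are made up for illustration; nothing here comes from the NHL or AWS. The idea is to group predictions into buckets and compare each bucket's average prediction with the share of draws actually won, plus a single summary number, the Brier score (mean squared error between prediction and outcome).

```python
from collections import defaultdict


def calibration_table(predictions, outcomes, n_bins=10):
    """Bucket predictions and compare them with what actually happened.

    Returns one (mean predicted probability, share won, count) tuple per
    non-empty bucket, ordered from the lowest bucket to the highest.
    """
    bins = defaultdict(list)
    for p, won in zip(predictions, outcomes):
        # A prediction of exactly 1.0 goes in the top bucket.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, won))

    table = []
    for index in sorted(bins):
        pairs = bins[index]
        mean_predicted = sum(p for p, _ in pairs) / len(pairs)
        share_won = sum(won for _, won in pairs) / len(pairs)
        table.append((mean_predicted, share_won, len(pairs)))
    return table


def brier_score(predictions, outcomes):
    """Mean squared gap between prediction and outcome; lower is better."""
    return sum((p - won) ** 2 for p, won in zip(predictions, outcomes)) / len(predictions)


# Made-up log: the probability shown on screen and whether the draw was won (1) or lost (0).
shown = [0.52, 0.48, 0.61, 0.55, 0.45, 0.58, 0.50, 0.47]
won = [1, 0, 1, 0, 1, 1, 0, 0]

for mean_predicted, share_won, count in calibration_table(shown, won):
    print(f"predicted {mean_predicted:.2f}, won {share_won:.2f}, n={count}")
print(f"Brier score: {brier_score(shown, won):.3f}")
```

A well-calibrated stat has buckets where the two numbers roughly match once the counts are large enough. For reference, a model that always says 50 percent gets a Brier score of exactly 0.25, so that is the bar to clear. Eight draws prove nothing either way; a real check needs a far bigger sample.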

A lot of stats out there nowadays are publicly available; that is not the case here. What’s left is to manually gather…