Everything You Thought You Knew About Baseball Might Be Wrong – The Federalist

The game of baseball changes very little from year to year. There are occasional, minor shifts, like the introduction of the automatic intentional walk this year, or tweaks like the Utley rule that refine how baserunners can slide into a base, but the game itself hardly changes, and that continuity and imperturbability are a part of its charm.

Baseball analysis, on the other hand, has undergone a massive realignment in the past two decades. New technology for measuring success on the field and new statistics for analyzing the old data and the new have turned baseball conversations on their ear and added a whole new vocabulary of WAR, wOBA, WPA, and FIP to the familiar measurements of batting average, RBI, wins, ERA, and saves.

If you’re a baseball fan who wants to expand your knowledge of the new stats but doesn’t know where to begin, Keith Law’s Smart Baseball: The Story Behind the Old Stats That Are Ruining the Game, the New Ones That Are Running It, and the Right Way to Think About Baseball is for you.

Out With the Old

For almost as long as men have played the game, they have measured their performance, and many of the stats we use today date back to baseball’s early days. The venerable measurement of batting average, flashed across our television screens at the beginning of each at bat, was devised by sportswriter Henry Chadwick in the mid-19th century. Chadwick also invented the earned run average (ERA) to measure pitchers’ performance. There is nothing complex about these calculations, and they convey information to the reader about what happened in the games. As simple, accessible units of measurement, they served their purpose and still do.
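
To show just how simple these two calculations are, here is a minimal sketch in Python. The formulas are the standard ones (hits per at bat, and earned runs per nine innings); the function names and the season lines are hypothetical, invented only for illustration.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Hits divided by at bats; 0.0 when there are no at bats."""
    return hits / at_bats if at_bats else 0.0


def earned_run_average(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed per nine innings pitched.

    Assumes innings_pitched is a true fraction (for example 6 + 2/3),
    not the box-score shorthand "6.2".
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


# Hypothetical season lines, invented for illustration only.
print(f"{batting_average(150, 500):.3f}")      # 0.300
print(f"{earned_run_average(70, 210.0):.2f}")  # 3.00
```

Both numbers describe what happened on the field; neither tries to separate the player's own contribution from his circumstances.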

Less useful, in Law’s eyes (and the eyes of most modern observers of the game) are the pitching statistics of wins and saves. The win (or “pitcher-win” as modern statisticians name it) sounds as simple as batting average or ERA: how many games did the team win while the pitcher was the pitcher of record. But it is that last phrase, “pitcher of record,” that is deceptive. Where 19th-century pitchers routinely finished the games they started, assigning credit for a win was simple. Much like the equivalent win stat for goalies in hockey, the appropriate recipient of the win was usually clear: the player who starts and finishes the game at the relevant position. The vast increase in the use of relief pitchers has destroyed this connection, and made the stat less useful in evaluating a pitcher.

The larger problem for Law and other practitioners of the new stats is that assigning a win or loss to a pitcher often credits or blames him for things beyond his control. This is so obviously true that it is scarcely necessary to argue it: A pitcher can pitch the best game of his life and lose it because of his teammates’ errors; he can give up ten runs and get the win because his teammates scored eleven. Winning games does not necessarily correlate with pitching well.

But this is just a more egregious example of the conflict between the old stats and the new: The former seek only to explain what happened; the latter want to find useful indicators of a player’s actual talent level and value to his team. Law, who once worked as an analyst for the Toronto Blue Jays, is understandably more concerned with player value, as are the current employees of baseball teams.

For the casual fan, though, non-predictive stats still have value in explaining what happened. Does Eric Bruntlett’s execution of an unassisted triple play tell us anything of value about his talent as a fielder? Not really. But it does tell us about a meaningful thing that happened in the game, a rare feat that was enjoyable to watch.

Can the Save Be Saved?

Law reserves a special ire for those stats that not only fail to predict performance but also, in his mind, distort the way the game is played. Runs batted in (RBI) comes in for criticism, as it elevates players based more on the situations they find themselves in than on their individual performance. Hitting with men on base, he explains, is not a separate skill from hitting generally, and studies on the subject prove the point. But the save stat is clearly the one Law finds most problematic to the game. It is hard to argue with his conclusions.

The save is among the newest of the old stats, invented in 1960 by a Chicago sportswriter and codified in baseball’s rules in 1969. The intent was valid: an effort to demonstrate the value of relief pitchers, who were by then beginning to play a larger role in the game. But the narrowness of the stat, in only rewarding the last pitcher in a game and doing so largely independent of their actual performance, replicates many of the problems of the pitcher-win metric. Instead of measuring how well the pitcher performed against the batters he faced, the rules of the save assign a mystical importance to the final inning and reward all pitchers who finish the job equally, whether they dominate their appearance or just barely squeak by.

What’s worse is the way the save has changed the way the game is played. Any measurement that changes the thing it is measuring diminishes its own value as a metric, and there is none worse in that respect than the save. As Law explains, the elevation of the closer over middle relievers has created the appearance of a stratification that the data do not bear out. Managers routinely save their best pitcher for the ninth inning when their lead is between one and three runs (the conditions that the rules define as a save opportunity) rather than for the highest-pressure situations, where his services might be more needed.

Facing the heart of the opposing lineup in the eighth inning might be a more important situation than facing the bottom hitters in the ninth. Pitching in the ninth when the score is tied might be more important than pitching in extra innings with a lead. But the conventional wisdom that has grown up around late-game bullpen management is that the best relief pitcher is your closer and closers must get saves. Managing to a stat is like teaching to a test. The more you do it, the less valuable the metric is, and the less value is being imparted to the ultimate product: winning games.
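
The save-opportunity condition described above is simple enough to write down, which is part of the problem. The sketch below encodes only the simplified version given in this review (ninth inning or later, lead of one to three runs). The official scoring rule has additional clauses, such as entering with the tying run on base, at bat, or on deck, or pitching at least three innings, and those are deliberately left out. The function name is hypothetical.

```python
def is_save_situation(inning: int, lead: int) -> bool:
    """Simplified save check: ninth inning or later, lead of 1 to 3 runs.

    This is an assumption-laden reduction of the real rule, which has
    further clauses that are not modeled here.
    """
    return inning >= 9 and 1 <= lead <= 3


# Game states mentioned in the text, as (description, inning, lead).
situations = [
    ("eighth inning, one-run lead, heart of the order due up", 8, 1),
    ("ninth inning, tie game", 9, 0),
    ("ninth inning, three-run lead, bottom of the order due up", 9, 3),
]

for description, inning, lead in situations:
    print(f"{description}: save situation = {is_save_situation(inning, lead)}")
```

Under this rule of thumb only the last, least tense situation earns the closer a save, which is exactly the mismatch between the stat and the stakes that Law objects to.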

In with the New

If the old measurements are not getting the job done, what is the answer? In the second section of the book, Law introduces many of the new statistical constructs that are driving the way modern baseball teams are assessing their players’ performance. He promises from the start to keep the math to a minimum, and my fellow liberal arts majors will be pleased to hear that the book is admirably free of abstruse calculations.

The new stats’ formulas are more complex than Chadwick’s were, and they are spelled out in this book if you want to read them. Skimming over them will not detract from the reading experience, though, as Law does a fine job of explaining them in prose and cites more books on the topic for newly minted fans of the stats who want to investigate further.

The thrust of the statistical revolution—also known as sabermetrics—is the pursuit of stats that isolate a player’s past individual performance in a way that correlates with his future performance. On-base percentage (OBP) was among the first of these to be developed, even before the growth of sabermetrics as a field. It is not dissimilar to batting average, but it measures the result: reaching base, whether by getting a hit or by drawing a walk, where earlier stats attributed walks solely to the pitcher’s mistakes rather than the batter’s patience and discerning eye at the plate.
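
A short sketch makes the difference from batting average concrete. The formula is the standard one, which also counts hit-by-pitches and puts sacrifice flies in the denominator; the two hitters are hypothetical, invented to show how identical batting averages can hide very different on-base skills.

```python
def on_base_percentage(hits: int, walks: int, hit_by_pitch: int,
                       at_bats: int, sacrifice_flies: int) -> float:
    """Times on base divided by the plate appearances that count for OBP."""
    chances = at_bats + walks + hit_by_pitch + sacrifice_flies
    if chances == 0:
        return 0.0
    return (hits + walks + hit_by_pitch) / chances


# Two hypothetical hitters, both batting .280 (140 hits in 500 at bats).
free_swinger = on_base_percentage(hits=140, walks=30, hit_by_pitch=0,
                                  at_bats=500, sacrifice_flies=0)
patient_hitter = on_base_percentage(hits=140, walks=90, hit_by_pitch=0,
                                    at_bats=500, sacrifice_flies=0)

print(f"{free_swinger:.3f}")    # 0.321
print(f"{patient_hitter:.3f}")  # 0.390
```

Batting average treats these two players as equals; OBP credits the second for the extra sixty times he reached base without making an out.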

Other stats followed. Weighted on-base average (wOBA) gets deeper into the effects of a batter’s performance at the plate, measuring the value of various ways of reaching base based on how likely those events are to result in the team scoring a run. Win probability added (WPA) takes in even more offensive information, including baserunning, and evaluates how each plate appearance increases or decreases the team’s likelihood of winning.
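
For readers who want to see the shape of these two stats, here is a rough sketch under stated assumptions. The wOBA weights below are illustrative approximations, not the published values for any season (the real weights are recalculated every year), and the denominator is simplified to plate appearances. The WPA function assumes someone has already supplied the batting team's win expectancy before and after each plate appearance; producing those numbers requires historical data that is not modeled here. All names are hypothetical.

```python
# Illustrative run-value weights (assumed, not official for any season).
ILLUSTRATIVE_WEIGHTS = {
    "walk": 0.69,
    "hit_by_pitch": 0.72,
    "single": 0.89,
    "double": 1.27,
    "triple": 1.62,
    "home_run": 2.10,
}


def weighted_on_base_average(events: dict, plate_appearances: int,
                             weights: dict = ILLUSTRATIVE_WEIGHTS) -> float:
    """Sum each event count times its weight, then divide by plate appearances.

    Events without a weight (outs, for instance) contribute nothing.
    """
    if plate_appearances <= 0:
        return 0.0
    total = sum(weights.get(name, 0.0) * count for name, count in events.items())
    return total / plate_appearances


def win_probability_added(win_expectancy_changes: list) -> float:
    """Sum of (after - before) win expectancy over a player's plate appearances."""
    return sum(after - before for before, after in win_expectancy_changes)


# Hypothetical season line and two hypothetical plate appearances.
season = {"single": 100, "home_run": 20, "walk": 50, "out": 430}
print(round(weighted_on_base_average(season, plate_appearances=600), 3))
print(round(win_probability_added([(0.50, 0.62), (0.40, 0.35)]), 2))
```

The design difference is worth noticing: wOBA values each event the same regardless of the game situation, while WPA values the same event differently depending on when it happens.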

Wins above replacement (WAR) goes even further; as FanGraphs describes it, WAR is a “comprehensive statistic that estimates the number of wins a player has been worth to his team compared to a freely available player such as a minor league free agent.” New stats are being invented all the time: just this month Nate Silver created the “Goose Egg,” an attempt to evaluate relief pitchers more accurately than the save stat does.

No Need To Choose

There’s a lot that goes into all of those stats and, in the case of WAR, there are disagreements on how to calculate it, but the point is all the same: how to capture player value in a way that conveys the most information quickly and accurately. Reaction to sabermetrics has been mixed. Some fans enjoy the increased ability to understand the game. Others feel that the trend is a part of the Big Data revolution that takes the spontaneity and artistry out of life. Sure, the teams want to know about value, but “real fans” want to watch a game, not a danged equation.

Much of that latter point is a reaction against a false choice. Learning the new math of baseball does not mean forgetting the beauty of the game. Understanding the launch angle and exit speed of a home run (something MLB’s new Statcast system allows us to do) does not mean that we cannot also leap out of our seats with excitement when that same round-tripper puts our team ahead of a hated rival.

Law addresses this in the later chapters of the book when he talks about the changing role of scouting. New stats and new ways of collecting data take some of the guesswork out of evaluating potential draft picks, but that only redefines the talent scout’s job. It does not eliminate it. Freed from having to guess at a pitcher’s arm strength or a player’s bat speed, the scout can concentrate on the intangible aspects of the person: his attitude, his intelligence, his work ethic, and how he would fit in on a team. The objective data are now freely available to MLB teams. Getting to better understand the players as people, as Cubs president of baseball operations Theo Epstein recently suggested on the Pardon My Take podcast, may be the next “Moneyball” revolution.

While Law’s description of mathematics is readable, some of his historical analysis falls short, most notably where he states three times that MLB raised the height of the pitcher’s mound in 1968, something that did not happen. (The mound was the same size in 1968 as it had been since 1904. MLB lowered the height in 1969 to increase offense. They did increase the height of the strike zone from 1961 to 1968, which may be the source of his confusion.) On other historical elements, he is stronger, and his case for Lou Whitaker’s induction to the Hall of Fame, for example, is quite convincing.

Quibbles aside, Smart Baseball is a readable, enjoyable book, and a helpful introduction to sabermetrics for any baseball fan who is interested in the new stats but intimidated by the opaque math that goes into them.