The math behind hitting in baseball when the game’s not on the line – Baltimore Sun
Now 34 and preparing for his 13th year in major league baseball, Baltimore Orioles shortstop J.J. Hardy is perceived by some to be in a state of athletic decline.
The same player who slugged 30 home runs as recently as 2011, for example, belted just 8 last year, and injuries have forced him to miss about 30 percent of his team’s games over the past two years.
But there are many ways to measure success in a sport as complex as baseball, and if a team of computer scientists at Johns Hopkins University is to be believed, Oriole fans might have reason to feel hopeful about the two-time All-Star.
A study led by Anton Dahbura, a research scientist in the computer sciences department at Hopkins, revealed a striking dichotomy: while Hardy was all but useless as a hitter in 2016 when the outcome of games was already more or less decided, he hit nearly 200 points higher, at more than .290, when the results hung in the balance.
The finding is among the more interesting nuggets to appear in “Padding the Stats: A Study of MLB Player Performance in Meaningless-Game Situations,” a 55-page paper that Dahbura made public in December. A lifelong baseball nut, Dahbura wrote it with the help of Jaewon Lee and Evan Hsia, student researchers and engineering undergraduates who also love the game.
The project examined how every major-league hitter performed last season when, by the authors’ calculations, either team in a given game had a 95 percent or better chance of winning.
Dahbura said it’s beyond the study’s scope to assign definitive meaning to such figures, but the baseball fan in him can’t help speculating that they open up new lines of inquiry in a sport that is already one of the most rigorously analyzed in the world.
“What does it tell you that Hardy did so poorly when a game was already decided, batting a mere .100 in those situations, but so dramatically better when it wasn’t?” he asked. “It’s hard to say with certainty at this point, but the numbers are so striking they’re very likely telling us something.”
The goal of the study, Dahbura said, was to raise awareness about the fact that not all at-bats during a season are equally important.
Hardy’s performance was actually a striking exception to the trend the team set out to explore.
“Some players have been able to significantly improve their overall season statistics by maximizing their performance” in so-called meaningless game situations, the paper reads.
Dahbura, 56, is one of those baseball geeks lucky enough to have a passion and a gift for mathematics and statistics. It’s a mix of talents in growing demand in baseball front offices as franchises increasingly seek to combine the benefits of computer-aided analytics with the intuitive wisdom of more old-fashioned scouting.
A former player, coach and manager at Johns Hopkins, Dahbura — now executive director of the Information Security Institute, a center for cybersecurity education and research within the university’s computer sciences department — said he first became interested in how players perform in what he calls meaningless game situations in 1999 and 2000, when the temperamental slugger Albert Belle played for the Orioles.
He always suspected Belle tried harder, upped his game and padded his personal stats in low-pressure situations that mattered little to his team. The databases of baseball information needed to test his hunch had not yet been established, however.
In the years since, Dahbura, who earned a Ph.D. in electrical engineering and computer science from Hopkins in 1981, made his way through a variety of successful gigs in private business, including at Hub Labels Inc., a Hagerstown printing company his parents founded.
In 2010, Dahbura, who is known for his work with multiple community organizations in Hagerstown, became a partial owner of the Hagerstown Suns, a Class A minor league affiliate of the Washington Nationals.
His practice of donning a Suns uniform and shagging flies during batting practice has made him a local celebrity (several articles in the Hagerstown Sun have noted that the players call him “Shag”), and he counts several current big leaguers as friends, including San Francisco Giants pitcher Matt Cain and Nationals right-hander Stephen Strasburg.
The irascible Belle retired years ago, but Dahbura’s interest in “meaningless-game situation” performance persisted.
Last year, he and his research team began tapping into baseball databases now available to the public to quantify the effect.
They divided their project into two phases. First, the researchers defined “meaningless game situations” (somewhat arbitrarily, Dahbura conceded) as scenarios in which history shows there is less than a 5 percent chance of the trailing team overcoming its deficit.
A sample of more than 9,600 big-league games played between 2013 and 2016 revealed that the threshold is met if a team has a seven-run lead in the first inning, a six-run lead in the second through seventh innings, a five-run lead in the eighth or a four-run lead in the ninth or later.
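For readers who want to apply that rule to a box score, the thresholds can be written as a short lookup. This is only an illustrative sketch, not the researchers' code: the function names are made up for this example, and it assumes that "a seven-run lead" means a lead of at least seven runs and that the rule applies equally to the leading and trailing team.

```python
def min_meaningless_lead(inning: int) -> int:
    """Smallest lead, in runs, that the article reports as putting a game
    in a "meaningless" situation for the given inning."""
    if inning < 1:
        raise ValueError("inning must be 1 or greater")
    if inning == 1:
        return 7
    if inning <= 7:  # second through seventh innings
        return 6
    if inning == 8:
        return 5
    return 4  # ninth inning or later


def is_meaningless(inning: int, run_diff: int) -> bool:
    """True if the score gap meets the threshold for this inning.

    run_diff is one team's runs minus the other's; the sign is ignored
    (an assumption of this sketch), so it works from either dugout.
    """
    return abs(run_diff) >= min_meaningless_lead(inning)
```

Under these assumptions, a six-run lead in the fifth inning would count as a meaningless situation, while a four-run lead in the eighth would not.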
Nearly 22,000 plate appearances by 781 hitters took place under such circumstances in 2016, 11.4 percent of the more than 184,000 total plate appearances by hitters that season.
In the second phase, the team ran the numbers on every major-league hitter for the year, working up lists of the top ten players in several offensive categories. Even longtime fans might find many of the results surprising.
Some of the sport’s most feared hitters performed formidably when it mattered least. Those included the Colorado Rockies’ Carlos Gonzalez (10 homers, the most in baseball) and the Toronto Blue Jays’ Edwin Encarnacion (a .442 batting average, 179 points higher than his overall mark).
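A "point" of batting average is one thousandth, so the gaps quoted in this story are simple subtraction. The sketch below, with helper names invented for this example, works in whole points to sidestep floating-point rounding. The overall mark it derives for Encarnacion is implied by the two figures quoted above rather than stated in the story.

```python
def to_points(avg: float) -> int:
    """Convert a batting average such as 0.442 to whole points (442)."""
    return round(avg * 1000)


def point_gap(avg_a: float, avg_b: float) -> int:
    """Difference between two batting averages, in points."""
    return to_points(avg_a) - to_points(avg_b)


def format_avg(points: int) -> str:
    """Render whole points below 1000 in the usual style, e.g. 263 -> '.263'."""
    return f".{points:03d}"


# Encarnacion: .442 in meaningless situations, 179 points above his overall mark,
# which implies an overall average of .263 (derived here, not quoted in the story).
implied_overall = format_avg(to_points(0.442) - 179)

# Hardy: .100 in meaningless situations versus at least .290 otherwise,
# a gap of at least 190 points ("nearly 200").
hardy_gap = point_gap(0.290, 0.100)
```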