How data finds the truth — in baseball and in business – CIO
NEW YORK — If there is one thing Bill James wishes more people understood about sabermetrics — the method of empirical analysis of baseball that he had a prominent role in pioneering — it’s that the data is not the point. The point is to use the data like a razor to cut through false convictions to find the truth.
“The reason that understanding is so difficult to build in baseball is that there’s an entire industry of people selling nonsensical ideas about the data all the time,” James says.
If a broadcaster tries to tell you how a batter performs against left-handed pitchers on Tuesday nights under a full moon, James advises shutting the sound off.
“All the stuff they tell us during a broadcast is actually a measurement of nothing — meaningless measurements that form an entire universe,” he adds.
Billy Beane, the general manager and minority owner of the Oakland Athletics baseball team, agrees.
“There’s so much noise around it and everyone has an opinion,” Beane says, noting that he doesn’t watch his team play because he gets too emotional.
Instead, Beane studies the data after the fact, which he says lets him make rational decisions.
Winning an unfair game
Beane’s predecessor as GM, Sandy Alderson, laid the groundwork for the A’s use of sabermetric principles to guide its hiring, but it was Beane and his front-office staff who showed the world that the right data could let a team with a small payroll identify underpriced talent and, in turn, compete toe-to-toe with some of Major League Baseball’s biggest-spending teams. As a result, Beane became the subject of Michael Lewis’ 2003 book on baseball economics, “Moneyball: The Art of Winning an Unfair Game.” In the 2011 film based on the book, actor Brad Pitt played Beane.
“I’m a former player, allegedly, if you saw my stats,” says Beane, who between 1984 and 1989 played as an outfielder for the New York Mets, Minnesota Twins, Detroit Tigers and Oakland Athletics. “I was judged the traditional way: the eye test. I was measured by skills that weren’t really relevant to playing the game.”
“I was a misjudged asset in my own career,” he adds. “Then the whole world opened up to me. [Sabermetrics] turned the world and the game into a mathematical equation that was easy to understand. And now sports teams are trying to hire the same people that NetSuite wants to hire, that Google wants to hire.”
Ignorance as opportunity
Speaking at the NetSuite NYSE Disruption Summit at the New York Stock Exchange on Friday, James and Beane both said that in business, as in baseball, the key to making the most of the data you collect is to let go of preconceptions and start treating areas of ignorance as mines of opportunity.
“Ignorance is inexhaustible and a vast resource to all of us,” says James, now senior advisor on Baseball Operations for the Boston Red Sox, a far cry from his early days writing about baseball stats while working as a night-shift security guard at Stokely-Van Camp’s pork and beans cannery. “Whenever you find something that you do not know, that you could know, that’s gold. That’s an opportunity to turn lead into gold.”
“It’s taking bucketfuls out of an ocean,” he adds. “Every field is shot through with things that people are convinced are true but just aren’t true. Areas of growth are based on discovering those things you know that just aren’t true.”
The trick is to look into that sea of the unknown and understand what is quantifiable and what isn’t.
“Baseball is at a point in which the new data has piled up so deep that it will take us a long time to dig our way out of it,” James says. “The first thing that will happen with all that new data is that a lot of fictional beliefs will grow out of it; there will be a lot of mushrooms that grow out of it. Eventually we’ll get to the value, but it will take us quite a while.”
Can you measure chemistry?
One case in point: team chemistry. It’s often cited, but rarely quantified.
“The word collaboration is synonymous with chemistry in sports,” Beane says. “Despite the turnover, [the Oakland A’s have] always been known for having great chemistry. But when we’re doing poorly, the chemistry is poor. There is chemistry, but it’s a byproduct of success. Usually bad baseball teams have bad chemistry.”
“I think chemistry on teams is very real and I think it has value, but we’re very, very bad at quantifying it,” James adds. “Sometimes the guys who contribute the most to chemistry one year contribute the most negativity the next year. Success breeds entitlement.”
“If they become entitled, they become too expensive for us,” Beane quips.
Another example: arguments in the sabermetrics community around pitch framing, the ability of a catcher to present the catch of a pitch in such a way that the umpire calls a strike even if the pitch just missed the strike zone. Some zero in on pitch framing as a key measure of a catcher’s defensive ability. Others consider the metric another bit of esoterica, or question the ability to quantify it at all.
“Everything you can quantify, we’re going to quantify,” Beane says. “A catcher called a good game? How do you quantify that? Pitch framing? Every question that we can boil down to raw objective information is what we’re trying to do. But unless your data is free of any human collection, there’s going to be some error there.”
In the end, James and Beane say the next frontier for sabermetrics will be the study of actual production versus potential.
“We haven’t studied that as well as we should have or could have,” James says. “Most of the industry is making decisions based on what the players did last year. If you could make a decision based on what the player is going to do forward, you can turn it into a marketable number, which is production.”
“We’re essentially buying stocks and we want to pay for future performance, not past performance,” Beane adds. “Nobody cares what you did the previous five years.”