Not long ago, a mild-mannered former mathematics professor named Sean Forman met Cal Ripken at a Baseball Hall of Fame gathering. This is the sort of thing that Forman could never have imagined he’d be doing 16 years ago, before he created a website, Baseball-Reference.com, that completely altered the way Internet-savvy fans consume the sport. In fact, this is the sort of thing that still blows Forman’s mind, especially when Ripken said to him, “I emailed you guys the other day.”
Until then, Forman knew nothing about the email. He is now the head of a company of seven employees (including himself) called Sports Reference LLC, which began back in 2000 as a way to avoid writing his Ph.D. dissertation at the University of Iowa, an attempt to compile the first comprehensive baseball encyclopedia on the Internet, and has now expanded to include every major American sport. (Still, baseball, given its historical bent, is at the heart of the company’s mission.) Forman’s business is not exactly huge, but somehow the person who handles customer service had neglected to inform him of Ripken’s inquiry (a stathead question germane to Ripken’s secondary career as a broadcaster), which led Forman to impose a new rule: When a Hall of Famer sends an email, you tell the founder.
It’s a measure of how indispensable Forman’s website has become that everyone with even a passing interest in baseball uses it at some point, from casual fans to sabermetric seamheads; recently, IBM announced it would feed data from Baseball-Reference to its supercomputer Watson to help develop a fantasy-baseball website. Baseball-Reference gets multiple emails a day from users seeking information, offering corrections or asking for help; occasionally, they get emails from players or former players who are involved in financial disputes asking the site to remove their salary information (though former Toronto Blue Jays pitcher Dave Stieb wrote in to fix his).
“In the past few years there’s been an explosion in analytical baseball websites,” says Grantland baseball writer Jonah Keri. “And yet, Baseball-Reference remains indispensable. It’s perfectly laid out in terms of finding historical data you want, or even something as simple as David Price’s track record in the playoffs. If B-Ref ever had a prolonged outage while I was on deadline with a research-heavy story, I’d hurl myself out of the nearest window.”
When Forman began all of this, of course, virtually nothing was available online. It was 1999, and Forman was close to finishing his dissertation in applied math and computational sciences, but he also had a side interest in sabermetrics, writing articles and blogging. He realized then that the Internet would be the optimal place for a baseball encyclopedia. Historical statistics back then were contained almost entirely in massive tomes like Total Baseball (which had a minimal web presence) and Bill James’ oeuvre of Baseball Abstracts (which were no longer being published); I was working at a newspaper in Ohio back then, and people would call the office multiple times a day from various drinking establishments to have us dig into encyclopedias and settle bets for them. And so Forman built his original database from a CD-ROM included in the Total Baseball books; eventually, he began licensing data by more above-board means, and by 2006 he’d quit his job as a professor at St. Joseph’s University in Philadelphia and begun working full-time on the site. “I definitely did not intend to create a company with seven employees,” he says.
But what makes Baseball-Reference so consistently brilliant is its inherent simplicity. While Forman admits the site takes a pro-sabermetric bent, it mostly tries to subsume any sort of editorial advice in favor of objective compilation of the numbers. It is not the kind of site that would ever stop listing, say, pitcher win totals or RBIs merely because those categories have been labeled as irrelevant by the sabermetric crowd. The only criterion for including something on the site, Forman says, is whether it’s going to answer a user’s question.
“We respect the history of the game,” Forman says. “We’ll list [Wins Above Replacement], but we’ll also go back and try to get the correct RBI totals for 1925. I think it’s a false dichotomy, personally.”
And because Forman refuses to adhere to doctrines, Baseball-Reference is not just a site for hardcore stat-geeks, and it is not just a site for casual fans. It is a site for a general audience that can also get weirdly specific. Often, Forman says, employees of varied front offices – who have access to their own internal databases – will choose to use Baseball-Reference instead, due to its ease of use. “If our Dominican Summer League stats don’t get updated in time,” Forman says, “we’ll get a guy in, say, the Mariners’ front office emailing us to ask where they are.”
There is nothing complex or daunting about Baseball-Reference’s player pages. It’s simple to sort and collate the numbers in a myriad of ways, and the pages themselves often unravel on one’s computer screen or smartphone like tiny modernist masterpieces. Take Rickey Henderson’s page, one of Forman’s all-time favorites, studded with bold numbers recognizing his league-leading abilities; or take Barry Bonds’ page, which, when stripped clean of the PED argument, is merely a compilation of mind-boggling statistical brilliance. “His peak years are pornographic,” says Keri, who also has a thing for the pages of middling baseball veterans like Ray Oyler.
And so there are pages like Bonds’ that draw immense amounts of traffic (each page is available to be sponsored, which is one of B-R’s primary revenue streams), and there are random pages that manage to capture the quirks of a sport that has been around long enough to engender nearly every imaginable variation. Want to see a guy get caught stealing second base twice in the same inning? It’s there. Or that time Robin Ventura hit a walk-off grand slam in the 1999 NLCS, only to have it be scored a single because three of the runners never crossed home plate? Here it is. There is a record of Tim Lincecum throwing a pitch on a 4-2 count, and there is a record of an inexplicably weird situation involving Bengie Molina of the Giants. It’s all there, a game often riddled with complexity boiled down to its essence, with inside jokes buried in the mix (e.g., the compilation of Oddibe McDowell’s water bills at the bottom of his page).
Nearly every day, Forman says, he finds himself studying the page of a player he knew little-to-nothing about before he clicked. The site has become a repository, a metaphor for both baseball’s enormity and its inherent simplicity, the best possible use of the Internet’s bandwidth. Forman and his team continue to work to optimize Baseball-Reference’s plan for mobile, since most of their visits come from smartphones at this point; between ads and premium memberships, they’re making decent money, though Forman recently attempted to explain on Twitter why ad-blocking software could prove a slippery slope for sites like his. “We’re doing fine,” he says, “and we’ll continue to do fine. But I’m not sure people would prefer an ad-free Internet. I think they would be shocked at how much content they would miss.”
But for now, Baseball-Reference misses almost nothing. Forman likes to tell the story of how Royals’ vice president of baseball operations George Brett was once asked which pitcher he hit the best, then spent the next two hours in a rabbit-hole on B-R looking up various games he played in. And that’s the inherent magic of the site: It manages to anticipate the answers to your questions before you can even think of them.
Michael Weinreb is the author of Season of Saturdays: A History of College Football in 14 Games, now out in paperback. You can find him on Twitter @michaelweinreb