
I, Stathead

by Keith Woolner

This essay was originally posted to the Red Sox mailing list on April 30th, 1995.  Certain
parts of the original that don't stand alone separate from the conversation at the time
have been removed.

I'd like to offer some thoughts about the debate that (hopefully) will, if nothing
else, explain why the stats camp believes the things it does.  This will 
probably be a fairly long post, so those of you who aren't interested in 
such discussions should probably hit the 'Delete' key now. 
 
In understanding the stathead point of view, it's important to realize that most
of us, if not all of us, started off accepting the conventional wisdom as 
our view of the baseball world.  Batting averages were king, clutch hitters 
were scattered throughout the lineup, and wins were the best way to measure 
pitching.  No stathead I know sprang forth from his computer fully formed,
armed with formulae and spreadsheets, and with no background in baseball  
whatsoever.  We all originated from the same place. 
 
What happens at some point in a stathead's life is that certain aspects of 
the conventional wisdom tend to raise some doubts, or prove insufficient for 
answering certain questions one has about baseball.  It could be the inequity 
in assigning wins and losses to a favorite pitcher whose bullpen always blows
the lead, or whose teammates are anemic with the bat.  Or it could be
reading an article by someone with a very different viewpoint on ideas you've 
taken for granted.  Something starts the process off, and you begin to  
wonder if the conventional wisdom is always right. 
 
In order to study some of these questions, you end up turning to the tools 
of analysis that have served others in different subject areas in their quests  
for knowledge -- the scientific method, hypothesis testing, statistics, and 
inference.  
 
The reason these techniques are necessary is that people do not handle this  
type of reasoning well.  The cognitive ability of the human brain inherently  
suffers from certain shortcomings which leave it vulnerable to various 
kinds of misperceptions and errors.  Some manifestations of these errors 
include biases in representativeness, availability, and anchoring.  These lead 
to common errors in insensitivity to sample size, misconceptions of chance, 
the illusion of validity, misconceptions of regression, retrievability of 
instances, and illusory correlation, just to name a few.  All of these 
phenomena have been studied scientifically and are well documented (and
I can provide a reference or two to anyone who's interested).
 
Since people of all kinds are subject to these problems in observation, it 
should come as no surprise that baseball insiders would suffer from the 
same problems.  This isn't judgemental -- even people who are aware of these
kinds of biases have to work very hard to overcome them -- we all seem to be
"wired" to use certain heuristics that don't hold up very well under the 
rigors of logic.  But since very few baseball insiders are aware of these 
biases, and since much of conventional baseball wisdom was developed up to  
a hundred years ago, even before science had discovered these biases, it's not 
surprising that "the book" might need some rewriting in certain areas. 
 
The tools of the scientist or statistician attempt to remove or minimize the 
biases and errors.  We don't rely solely on impressions or selective memory, 
but try to systematically record the events that occur on the field.  We 
don't look at one instance, and draw a conclusion, but rather avoid basing 
judgements on small sample sizes.  We don't accept the words of an expert at
face value, but choose to investigate to see if what the experts say actually 
happens. 
 
Oftentimes, the stathead camp will, in fact, end up agreeing with the
conventional wisdom, but it's the differences that cause all the fuss.
Statheads confirmed what the experts thought about lefty-righty platoon splits, 
about not batting your best power hitters leadoff, about the relative  
importance of defense up the middle, and more.  But agreeing with what's  
commonly believed isn't controversial, enlightening, or flashy.  Therefore, the
focus of the debate over "statistics" takes place when the conclusion of those
who've studied the data disagrees with the commonly held beliefs of the  
baseball establishment. 
 
There's a natural resistance to accepting new kinds of thinking -- most of 
us (statheads included) are comfortable in the zone of the familiar.  In 
fact, there's a quote in today's SF Examiner on a totally unrelated topic 
(the Unabomber investigation, and the growing trend of anti-science sentiments) 
which could easily apply to the baseball statistics debate: 
 
"Scratch most people and you'll get a Luddite.  The fear of knowledge is very  
deep in us.  It's closely related to our fear of change..." 
 
			-- Anne Eisenberg, Polytechnic University 
				quoted by Keay Davidson 
 
I think statheads (including myself) get somewhat evangelical at times because  
we've gone through this shift in thinking.  We've broken through the wall, 
and arrived in a frame of mind where we can investigate our own questions, 
and draw our own conclusions, using tools that we feel are more reliable than
human intuition.  The fact that we use these tools to answer the appropriate 
questions leads those who haven't accepted this paradigm shift to think 
of us as "obsessive." 
 
If I appear obsessive about statistics and sabermetric methods, it's because I  
think the data are there to answer many of the questions about developing a  
winning baseball strategy.  I'm interested in winning, and therefore I care 
about the factors that go into making a winning team.  When we look at the
characteristics of winning teams, or high-powered offenses, we see certain 
factors appear over and over again (like OBP and SLG).  These factors then 
become the basis for a theory on how a good ballclub gets built.  As years 
go by, we can compare the results to what the theory would predict, and  
either discard or refine the theory until, at some point, we end up knowing 
something we didn't know before. 
 
Are there limits to statistics?  Of course there are.  Stats can't comment on  
the aesthetics of the sport.  Stats will never explain why I liked watching
the pitching motions of Joe Hesketh or Bob Welch.  Or why I desperately wish
they'd bring back the bright red caps of 1975-78.  Park factors are  
insufficient to explain the appeal of Fenway Park, or tell why Wrigley's 
outfield fans throw back opposing homers.  The box score won't record the 
electricity that sizzles in the stands as Roger Clemens gets into the
fifth inning without surrendering a hit.  The drama and human element are 
beyond the realm of what stats can or need to look at.  And those elements  
are what separates baseball from a random number generator. 
 
Even within the quantitative realm, stats aren't, and will never be perfect. 
There are error bars to every calculation (though nearly every stathead is 
often guilty of omitting them).  Stats can talk about averages and expectations 
and general conclusions.  They change what you know about the likelihoods, but that
doesn't mean that the unexpected won't happen from time to time.  I'd think 
you were a fool to send up Jose Lind to pinch hit for Frank Thomas in game 
7 of the World Series, and even if he hits a HR, that doesn't mean I'd want 
you to do it again next time.   
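
To put a rough number on those error bars, here is a minimal sketch in Python
(not part of the original post).  It assumes each at-bat is an independent
trial with a fixed chance of a hit, and it uses the normal approximation to
the binomial, which is only a rough guide for small samples.  The function
name and the hit totals are made up for illustration.

```python
import math

def batting_average_interval(hits, at_bats, z=1.96):
    """Return (average, low, high); z=1.96 gives an approximate 95% interval.

    Assumes independent at-bats with a constant hit probability, which real
    hitters only roughly satisfy.
    """
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    avg = hits / at_bats
    # Standard error of a binomial proportion.
    std_err = math.sqrt(avg * (1 - avg) / at_bats)
    margin = z * std_err
    # Clamp to the valid range for a proportion.
    return avg, max(0.0, avg - margin), min(1.0, avg + margin)

# Hypothetical totals: the same .300 average over two different sample sizes.
for hits, at_bats in [(30, 100), (150, 500)]:
    avg, low, high = batting_average_interval(hits, at_bats)
    print(f"{hits}/{at_bats}: {avg:.3f} (roughly {low:.3f} to {high:.3f})")
```

Under these assumptions, a .300 average over 100 at-bats is consistent with
anything from about .210 to .390, while 500 at-bats narrow that to roughly
.260 to .340.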
 
There's room to challenge a statistic as to whether it's measuring the right
thing, or whether it's the best measure available for what you want to capture
(as has been the case here lately with DA).  If you can convince me that 
what I'm looking at isn't correct, or can show me something demonstrably 
better, I will gladly discard it.  But I do require evidence to do so, and 
that evidence needs to hold up to the rigors that everything else is subjected 
to. 
 
Another area where stats are of less help is in scouting.  Nothing in any
of my stats will tell me whether a pitcher's motion is likely to lead to 
arm problems, or whether a hitter's stance is too far up in the box.  I'm 
totally ignorant of how to train someone to read the strike zone, or track 
down a fly ball in the outfield.  Coaches and scouts can provide first-hand, 
experienced commentary on these areas.  They may be able to turn a free-swinger 
into a more disciplined hitter, or help a pitcher learn a curveball.  I can 
measure the results of what the players did, and the relative value of those 
results to the team, and even project how the player might develop based on 
other players with a similar profile, but the scouting reports do add 
information that would otherwise be lost. 
 
There are, contrary to popular belief, baseball insiders whom statheads do 
have admiration for -- in part because they developed similar insights and  
conclusions to what the statheads have discovered.  In some cases, the 
experts did so with the help of quantitative methods, and in other cases, 
their intuition was just a little more refined.  Branch Rickey, Earl Weaver, 
and Davey Johnson are widely admired by the stathead camp, both for the 
success they built, and the techniques they used to achieve it. 
 
Some of you (most of you?) are probably wondering why I've bothered to write 
out such a lengthy reply to a seemingly endless debate.  I wrote it, in part, 
because I think many statheads tend to forget how profound a change it was 
to leave the conventional wisdom behind, and that many traditionalists have 
little idea why statheads are the way they are.  There's a fundamental  
groundwork that needs to be laid to overcome the inertia associated with 
the weight of common baseball beliefs before truer understanding can begin. 
Without that grounding, it looks very much as if statheads choose whatever 
numbers they want to support preconceived notions.  It becomes a question
of motives -- are statheads trying to obscure the issue, or shed light on it?   
The necessary trust between parties that answers that question doesn't exist,  
and the frequent disdain expressed for the beliefs of "mediots", owners, GM's,  
and players destroys the statheads' credibility before we even begin. 
 
The frustration for many statheads is that the education process is never 
complete -- there's always going to be someone new joining the list, and 
the same flame wars keep coming up again and again.  Unfortunately, this 
leads some people to develop a short fuse, and attack the first ill-considered 
comments, which compounds the problem.  What's required instead is a leap of 
faith on both sides that simply says that we're both in it because we love 
the sport of baseball (in all of its facets), and that we're both trying to 
reach understanding with different means.  Disagreements will exist, but 
hopefully, a constructive dialogue will lead to insights for both sides.   
 
Does this belong on the Red Sox list?  As long as advocates for both
sides keep posting here, it certainly does.  What we're arguing about is 
the basis by which we evaluate Roger Clemens' greatness, or Cooper's fielding, 
or Whiten's hitting, or whatever.  Until we can at least be sure what
assumptions the other side is making, we'll just keep talking past each other. 
My hope is that those few who've persevered this far into my posting will
be able to communicate with the other side just a little bit better.



Copyright 1997-2001 by Keith Woolner. All included authors retain the copyrights to their original works.