"Support-Neutral" Statistics -- A Method of
	   Evaluating the True Quality of a Pitcher's Start
			  Michael Wolverton
		     870 E. El Camino Real, #168
		       Mountain View, CA  94040
appeared in _By_The_Numbers_, SABR's Statistical Analysis
subcommittee's newsletter, Volume 5, Number 4, Dec. 1993.
Motivation
----------
In recent years, we've seen the development and growing use of two
measurements designed to evaluate starting pitchers on a game-by-game
basis: Quality Starts and Game Score.  Both measures are attempting in
some way to look at the quality of each outing the starter has, rather
than looking at the average or cumulative performance over the course
of a year, as ERA does.  But both measures have their weaknesses as
total measures of pitching performance.
The arguments against Quality Starts are well known.  Detractors point
out that the worst qualifying outing -- 6 innings and 3 earned
runs -- is not "quality" at all. A related objection is that Quality
Starts makes no attempt to quantify the degree of quality a start
has -- 6 innings, 3 runs is the same as 8 innings, 2 runs which is the
same as a 9-inning shutout.
Partly in answer to these objections, Bill James developed the Game
Score, which combines a starter's box score numbers (IP, H, ER, R, BB,
K) using weights, where the weights are assigned such that the league
average score is around 50, the best imaginable score is around 100,
and the worst imaginable score (by someone outside the state of
Colorado) is around 0. Game Score is acknowledged as an interesting
measure of "game domination" by a starter, but it has weaknesses as a
total measure of starter quality (i.e., his contribution to team
victories): it's too dependent on strikeouts, possibly too dependent
on hits and walks (after all, the number of runs given up is really
the only thing that matters), and it isn't park-adjusted.
Despite the weaknesses of these two measures, looking at a pitcher's
starts game-by-game is still a good idea.  Looking at each start's
contribution to winning, rather than cumulative run-prevention over
the course of a year (ERA or Pitching Runs), can help us answer
questions like: Given equal ERAs, do some pitchers pitch in a way that
will tend to win more games than other pitchers? In particular, is it
better for a starter to be flaky --  either very good or very bad on a
given day -- or consistently average?  Does the park have a smaller
influence on the value of the start when the start is very good or
very bad?
So here's what we'd like out of a stat measuring the quality of a
start:
 - it should depend only on numbers appearing in a box score.
 - it should be independent of a pitcher's support, both from his
   team's offense and from his team's relievers.
 - it should be park-adjusted.
 - the resulting measurement should be in terms of some kind of
    meaningful unit, such as games or runs, rather than being a unitless
    index (and, ideally, it should be obvious to any baseball fan what a
    good or bad score in those units is).
 - most importantly, it should reflect the contribution that a start
   had toward winning the game.
I've developed a couple of measurements that meet these five
requirements. (Actually, the ideal stat would also be very easy to
compute, but hey, 5 out of 6 isn't bad, right?).  Support-Neutral Wins
and Support-Neutral Losses (SNW and SNL) measure the expected number
of wins and losses a pitcher would have with his outings, if he got
average support from his offense and his bullpen. Support-Neutral
Value Added (SNVA) measures the total number of games that an average
team would win given the pitcher's starts, over the number of games
they'd win with a league average starter.  All of these stats are
computed using only the number of innings pitched, number of runs
given up, and the park the game was pitched in.  SNVA may be a
slightly more accurate measure of a starter's actual value compared to
league average, but the SNW/L record has the advantage of being
flexible and more understandable. Both of them, in my opinion,
constitute an improvement over Thorn and Palmer's Pitching Runs as a
total measure of starter worth.
Support-Neutral Wins and Losses
-------------------------------
Support-Neutral Wins is calculated by determining the probability that
a pitcher would get the win for each start he has, and then summing up
the individual probabilities over all of his starts. The sum gives you
the number of wins a pitcher could expect to get for an average team,
given his performances.  A "performance" here consists only of the
number of innings pitched, the number of runs (not earned runs) given
up, the park in which the game was played, and whether the pitcher was
at home or on the road -- SNW assumes that these are the only things
which influence whether the pitcher wins or loses.
The rest of this section describes the formulas that are used to
calculate SNW; readers who aren't interested in the specific methods
of calculation are welcome to skim or skip to the next section.
To calculate the probability that a pitcher wins the game, we just
need to look at the definition of a win: A starting pitcher wins the
game if his team has the lead when he's taken out of the game, and
they never relinquish that lead.  So, for a given outing by the
starter, the probability that he gets the win is just the probability
that his team will take the lead (score more runs than the starter
gives up) by the time he's removed times the probability that they'll
hold that lead until the game is over.
To put this into a formula, we just need to determine and add up the
probabilities of all the different ways his team can take and hold a
lead:
      SNW(i, r) = sum [j = (r+1) to INFINITY] of
                     PScore(i, j) * PHold(j-r, 9-i),
where
SNW(i, r) is the probability a starter who goes i innings and gives up
r runs will get the win, given an average team playing behind him.
PScore(i, r) is the probability that an average team will score r runs
in i innings.
PHold(k, i) is the probability that an average team will hold a k-run
lead (without ever relinquishing it) for the i remaining innings until
the end of the game.
The above formula is actually a simplification of the formula I use in
my software to calculate SNW (I'll refer to the formula in my software
as the "real" SNW formula).  In order to make it easier to explain, I
made a few assumptions to get the formula above.  First, that formula
assumes that the starter comes out of the game after pitching a full
inning (i.e., he pitches no extra thirds of an inning).  The formula
is complicated somewhat when thirds of an inning are taken into
account, but the same general idea applies: his team must be leading
when he comes out, and his team must hold the lead for the extra
thirds in the inning he leaves, plus all the rest of the remaining
innings.  The real SNW formula does take thirds of an inning into
account.
Second, the above formula doesn't explicitly take the park into
account.  To take park effects into account, we need to make SNW,
PScore, and PHold be functions of the park in which the game is
played.  A hitter's park should inflate the probabilities that an
average team will score a high number of runs, and a pitcher's park
should do the opposite.  The real SNW formula does take park into
account.  I talk a little more about my handling of park effects in
the Appendix.
Third, the above formula doesn't take into account whether the starter
is pitching at home or on the road.  Maybe contrary to intuition, this
does make a difference.  Consider a starter who leaves after pitching
the 7th inning: if he's at home, he's pitched the top of the 7th, so
he gets credit for the runs his team scored in the first 6 innings,
plus the runs they score in the bottom of the 7th; if he's on the
road, however, he pitched the bottom of the 7th, so he gets credit for
the runs his team scored in the first 7 innings, plus the runs they
score in the top of the 8th.  So, all other things being equal, it's
easier for pitchers to get wins (and harder for them to get losses)
when they pitch on the road.  The formula above is for a pitcher
pitching at home, and the road formula is slightly different.  The
real SNW formula does take home/road status into account.
Finally, the above formula doesn't quite reflect the full definition
of a pitcher's win -- a starter can't get the win unless he goes 5
innings or more.  Presumably, this extra condition was put into the
win rule to reduce the number of undeserving starters getting lucky
wins. But when you're assigning fractions of a win, rather than 1 win
or 0 wins, there's no possibility of getting lucky. So, the real SNW
formula does not take the five-inning condition into account,
although, for the purposes of comparison, I do calculate an expected
win (E(W)) number which is equal to 0 if the pitcher goes less than 5
innings and equal to SNW otherwise.
Let's finish off the formula above. PScore is easy to find
recursively, provided you know an average team's single-inning scoring
distribution, PInningScore:
      PScore(i, r) = sum [j = 0 to r] of
                        PInningScore(j) * PScore(i-1, r-j),   i > 1
      PScore(1, r) = PInningScore(r)
where
PInningScore(r) is the probability that an average team will score r
runs in an inning.
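For readers who want to experiment, here is a minimal Python sketch of
the PScore recursion.  The distribution P_INNING is a made-up
placeholder (it is not the distribution taken from the linescores), and
the function names are only illustrative.

```python
from functools import lru_cache

# Hypothetical single-inning scoring distribution (index = runs scored).
# Illustrative numbers only; NOT the distribution used in the paper.
P_INNING = [0.73, 0.15, 0.07, 0.03, 0.02]


def p_inning_score(r):
    """PInningScore(r): chance of scoring exactly r runs in one inning."""
    return P_INNING[r] if 0 <= r < len(P_INNING) else 0.0


@lru_cache(maxsize=None)
def p_score(i, r):
    """PScore(i, r): chance of scoring exactly r runs in i >= 1 innings."""
    if i == 1:
        return p_inning_score(r)
    # Split on the j runs scored in one inning; the other i-1 innings
    # must then account for the remaining r-j runs.
    return sum(p_inning_score(j) * p_score(i - 1, r - j)
               for j in range(r + 1))
```

With these placeholder numbers, p_score(2, 0) is simply 0.73 * 0.73,
the chance of two scoreless innings in a row.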
PHold is a little more complicated, since you have to see to it that
the pitcher's team never relinquishes the lead.  Still, it's not too
hard to reduce it to the following (below, "tr" stands for the number
of runs the pitcher's team scores in an inning, and "or" stands for
the number the opposing team scores in an inning):
     PHold(k, i) = sum [tr = 0 to INFINITY] of
                      sum [or = 0 to k + tr - 1] of
                         PInningScore(tr) * PInningScore(or) * 
                         PHold(k+tr-or, i-1),   i > 0
     PHold(k, 0) = 1
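Putting the pieces together, the simplified SNW formula (home starter,
full innings, no park adjustment) might be sketched in Python as
follows.  This is not the "real" SNW software: the distribution
P_INNING is a made-up placeholder, the infinite sums are truncated at
the largest inning score that distribution allows, and "opp" stands in
for "or" because "or" is a reserved word in Python.

```python
from functools import lru_cache

# Hypothetical single-inning scoring distribution (index = runs scored).
# Illustrative numbers only; NOT the distribution used in the paper.
P_INNING = [0.73, 0.15, 0.07, 0.03, 0.02]
MAX_INNING_RUNS = len(P_INNING) - 1


def p_inning_score(r):
    """PInningScore(r): chance of scoring exactly r runs in one inning."""
    return P_INNING[r] if 0 <= r <= MAX_INNING_RUNS else 0.0


@lru_cache(maxsize=None)
def p_score(i, r):
    """PScore(i, r): chance of scoring exactly r runs in i >= 1 innings."""
    if i == 1:
        return p_inning_score(r)
    return sum(p_inning_score(j) * p_score(i - 1, r - j)
               for j in range(r + 1))


@lru_cache(maxsize=None)
def p_hold(k, i):
    """PHold(k, i): chance of holding a k-run lead (k >= 1) for i innings."""
    if i == 0:
        return 1.0
    total = 0.0
    for tr in range(MAX_INNING_RUNS + 1):
        # Opponents may score at most k + tr - 1, so the lead stays >= 1.
        for opp in range(min(k + tr - 1, MAX_INNING_RUNS) + 1):
            total += (p_inning_score(tr) * p_inning_score(opp)
                      * p_hold(k + tr - opp, i - 1))
    return total


def snw(i, r):
    """Simplified SNW(i, r) for 1 <= i <= 9 full innings and r runs."""
    max_runs = MAX_INNING_RUNS * i  # most runs the offense can score
    return sum(p_score(i, j) * p_hold(j - r, 9 - i)
               for j in range(r + 1, max_runs + 1))


def expected_win(i, r):
    """E(W): zero for starts under 5 innings, SNW otherwise."""
    return 0.0 if i < 5 else snw(i, r)
```

As a sanity check, snw(9, 0) reduces to the chance that the offense
scores at least once in 9 innings, since a complete-game shutout
leaves no lead for anyone else to hold.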
The only remaining unknown is the single-inning scoring distribution,
PInningScore.  But that's readily available from linescores of past
games.  The scoring distribution (separate distributions for each
league) I'm using right now was taken from a few weeks of linescores
in USA TODAY from late-April and early-May of 1992.  I'll probably be
able to get a more accurate distribution someday, but I'm sure that
this one is close enough.
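Tabulating such a distribution from linescores is straightforward.
The sketch below assumes each linescore has already been typed in as a
list of runs per inning for one team; the function name and the sample
data are hypothetical.

```python
from collections import Counter


def inning_distribution(linescores):
    """Return a list whose entry r is the observed frequency of r-run innings.

    `linescores` is a list of lists, one inner list per team per game,
    holding the runs that team scored in each inning it batted.
    """
    counts = Counter(runs for line in linescores for runs in line)
    if not counts:
        raise ValueError("no innings supplied")
    total = sum(counts.values())
    return [counts.get(r, 0) / total for r in range(max(counts) + 1)]


# Two made-up linescores (17 half-innings in all).
sample = [
    [0, 0, 1, 0, 2, 0, 0, 0, 0],
    [0, 3, 0, 0, 0, 1, 0, 0],
]
distribution = inning_distribution(sample)
```

A real table would of course need far more innings than this, and one
table per league, as described above.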
The SNL value for a single start is calculated analogously to SNW. 

Support-Neutral Value Added
---------------------------
SNW and SNL give us a nice way of getting a "fair" W/L record for a
starter, which can then be used to compare to his actual W/L record,
or a replacement-level winning percentage, etc. (see the Results
section).  But these numbers calculate how likely it is that the
pitcher will win or lose the game -- i.e., get the "W" or "L" next to
his name in the box score.  A related but slightly different notion is
the likelihood that the team will win when a pitcher takes the mound.
In measuring the starter's contribution to team victories, we'd like
to evaluate how much the outing by the starter changes the team's
chance of winning from what it was at the beginning of the game (which
I'll assume to be 50%).  This is what SNVA is designed to measure.
Not surprisingly, the formula for SNVA looks pretty similar to the
formula for SNW:
      SNVA(i, r) = -0.5 + 
                   sum [j = 0 to INFINITY] of
                      PScore(i, j) * PATWin(j-r, 9-i)
where
SNVA(i, r) is the difference between an average team's chance of
winning after the starter has left after pitching i innings and giving
up r runs, and their chance of winning at the beginning of the game
(50%).
PScore(i, r) was defined above
PATWin(r, i) is the chance that an average team will eventually win
the game given that there are i innings left and the difference
between their score and their opponents' score is r.
Also not surprisingly, PATWin looks a lot like PHold:
      PATWin(r, i) = sum [tr = 0 to INFINITY] of
                        sum [or = 0 to INFINITY] of
                           PInningScore(tr) * PInningScore(or) *
                           PATWin(r+tr-or, i-1),        i > 0
      PATWin(r, 0) = 1,                                 r > 0
      PATWin(0, 0) = 0.5
      PATWin(r, 0) = 0,                                 r < 0
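A minimal Python sketch of PATWin and the simplified SNVA formula
follows.  As before, P_INNING is a made-up placeholder distribution,
the infinite sums are truncated at the largest inning score it allows,
and "opp" replaces "or" (a Python reserved word).  The recursion is
applied whenever at least one inning remains, so that the i = 0 cases
serve as its base.

```python
from functools import lru_cache

# Hypothetical single-inning scoring distribution (index = runs scored).
# Illustrative numbers only; NOT the distribution used in the paper.
P_INNING = [0.73, 0.15, 0.07, 0.03, 0.02]
MAX_INNING_RUNS = len(P_INNING) - 1


def p_inning_score(r):
    """PInningScore(r): chance of scoring exactly r runs in one inning."""
    return P_INNING[r] if 0 <= r <= MAX_INNING_RUNS else 0.0


@lru_cache(maxsize=None)
def p_score(i, r):
    """PScore(i, r): chance of scoring exactly r runs in i >= 1 innings."""
    if i == 1:
        return p_inning_score(r)
    return sum(p_inning_score(j) * p_score(i - 1, r - j)
               for j in range(r + 1))


@lru_cache(maxsize=None)
def p_at_win(r, i):
    """PATWin(r, i): chance of winning from a margin of r with i innings left."""
    if i == 0:
        if r > 0:
            return 1.0
        return 0.5 if r == 0 else 0.0  # a tie is treated as a coin flip
    return sum(p_inning_score(tr) * p_inning_score(opp)
               * p_at_win(r + tr - opp, i - 1)
               for tr in range(MAX_INNING_RUNS + 1)
               for opp in range(MAX_INNING_RUNS + 1))


def snva(i, r):
    """Simplified SNVA(i, r) for 1 <= i <= 9 full innings and r runs."""
    max_runs = MAX_INNING_RUNS * i  # most runs the offense can score
    return -0.5 + sum(p_score(i, j) * p_at_win(j - r, 9 - i)
                      for j in range(max_runs + 1))
```

Because both teams draw from the same distribution, p_at_win(0, i)
comes out to 0.5 for any i, and snva(i, r) can never fall below -0.5.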
What SNVA gives us (when summed over all a pitcher's starts) is the
number of games in the standings he's worth to his team above the
average starter. Of course, this is exactly the same unit (games above
the average player) that all of Total Baseball's[1] measurements are
in. So it'll be interesting to compare SNVA to Thorn and Palmer's
Adjusted Pitching Runs to see how well they correlate and also where
the differences lie.
Results
-------
Best, worst, luckiest, and unluckiest starters of 1992
------------------------------------------------------
That's enough of the gory details of the calculation of the stats.
Let's look at the fun stuff -- what the stats tell us about real
pitchers.  I tracked all starting pitchers in the majors over the 1992
season, and Tables 1 and 2 show the top pitchers in both leagues for
1992.  Each table shows the pitcher's Support-Neutral Wins (SNW),
Losses (SNL), and Winning Percentage (SNPct), followed by his actual
win-loss record (W, L), his runs allowed per 9 innings (RA), his
Adjusted Pitching Runs(*1) (APR), and his Support-Neutral Value Added
(SNVA).  Interestingly, Greg Maddux, with the fabulous year he had
pitching in Wrigley, was the only pitcher in either league who came
close to "deserving" to win 20 games.


Pitcher       Team    SNW   SNL  SNPct    W  L    RA     APR   SNVA
--------------------------------------------------------------------
Mussina        BAL   17.2   7.8   .688   18  5   2.61   47.0   4.60
Clemens        BOS   17.5   8.5   .674   18 11   2.92   43.8   4.39
Appier         KCR   15.2   6.6   .698   15  8   2.55   42.6   4.08
Guzman,Ju      TOR   13.4   6.4   .679   16  5   2.79   32.3   3.34
Nagy           CLE   16.3   9.9   .623   17 10   3.25   33.3   3.11
Eldred         MIL    8.2   2.4   .776   11  2   1.88   28.1   2.81
McDowell       CHI   16.3  10.7   .602   20 10   3.28   30.5   2.53
Smiley         MIN   16.0  10.5   .603   16  9   3.47   28.3   2.75
Navarro        MIL   15.8  10.8   .595   17 11   3.59   22.8   2.45
Abbott,J       CAL   13.6   8.6   .612    7 15   3.11   27.7   2.36
Viola          BOS   15.8  11.2   .586   13 12   3.74   21.4   2.35
Fleming        SEA   15.1  10.7   .586   17 10   3.73   19.7   2.10
Perez,M        NYY   14.9  10.5   .586   13 16   3.42   26.3   1.90
Wegman         MIL   15.6  11.5   .576   13 14   3.58   24.4   2.06
Erickson       MIN   13.3   9.8   .574   13 12   3.65   20.8   1.75
Bosio          MIL   14.3  11.1   .563   16  6   3.89   13.6   1.52
Key            TOR   13.3  10.4   .561   13 13   3.66   18.0   1.42
Brown,K        TEX   15.5  12.9   .545   21 11   3.96   14.5   1.18
Welch          OAK    8.0   5.8   .580   11  7   3.42    9.7   0.91
Rasmussen      KCR    3.0   0.8   .785    4  1   1.67   11.4   1.09
Table 1: Top 20 AL Starters in 1992, ranked by SNW-SNL




Pitcher       Team    SNW   SNL  SNPct    W  L    RA     APR   SNVA
--------------------------------------------------------------------
Maddux,G       CHI   19.5   7.4   .724   20 11   2.28   53.9   5.75
Tewksbury      STL   16.1   7.3   .687   15  5   2.45   38.5   4.12
Schilling      PHI   13.9   6.8   .670   12  9   2.59   31.1   3.37
Morgan         CHI   16.3   9.5   .632   16  8   3.00   30.4   3.22
Rijo           CIN   13.9   8.1   .632   15 10   2.86   28.5   2.57
Smoltz         ATL   16.6  11.0   .601   15 12   3.28   25.1   2.67
Glavine        ATL   15.1   9.8   .608   20  8   3.24   23.9   2.71
Martinez,D     MON   14.5   9.1   .613   16 11   2.98   24.0   2.50
Swindell       CIN   13.8   8.5   .619   12  7   3.05   24.5   2.56
Swift          SFG   10.4   5.1   .670    9  3   2.36   23.6   2.51
Drabek         PIT   15.9  10.8   .595   15 11   2.95   26.7   2.32
Fernandez,S    NYM   13.6   8.8   .608   14 11   2.81   24.7   2.25
Hill           MON   14.0  10.0   .583   16  9   3.14   19.3   1.93
Leibrandt      ATL   13.5   9.5   .586   15  7   3.68   11.8   2.02
Smith,P        ATL    5.6   2.1   .724    7  0   2.22   16.2   1.69
Wakefield      PIT    6.4   3.3   .656    8  1   2.54   13.8   1.42
Rivera         PHI    6.5   3.7   .639    7  3   2.95   10.8   1.34
Benes          SDP   14.0  11.3   .553   13 14   3.50   10.7   1.22
Portugal       HOU    6.6   4.0   .621    5  3   2.69   12.5   1.18
Avery          ATL   14.0  11.5   .549   11 11   3.66   14.8   1.15
Table 2: Top 20 NL Starters in 1992, ranked by SNW-SNL

On the flip-side, Tables 3 and 4 show the worst(*2) 10 starting pitchers
in 1992 for each league.  Not surprisingly, many of these guys showed
up in different uniforms in 1993, several on expansion teams.

Pitcher       Team    SNW   SNL  SNPct    W  L    RA     APR   SNVA
--------------------------------------------------------------------
Armstrong      CLE    5.2  11.5   .313    3 15   6.37  -28.4  -3.08
Milacki        BAL    4.5   9.5   .320    6  8   6.18  -21.8  -2.32
Terrell        DET    2.9   7.5   .280    3  6   6.98  -22.9  -2.26
Slusarski      OAK    2.5   6.9   .265    5  5   6.25  -18.7  -2.05
Sanderson      NYY    9.9  14.0   .414   12 11   5.40  -22.4  -2.02
Aldred         DET    2.4   6.5   .273    2  7   7.63  -21.7  -1.89
McCaskill      CHI   10.2  14.2   .417   12 13   5.00  -16.1  -1.92
Wells          TOR    3.4   6.9   .332    6  7   7.70  -27.7  -1.81
Stieb          TOR    3.4   6.7   .337    3  6   5.92  -13.3  -1.50
Otto           CLE    3.9   7.2   .354    5  9   6.75  -19.8  -1.57
Table 3: Bottom 10 AL Starters in 1992, ranked by SNW-SNL

Pitcher       Team    SNW   SNL  SNPct    W  L    RA     APR   SNVA
--------------------------------------------------------------------
Bowen          HOU    0.6   6.1   .094    0  7  12.22  -31.3  -2.61
Wilson,T       SFG    7.0  11.3   .384    8 14   4.79  -18.5  -2.03
Abbott,K       PHI    4.5   8.3   .352    1 14   4.92  -11.4  -1.84
Martinez,R     LAD    7.3  11.1   .397    8 11   4.90  -19.1  -1.84
Henry,B        HOU    8.3  11.7   .414    6  9   4.40  -12.4  -1.57
Young,A        NYM    2.8   6.2   .313    1  7   5.79  -16.8  -1.63
Black          SFG    8.6  11.9   .420   10 12   4.47  -14.7  -1.54
Hershiser      LAD   10.2  13.3   .434   10 15   4.31  -12.5  -1.60
Hammond        CIN    7.0  10.0   .409    7 10   4.61   -7.4  -1.36
Blair          HOU    1.4   4.5   .241    1  5   7.51  -16.8  -1.52
Table 4: Bottom 10 NL Starters in 1992, ranked by SNW-SNL

This method also allows you to evaluate the level of luck a pitcher
experienced in his W/L record -- i.e. it allows you to look at how
much a pitcher's actual W/L record differs from his expected W/L
record given the way he pitched.  Tables 5 through 8 show the luckiest
and unluckiest starters in each league in 1992.  No one should be
surprised that Jack Morris, who compiled a 21-6 record despite a 4+
ERA, was far and away the luckiest starter in either league last year.
SNW/L evaluation shows that you'd expect his 1992 performance to
produce a 13-13 mark if he had gotten average support.  Equally
unsurprising is the result that Jim Abbott was the unluckiest pitcher
in either league.  The Angels gave him enough support only for a
miserable 7-15 record, while his pitching actually merited something
closer to 13-9.
Pitcher       Team    E(W)  E(L)    W  L   Diff.
------------------------------------------------
Morris         TOR   13.3  13.1    21  6   14.7
Brown,K        TEX   15.5  12.9    21 11    7.4
Moore          OAK   12.1  14.3    17 12    7.3
Bosio          MIL   14.2  11.1    16  6    6.9
Hibbard        CHI    8.1  11.3    10  7    6.2
Darling        OAK   11.5  12.5    15 10    6.0
Sanderson      NYY    9.7  14.0    12 11    5.3
Wickman        NYY    2.7   2.9     6  1    5.2
Slusarski      OAK    2.3   6.9     5  5    4.6
McDowell       CHI   16.2  10.7    20 10    4.5
Table 5: Luckiest 10 AL Starters in 1992, ranked 
by W-E(W)+E(L)-L

Pitcher       Team    E(W)  E(L)    W  L   Diff.
------------------------------------------------
Abbott,J       CAL   13.3   8.6     7 15  -12.7
Perez,M        NYY   14.6  10.5    13 16   -7.0
Hanson         SEA    9.6  12.8     7 17   -6.8
Armstrong      CLE    5.2  11.5     3 15   -5.8
Wegman         MIL   15.6  11.5    13 14   -5.1
Valera         CAL   10.4   9.6     7 11   -4.8
Kamieniecki    NYY    8.8  12.0     6 14   -4.7
Ryan           TEX    9.3   8.7     5  9   -4.6
Chiamparino    TEX    1.5   1.3     0  4   -4.2
Reed           KCR    5.1   6.0     2  7   -4.1
Table 6: Unluckiest 10 AL Starters in 1992, 
ranked by W-E(W)+E(L)-L

Pitcher       Team    E(W)  E(L)    W  L   Diff.
------------------------------------------------
Burkett        SFG    9.6  13.0    13  9    7.5
Glavine        ATL   15.0   9.8    20  8    6.8
Seminara       SDP    5.1   6.3     9  4    6.2
Lefferts       SDP    8.4  10.4    13  9    6.0
Tomlin         PIT   11.5  11.3    14  9    4.8
Hurst,B        SDP   12.4  12.1    14  9    4.7
Cone           NYM   11.3   9.9    13  7    4.6
Leibrandt      ATL   13.2   9.5    15  7    4.3
Osborne        STL    9.2  11.4    10  8    4.2
Wakefield      PIT    6.3   3.3     8  1    4.0
Table 7: Luckiest 10 NL Starters in 1992, ranked 
by W-E(W)+E(L)-L

Pitcher       Team    E(W)  E(L)    W  L   Diff.
------------------------------------------------
Abbott,K       PHI    4.5   8.3     1 14   -9.2
Candiotti      LAD   11.8  10.5    10 15   -6.3
Gross,Ke       LAD   10.9  10.9     8 13   -5.0
Clark,M        STL    5.4   8.0     3 10   -4.4
Schilling      PHI   13.9   6.8    12  9   -4.0
Benes          SDP   13.8  11.3    13 14   -3.5
Carter         SFG    1.5   2.4     1  5   -3.1
Boskie         CHI    3.2   7.1     3 10   -3.1
Maddux,G       CHI   19.5   7.4    20 11   -3.1
Whitehurst     NYM    2.3   3.3     1  5   -3.0
Table 8: Unluckiest 10 NL Starters in 1992, 
ranked by W-E(W)+E(L)-L
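
The "Diff." column in Tables 5 through 8 is simple arithmetic.  Here is
a small Python sketch of it (the function name is my own label); it
uses the rounded E(W) and E(L) values printed in the tables, so a
result can differ from the printed Diff. by a tenth.

```python
def luck(w, l, exp_w, exp_l):
    """W - E(W) + E(L) - L: positive means luckier than the pitching deserved."""
    return (w - exp_w) + (exp_l - l)


morris = luck(21, 6, 13.3, 13.1)     # Morris, Table 5
abbott_j = luck(7, 15, 13.3, 8.6)    # Abbott,J, Table 6
```

The Morris figure works out to about 14.8 from the rounded inputs
(Table 5 shows 14.7), and the Abbott figure to about -12.7.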


League total numbers
--------------------
In theory, the support-neutral record of the entire league should come
close to the actual win-loss record of the league, and in fact, in
1992, SNW/L did appear to predict league W/L pretty well.  Table 9
shows both the expected and actual W/L totals for each league in 1992.
The National League's record corresponded very well to the record
expected by the model, with no-decisions being underpredicted only
slightly by SNW/L.  The American League is predicted a little less
successfully -- there were nearly 30 more wins in the league than
expected, and nearly 10 fewer losses than expected.  I believe that
part of the discrepancy between expected record and actual record can
be explained by the fact that relief pitchers prevented runs better
than starters in 1992.  Since starters are competing for the (actual)
decision primarily with the other starter, it makes sense that
starters would get a few more (actual) wins than predicted by a model
which has them competing with league average pitching for the
decision.
League    E(W)   E(L)  E(Pct.)    W    L   Pct.
------------------------------------------------
NL       660.9  690.3    .489   655  678   .491
AL       776.1  846.7    .478   805  837   .490
Table 9: Expected and Actual records of all starters in the leagues

Value of "flaky" and "steady" pitchers
--------------------------------------
Do the Support-Neutral stats tell us anything that Thorn and Palmer's
Adjusted Pitching Runs weren't already telling us? Since both APR and
SNVA are trying to measure exactly the same thing (albeit by different
methods), we'd expect there to be a pretty strong correlation between
them.  There is.  For most pitchers, SNVA (whose unit is "games above
average") is approximately equal to one-tenth of APR (whose unit is
"runs above average").  This is what you'd expect given the well-known
result that each 10 runs prevented (or gained) leads on average to
about 1 extra win in the standings (see, e.g., [2]).  However, there
are plenty of cases where APR and SNVA give significantly different
evaluations. Look at the 1992 records of Charlie Leibrandt and Melido
Perez:
            APR    SNVA
-----------------------
Leibrandt  11.8    2.02
Perez,M    26.3    1.90
APR evaluates Perez as being 14.5 runs -- about one-and-a-half
games -- better than Leibrandt.  However, SNVA shows that, when the
pitchers' performance is evaluated game-by-game, Leibrandt was
actually a little better than Perez.
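The comparison can be made explicit by converting APR to games with
the 10-runs-per-win rule of thumb and subtracting.  A small Python
sketch (the function name is my own label):

```python
RUNS_PER_WIN = 10.0  # rule of thumb: about 10 runs per win in the standings


def apr_gap(apr, snva):
    """SNVA minus APR in games; positive means APR undervalues the pitcher."""
    return snva - apr / RUNS_PER_WIN


leibrandt_gap = apr_gap(11.8, 2.02)  # about +0.84 games
perez_gap = apr_gap(26.3, 1.90)      # about -0.73 games
```

By this yardstick APR sells Leibrandt short by most of a game and
overrates Perez by nearly as much.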
The key to this discrepancy between the two measurements is found in
the amount of consistency the two pitchers exhibited in their starts.
Perez was a model of consistency last year; he rarely got bombed, but
he also was rarely dominating.  Leibrandt, on the other hand, was one
of the least consistent pitchers in the majors.  And that is the most
surprising result I've seen so far from these SN stats: run-prevention
stats such as ERA and APR tend to undervalue flaky pitchers, and
overvalue consistent ones, at least when you consider them pitching
for an average team.  Tables 10 through 13 show the "flakiest" (most
inconsistent) and "steadiest" (most consistent) pitchers in the
leagues last year, as evaluated by the variance of the SNVA of their
individual starts.  You can see from those tables that APR pretty
consistently underestimates a pitcher's value when the pitcher is
flaky, and pretty consistently overestimates his value when he's
steady.  9 of the 10 flakiest pitchers in both the NL and AL were
underestimated by APR, and 8 of the 10 steadiest in the NL and 10 of
the 10 steadiest in the AL were overestimated by APR.  And the
pitchers for whom there were really large discrepancies between APR
and SNVA --  Leibrandt, Kyle Abbott, Gooden, Hammond, Sutcliffe,
Perez, Kamieniecki, McDowell --  all showed up near the top of the list
you would predict: the flakiest list for those APR undervalued, the
steadiest list for those it overvalued.
The reason for this undervaluing is that APR counts all runs as equal,
while in fact all runs do not contribute an equal amount toward
winning/losing a game. In particular, Bill James did a study that
showed that runs scored by a team after they've already scored 5 in a
game do not contribute the same amount toward the probability of
winning as those first 5 runs did[3]. So, pitchers who give up more
than 5 runs in a couple of games will be undervalued by ERA and APR,
because those really crummy outings probably weren't quite as crummy
as ERA and APR would have you believe.
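
The flakiness measure used to rank the tables that follow can be
sketched in a few lines of Python.  The per-start SNVA values here are
invented, and the use of the population variance (rather than the
sample variance) is an assumption on my part.

```python
from statistics import pvariance

MIN_STARTS = 15  # qualifying minimum used in the tables


def flakiness(per_start_snva):
    """Variance of a pitcher's per-start SNVA, or None if he doesn't qualify."""
    if len(per_start_snva) < MIN_STARTS:
        return None
    return pvariance(per_start_snva)


# Invented example: 16 starts alternating between very good and very bad.
flaky = flakiness([0.3, -0.3] * 8)
# Invented example: 16 starts that are all slightly above average.
steady = flakiness([0.05] * 16)
```

Both invented pitchers have the same kind of season total one could
imagine from an average-ish starter, but the first one's variance is
0.09 and the second one's is 0.
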
Pitcher       Team    APR   SNVA   SNVA Var.
--------------------------------------------
Smith,Z        PIT    3.7   0.70    0.088
Smoltz         ATL   25.1   2.67    0.083
Saberhagen     NYM    4.5   0.73    0.082
Leibrandt      ATL   11.8   2.02    0.082
Osborne        STL  -12.6  -0.97    0.079
Glavine        ATL   23.9   2.71    0.076
Hurst,B        SDP   -1.5   0.12    0.075
Cone           NYM    8.5   0.67    0.074
Belcher        CIN    1.9   0.53    0.074
Benes          SDP   10.7   1.22    0.068
Table 10: Flakiest 10 NL Starters in 1992, 
ranked by variance of SNVA (15 starts 
minimum)

Pitcher       Team    APR   SNVA   SNVA Var.
--------------------------------------------
Abbott,K       PHI  -11.4  -1.84    0.022
Rijo           CIN   28.5   2.57    0.032
Browning       CIN   -8.7  -1.17    0.035
Gooden         NYM   -6.1  -1.33    0.036
Hammond        CIN   -7.4  -1.36    0.041
Tewksbury      STL   38.5   4.12    0.042
Maddux,G       CHI   53.9   5.75    0.042
Fernandez,S    NYM   24.7   2.25    0.043
Boskie         CHI  -12.5  -1.47    0.044
Gardner        MON   -9.5  -1.20    0.044
Table 11: Steadiest 10 NL Starters in 1992, 
ranked by variance of SNVA (15 starts 
minimum)

Pitcher       Team    APR   SNVA   SNVA Var.
--------------------------------------------
Sutcliffe      BAL   -8.3  -0.33    0.089
Smiley         MIN   28.3   2.75    0.078
Krueger        MIN   -0.1   0.14    0.078
Johnson,R      SEA    1.7   0.24    0.077
Gubicza        KCR    7.3   0.78    0.075
Langston       CAL    5.6   0.77    0.073
Fleming        SEA   19.7   2.10    0.073
Viola          BOS   21.4   2.35    0.073
Rhodes         BAL    6.7   0.77    0.071
Darling        OAK   -4.8  -0.31    0.070
Table 12: Flakiest 10 AL Starters in 1992, 
ranked by variance of SNVA (15 starts 
minimum)
Pitcher       Team    APR   SNVA   SNVA Var.
--------------------------------------------
Armstrong      CLE  -28.4  -3.08    0.034
Darwin         BOS    8.2   0.45    0.036
Milacki        BAL  -21.8  -2.32    0.037
Kamieniecki    NYY   -8.9  -1.57    0.038
Perez,M        NYY   26.3   1.90    0.039
Reed           KCR   -0.3  -0.20    0.040
Appier         KCR   42.6   4.08    0.040
Cook           CLE   -3.6  -0.56    0.042
Hibbard        CHI  -10.7  -1.28    0.045
McDowell       CHI   30.5   2.53    0.045
Table 13: Steadiest 10 AL Starters in 1992, 
ranked by variance of SNVA (15 starts 
minimum)
As an example of this, consider a David Wells outing from 1992: he
gave up 13 runs in 4+ innings.  APR just subtracts his 13 runs from
the number of runs a league average pitcher would have given up in
those same 4 innings (about 2), and concludes that Wells was worth
about -11 runs, or -1.1 games, in that start.  Did Wells really cost
the Blue Jays more than a game in the standings with that awful start?
Of course not.  He guaranteed them a loss, of course, but they had
some chance of losing the game to begin with anyway -- about a 50%
chance if you make the simplifying assumption that they're an average
team.  SNVA gives a far more reasonable value for Wells's start: it
was worth about -0.5 games.  That's as much as a single start can cost
you.  Wells didn't have the requisite 15 starts to show up in Table
12, but you can see from his record in Table 3 how much he was
underestimated by APR.
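
The arithmetic behind that comparison, as a small Python sketch.  The
league average of 4.5 runs per 9 innings is an assumption chosen to
match the "about 2" runs in 4 innings above, and linear_value is only
a stand-in for the linear runs-to-games conversion, not Thorn and
Palmer's actual APR calculation.

```python
SNVA_FLOOR = -0.5  # a start can't cost more than the 50% chance you began with


def linear_value(runs_allowed, innings, league_ra9=4.5, runs_per_win=10.0):
    """Games above average if every run counts equally (linear conversion)."""
    expected_runs = league_ra9 * innings / 9.0
    return (expected_runs - runs_allowed) / runs_per_win


wells_linear = linear_value(13, 4)  # about -1.1 games
```

The linear figure of about -1.1 games is more than twice the largest
loss that SNVA can ever assign to a single start.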

Effect of the park on win probability
-------------------------------------
One other question I've been looking at is how the value of starts is
influenced by park effects.  Figure 1 shows the SNVA for a 9-inning
complete game in both Wrigley Field (an extreme hitters' park) and the
Astrodome (an extreme pitchers' park).  We can see from the figure
that the effect of the park on the value of the start is far less at
the two extremes of start quality than it is for middle-of-the-road
starts.  The difference between Wrigley and the Astrodome for the
value of a 9-inning, 5-run start is about four times as large as the
difference between Wrigley and the Astrodome for the value of a
shutout.
[If this were PostScript, there'd be a graph here]
Figure 1: SNVA for Wrigley Field (top line)
and the Astrodome (bottom line), given that
the starter pitched 9 innings
This would imply that methods of park adjustments which simply
multiply a pitcher's "raw" value by a park factor might be over- or
underestimating the park's actual effect on his value.  Since the
park's effect on very good or very bad starts is much less than on
average starts, a reasonable hypothesis would be that very good or
very bad pitchers deserve less of a boost (or less diminishment) to
their rating than current park adjustment methods give them.
However, the preliminary investigation of this hypothesis I have done
on real starting pitchers (with 1992 data) has failed to find much
support for it.  I'd still like to do some more work on this issue.

Weaknesses of the Approach
--------------------------
Here are a few of the problems with these measurements:
 - They assume that scoring distributions of an inning are independent
   from the distributions of surrounding innings.
 - They (like most other measures of pitching) don't account for
   situational pitching.  A pitcher who gets a big lead is likely to
   start throwing all fastballs, and he may give up a few meaningless
   runs that he wouldn't have given up without the big lead.  I'm not too
   worried about this, because I don't think those big-lead situations
   are common enough to make much of a difference for any pitcher.
 - They don't account for differences in the ways pitchers are used by
   their managers.  Some pitchers get left in the game to get pounded,
   some are routinely yanked early, etc.  Note, however, that SN stats do a
   better job than other methods of mitigating the manager's effect.  If
   Cito Gaston leaves David Wells in the game to give up 13 runs, SNVA
   produces a rating which is not much different from the one it would
   produce if Gaston had yanked Wells after he had given up "only" 7 or
   8 runs.
 - They don't account for the defense playing behind the pitcher.
   Suffice it to say that this is a very hard problem.
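The point about the manager's effect can be illustrated with a rough
sketch in Python.  It is not the SNVA calculation itself: it only
computes the chance that an average offense scores more than a given
number of runs in nine innings, ignoring the bullpen, extra innings,
and park.  The inning distribution below is made up for illustration
(it is not league data), the function names are hypothetical, and the
code leans on the same inning-independence assumption listed above.

```python
# Hypothetical per-inning scoring distribution: index = runs scored in
# the inning, value = probability.  These numbers are invented.
HYPOTHETICAL_INNING_DIST = [0.73, 0.15, 0.07, 0.03, 0.015, 0.005]


def convolve(a, b):
    """Distribution of the sum of two independent run totals."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out


def game_score_distribution(p_inning, innings=9):
    """Distribution of total runs over `innings` independent innings."""
    dist = [1.0]  # before any inning, zero runs with certainty
    for _ in range(innings):
        dist = convolve(dist, p_inning)
    return dist


def prob_outscoring(p_inning, runs_allowed, innings=9):
    """Chance the offense scores strictly more than `runs_allowed`."""
    dist = game_score_distribution(p_inning, innings)
    return sum(dist[runs_allowed + 1:])


if __name__ == "__main__":
    for runs in (3, 4, 7, 8, 13):
        p = prob_outscoring(HYPOTHETICAL_INNING_DIST, runs)
        print(f"{runs:2d} runs allowed: P(offense scores more) = {p:.3f}")
```

Under these assumptions the chance of overcoming 7 or 8 runs is already
small, so there is little left to lose by the time the total reaches 13;
a run given up in a close game moves the number far more.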

Conclusion
----------
I've presented Support-Neutral Wins, Losses, and Value Added, three
park- and league-adjusted measurements of the value of individual
starts, and of starting pitchers.  I feel these are a valuable
addition to existing measurement methods, both because they can
provide a measurement of pitcher worth in units which are familiar to
all baseball fans (pitcher wins and losses) and because they seem to
be a slightly more accurate measure of the true value of a start than
existing methods.
Special thanks to Greg Spira, whose discussion sparked many of the
ideas presented here.  Thanks to David Tate and others on the Internet
newsgroup rec.sport.baseball, who provided valuable feedback on the
method.  And thanks to my wife, Cindy, for reading this paper and
giving me many useful suggestions.

References
----------
[1] Thorn, J. and Palmer, P. (eds.), Total Baseball, 3rd edition,
Harper Collins, New York, 1993.
[2] Thorn, J. and Palmer, P., The Hidden Game of Baseball, Doubleday
Books, New York, 1985.
[3] James, B., The 1986 Bill James Baseball Abstract, Ballantine
Books, New York, 1986, pp. 172-175.

Appendix: Park Effects
----------------------
One possible way of incorporating park effect numbers into these
measurements would be to take whatever final value the above formulas
produce (SNW, SNL, or SNVA) and multiply it by some park effect
constant for the pitcher's home park. This is essentially the approach
Thorn and Palmer use in Total Baseball. But the method of calculating
the Support-Neutral stats allows a potentially more informative use of
park effects. Since park effects (as printed in Elias, e.g.) reflect
how a park inflates or deflates average scoring ability, it makes
sense to have the "average team" playing behind the pitcher affected
by the park, and then calculate the likelihood that the pitcher's
outing plus this park-adjusted average team will lead to a win. So for
any game, the PInningScore (league average scoring) distribution is
adjusted to reflect the park's effect on run scoring. The resulting
number then reflects the park's effect on winning rather than
cumulative run scoring/prevention.
The question then becomes: how do you translate a single park effect
percentage like the ones in Elias (the only source of park effects I
have) into an adjusted PInningScore distribution?  There are an
infinite number of ways to do this.  The way I'm doing it now is to
change the probability of scoring 0 runs by one factor, and change the
probability of scoring i runs for i>=1 by another factor, such that the
total number of expected runs scored in an inning is increased/reduced
by the Elias number.  For example, if the Astrodome decreases scoring
by 10%, I increase PInningScore(0) for the Astrodome by one factor,
and decrease PInningScore(i) for i>=1 by another factor, such that the
expected single-inning score reflected by PInningScore is reduced by
10% from the park-neutral scoring distribution.  If that isn't clear
(and I'm sure it isn't), I should say that I don't think the exact
method used makes much difference.
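A minimal sketch of one reading of this two-factor adjustment, in
Python.  It assumes the second factor applies to every nonzero run
total (i >= 1), so that factor equals the park's run multiplier and the
zero-run factor is whatever keeps the probabilities summing to 1.  The
function names and the sample distribution are hypothetical; the
numbers are invented, not taken from Elias or from league data.

```python
def expected_runs(dist):
    """Expected runs per inning; dist[i] is the chance of scoring i runs."""
    return sum(runs * p for runs, p in enumerate(dist))


def park_adjust(dist, park_factor):
    """Rescale an inning scoring distribution for a park.

    park_factor is the multiplier on expected runs, e.g. 0.90 for a
    park that decreases scoring by 10%.
    """
    p_zero = dist[0]
    p_scoring = sum(dist[1:])
    if p_zero <= 0.0:
        raise ValueError("distribution needs a nonzero chance of 0 runs")
    # Scaling every nonzero outcome by the same factor scales the
    # expected run total by that factor.
    scoring_factor = park_factor
    # The zero-run factor absorbs the leftover probability mass.
    zero_factor = (1.0 - scoring_factor * p_scoring) / p_zero
    if zero_factor < 0.0:
        raise ValueError("park_factor too large for this distribution")
    return [zero_factor * p_zero] + [scoring_factor * p for p in dist[1:]]


# Made-up park-neutral distribution (sums to 1).
neutral = [0.73, 0.15, 0.07, 0.03, 0.015, 0.005]
astrodome = park_adjust(neutral, 0.90)  # assumed 10% reduction

if __name__ == "__main__":
    print("neutral expected runs/inning:", round(expected_runs(neutral), 4))
    print("adjusted expected runs/inning:", round(expected_runs(astrodome), 4))
```

Other ways of spreading the adjustment across the run totals would
satisfy the same two constraints; this is only the simplest one.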


*1 Adjusted Pitching Runs is the basic metric which Thorn and Palmer
(the authors of Total Baseball) use to evaluate pitchers.  APR is the
number of runs a pitcher prevented that a league-average pitcher would
have given up. The APR that I'm using in this paper differs from
Thorn and Palmer's statistic in two ways: 1) I'm using runs where
Thorn and Palmer use earned runs, and 2) the method of park adjustment
I use is a simplification of the one used in Total Baseball.  It is
included here for comparison with SNVA.  
*2 Actually, it's probably inaccurate to use the word "worst" here,
since the method of ranking the pitchers -- ranking them according to
SNW-SNL -- sets the baseline for comparison at league average (anyone
below .500 gets a negative rating).  Of course, it's quite possible
for a below-average pitcher to still be valuable to his team.  A
better method of producing this list might have been to compare a
pitcher's SN record to a lower baseline, e.g., a .450 pitcher.  This
would have left pitchers like Hershiser and McCaskill, who pitched a
lot of innings at somewhat below-league-average performance, off
the lists in favor of other pitchers who pitched fewer innings but at
further-below-average performance.
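
As a rough illustration of the baseline idea in this footnote, the
Python sketch below rates a Support-Neutral record against an
adjustable winning percentage.  The function name and both records are
hypothetical, invented only to show how the ordering can flip.  At a
.500 baseline the rating is (SNW-SNL)/2, which ranks pitchers in the
same order as SNW-SNL.

```python
def wins_above_baseline(snw, snl, baseline=0.500):
    """SN wins beyond what a `baseline` pitcher earns in the same decisions."""
    decisions = snw + snl
    return snw - baseline * decisions


# Invented SN records: a heavy-workload pitcher slightly below average,
# and a light-workload pitcher further below average.
workhorse = (11.0, 14.0)
short_stint = (4.0, 6.5)

if __name__ == "__main__":
    for baseline in (0.500, 0.450):
        a = wins_above_baseline(*workhorse, baseline=baseline)
        b = wins_above_baseline(*short_stint, baseline=baseline)
        print(f"baseline {baseline:.3f}: workhorse {a:+.3f}, short stint {b:+.3f}")
```

With these made-up records the workhorse rates lower at .500 but
higher at .450, which is the reordering the footnote describes.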