by Keith Woolner
Marginal Lineup Value (or MLV) is a method of estimating the effect a batter's offensive performance has on his team's run scoring, compared to some baseline. It was developed by David Tate in 1992 and posted to rec.sport.baseball. The following description and explanation of MLV was written and edited by Keith Woolner, from David's original postings, and appears with David's permission. The extensions of MLV to positional averages and replacement level were written by Keith Woolner.
Most sabermetric measures of offensive performance look at individual performance to estimate a player's value, often extrapolating measures that correlate with run scoring at the team level to measure individual performance out of context. This can lead to discrepancies between actual contribution and the model's estimate, particularly since the range of ability and rates of production among individual players is much wider than among entire teams.
In the case of Bill James' popular measure, Runs Created (RC), the basic RC function (Runs ~= OBP * TB) was developed and validated against historical team data, where it provides a good relationship between a team's components of offense (reaching base, via OBP; and power, via Total Bases or SLG) and the team's overall run scoring.
However, when you try to isolate a single player's contribution in this fashion (that is, estimating a player's value by his own OBP*TB), two problems arise. First, you exceed the range of inputs the model was validated against (e.g. Frank Thomas may have a .487 OBP and a .729 SLG, but no team in history has come anywhere near that level). Second, you end up with a measure that, when a player's RC is added to all his teammates' individual RC, doesn't add up to team run scoring.
The reason for the discrepancy is really quite simple. In the team RC relationship, the OBP and TB (or SLG, for a rate of production) represent, roughly, a team's overall ability to create baserunners (OBP) and advance them along towards scoring (SLG). Similarly, individual RC combines a player's own ability to become a baserunner (his own OBP) and his ability to advance baserunners (his own SLG). However, when you consider a single player's contribution in a team context, the batter is almost always advancing runners other than himself, and is being driven in by a batter other than himself. Home runs are an exception to this, of course, as he both reaches base and scores by himself, but otherwise he is depending to some degree on the contribution of his teammates to either set the table for him (by reaching base ahead of him) or to finish the job (by knocking him in). The player's contribution to scoring depends on his team's (or, to generalize to team-independency, an average team's) ability to capitalize on the production the player brings to the plate. RC, then, overestimates the value of high-OBP, high-SLG batters (like Thomas) by giving them "double credit" for "driving themselves in".
Another way to think about this is that the individual RC measure, in effect, projects how many runs a lineup of 9 player clones (for example, 9 Frank Thomases) would score, whereas more realistically, you want to know how many more runs an average team would score by replacing one of the average players with Frank Thomas.
How, then, do we model Thomas's impact on the entire team's run scoring? This is the question MLV was designed to answer.
Marginal Lineup Value (MLV) follows as the mathematical derivation of the premise(s):
Let's assume for a moment that the player in question plays every single game during the season, and accumulates exactly 1/9th of the team's plate appearances (this is roughly equivalent to saying that the player is hitting fifth, although we aren't explicitly modelling batting order effects here). We'll also assume that the rest of the team is composed entirely of league average hitters.
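In the following derivation, the prefix P_ denotes the player's own statistics, T_ denotes the team without the player, and TP_ denotes the team with the player in the lineup. We also assume a 162-game season and 25.5 outs per game (due to the lack of 9th innings when the home team is ahead, outs on base that don't get recorded elsewhere, etc.).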
To compute MLV, we first need to estimate how many runs the team would score without him in the lineup (and a league average hitter in his place). The league averages are denoted with the prefix L_ (e.g. L_AVG is the league's batting average).
Description    Denoted by    Computed as
Team AVG       T_AVG         = L_AVG
Team OBP       T_OBP         = L_OBP
Team SLG       T_SLG         = L_SLG
Team PA's      T_PA          = 162*25.5/(1-T_OBP)
Team AB's      T_AB          = T_PA*(1-T_OBP)/(1-T_AVG)
Team Runs      T_RUNS        = T_OBP*T_SLG*T_AB
The final line in this derivation, T_RUNS = T_OBP * T_SLG * T_AB, is just a restatement of the team Runs Created relationship, R = OBP * TB, because TB = SLG * AB.
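As a minimal sketch, the baseline calculation in the table above can be written out as follows. The function name and the league line of .270/.340/.420 are illustrative assumptions, not figures from the original description.

```python
GAMES = 162            # season length assumed in the text
OUTS_PER_GAME = 25.5   # outs per game assumed in the text


def baseline_team_runs(l_avg, l_obp, l_slg,
                       games=GAMES, outs_per_game=OUTS_PER_GAME):
    """Estimate runs scored by a lineup of nine league-average hitters."""
    t_avg, t_obp, t_slg = l_avg, l_obp, l_slg
    # Every plate appearance that is not a time on base uses up an out.
    t_pa = games * outs_per_game / (1 - t_obp)
    # Convert plate appearances to at bats via the out rate per at bat.
    t_ab = t_pa * (1 - t_obp) / (1 - t_avg)
    # Team Runs Created: OBP * TB, with TB = SLG * AB.
    return t_obp * t_slg * t_ab


# Hypothetical league averages, chosen only for illustration.
print(round(baseline_team_runs(0.270, 0.340, 0.420), 1))
```

Note that T_PA*(1-T_OBP) is simply the season's total outs, so T_AB depends only on the league AVG and the number of outs.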
Now that we know how many runs the team would score without the player, we need to estimate how many runs it would score with the player in the lineup full-time. We do this by computing the change in the team's AVG, OBP, SLG, and PA's.
Description              Denoted by    Computed as
Team OBP                 TP_OBP        = (8*L_OBP + P_OBP)/9
Team PA's                TP_PA         = 162*25.5/(1-TP_OBP)
PA's per player          TP_PA_EACH    = TP_PA/9
AB's per other player    TP_AB_EACH    = TP_PA_EACH*(1-L_OBP)/(1-L_AVG)
AB's for our player      TP_AB_HIS     = TP_PA_EACH*(1-P_OBP)/(1-P_AVG)
Team AB's                TP_AB         = 8*TP_AB_EACH + TP_AB_HIS
Team SLG                 TP_SLG        = (8*L_SLG*TP_AB_EACH + P_SLG*TP_AB_HIS) / TP_AB
Team Runs                TP_RUNS       = TP_OBP*TP_SLG*TP_AB
The formula is more complicated than the "without" case because the addition of the player changes not only the team's overall OBP and SLG, but also, indirectly, each team member's total plate appearances (PA). This is one of the great benefits of a high OBP: the ability not only to reach base, but to give your teammates additional opportunities to come to the plate.
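The "with the player" table can be sketched the same way. The function name is an assumption made for this example, and the league averages are hypothetical. Thomas's .487 OBP and .729 SLG come from the text above, while his .353 AVG is supplied here as an assumed value.

```python
def team_runs_with_player(p_avg, p_obp, p_slg, l_avg, l_obp, l_slg,
                          games=162, outs_per_game=25.5):
    """Estimate runs for eight league-average hitters plus one player."""
    tp_obp = (8 * l_obp + p_obp) / 9
    # A higher team OBP stretches the same number of outs over more PA's.
    tp_pa = games * outs_per_game / (1 - tp_obp)
    tp_pa_each = tp_pa / 9
    tp_ab_each = tp_pa_each * (1 - l_obp) / (1 - l_avg)   # each teammate
    tp_ab_his = tp_pa_each * (1 - p_obp) / (1 - p_avg)    # our player
    tp_ab = 8 * tp_ab_each + tp_ab_his
    # Team SLG is the at-bat-weighted average of the nine hitters.
    tp_slg = (8 * l_slg * tp_ab_each + p_slg * tp_ab_his) / tp_ab
    return tp_obp * tp_slg * tp_ab


# Hypothetical league line; AVG of .353 for Thomas is an assumption.
league = (0.270, 0.340, 0.420)
print(round(team_runs_with_player(0.353, 0.487, 0.729, *league), 1))
print(round(team_runs_with_player(*league, *league), 1))  # all-average lineup
```

Passing the league averages as the player's own line reproduces the baseline team, which is a quick consistency check on the two tables.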
Description              Denoted by    Computed as
Marginal Value/Season    MLV_FULL      = TP_RUNS - T_RUNS
MLV_FULL is the number of runs the player would add to the team's total if he played every game for a full season. It is a rate stat, similar to AVG, OBP, or SLG, in that it tells you only the player's production over some standard amount of time. It tells you nothing about how much playing time the player actually logged (just as batting average doesn't tell you how many hits the player had). However, in the same way we can say that a .350 hitter is better than a .250 hitter, we might say that a +50 MLV hitter is more productive than a -15 MLV hitter.
Just in case anyone's curious what the whole formula for MLV_FULL looks like, it's a function of 8 variables (the player's AVG, OBP, and SLG; the league's AVG, OBP, and SLG, and the number of games in the season and number of outs per game). There are 6 variables, if you hold the games and outs/game fixed (at, say, 162 and 25.5):
MLV_FULL =GAMES*OUTS*(1/9 * (8*L_OBP+P_OBP) / (9-8*L_OBP-P_OBP) *((8*L_SLG*(1-L_OBP))/(1-L_AVG) + P_SLG*(1-P_OBP)/(1-P_AVG)) - L_OBP*L_SLG/(1-L_AVG))
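One way to check the closed-form expression is to compare it against the step-by-step tables. The sketch below does that. The function names and league averages are assumptions for illustration, as is the .353 AVG used for Thomas.

```python
def mlv_full_stepwise(p_avg, p_obp, p_slg, l_avg, l_obp, l_slg,
                      games=162, outs_per_game=25.5):
    """MLV_FULL computed table by table: TP_RUNS - T_RUNS."""
    season_outs = games * outs_per_game
    # Baseline team: T_AB = T_PA*(1-T_OBP)/(1-T_AVG) = outs/(1-L_AVG).
    t_runs = l_obp * l_slg * season_outs / (1 - l_avg)
    # Team with the player in one of the nine lineup slots.
    tp_obp = (8 * l_obp + p_obp) / 9
    tp_pa_each = season_outs / (1 - tp_obp) / 9
    tp_ab_each = tp_pa_each * (1 - l_obp) / (1 - l_avg)
    tp_ab_his = tp_pa_each * (1 - p_obp) / (1 - p_avg)
    tp_total_bases = 8 * l_slg * tp_ab_each + p_slg * tp_ab_his  # TP_SLG*TP_AB
    return tp_obp * tp_total_bases - t_runs


def mlv_full_closed(p_avg, p_obp, p_slg, l_avg, l_obp, l_slg,
                    games=162, outs=25.5):
    """MLV_FULL from the single closed-form expression in the text."""
    return games * outs * (
        1 / 9 * (8 * l_obp + p_obp) / (9 - 8 * l_obp - p_obp)
        * (8 * l_slg * (1 - l_obp) / (1 - l_avg)
           + p_slg * (1 - p_obp) / (1 - p_avg))
        - l_obp * l_slg / (1 - l_avg)
    )


league = (0.270, 0.340, 0.420)      # hypothetical league averages
thomas = (0.353, 0.487, 0.729)      # AVG is an assumed value
print(mlv_full_stepwise(*thomas, *league))
print(mlv_full_closed(*thomas, *league))
```

The two functions should agree up to floating-point rounding, and both should return approximately zero for a league-average hitter.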
Now that we have a method for determining the rate of a player's offensive contribution, the next step is to determine the amount of a player's actual contribution based on his playing time. We look at the player's plate appearances. A first approximation might be to look at TP_PA_EACH in the calculations above, and prorate MLV_FULL by PA/TP_PA_EACH. However, that isn't quite right, as a player's actual PA's are determined in part by the OBP of his teammates. In other words, Kenny Lofton gets a lot more PA's on the Indians than if he were on the Twins, simply because his Indian teammates give him more opportunities to bat due to their own OBP, than if Lofton played for a lesser team. A better way to estimate the player's contributions is to give him the same percentage of PA's on the baseline team as he had with his actual team. In other words:
PA% = Player's PA / Team's PA
Since MLV implicitly assumes that a player gets 1/9th of all PA's (11.1%), prorating MLV_FULL by PA% / 11.1% will scale the full-season rate to the player's actual share of playing time.
One slight problem with this is that a leadoff hitter will get a higher percentage of team PA's than the #9 hitter will, even if each plays every inning of every game. This method does not attempt to adjust for differences in PA's resulting from lineup position.
Alternatively, we could determine the MLV_FULL rate on a per-game basis (by dividing by 162), and multiply it by the number of games the player actually played.
This gets around the lineup position problems, but creates a problem for defensive replacements or players who frequently play less than a full game, as it does not reflect the amount of time per game that the player got. It is the author's belief that the best way to adjust for playing time is to use the PA% method.
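The two playing-time adjustments described above can be sketched side by side. The function names and the sample figures (a +45 hitter with 540 of his team's 6,210 PA's over 135 games) are hypothetical and serve only to show the arithmetic.

```python
FULL_SHARE = 1 / 9   # MLV_FULL assumes the player takes 1/9th of team PA's


def prorate_by_pa_share(mlv_full, player_pa, team_pa):
    """Scale MLV_FULL by the player's share of his actual team's PA's."""
    pa_share = player_pa / team_pa          # PA% in the text
    return mlv_full * pa_share / FULL_SHARE


def prorate_by_games(mlv_full, games_played, season_games=162):
    """Scale MLV_FULL by games played, ignoring partial games."""
    return mlv_full / season_games * games_played


# Hypothetical player, for illustration only.
print(prorate_by_pa_share(45.0, 540, 6210))
print(prorate_by_games(45.0, 135))
```

The per-game version credits a late-inning defensive replacement with a full game, which is why the PA% version is preferred in the text.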
MLV, as described above, compares a player to a league average hitter. However, the distribution of hitting talent is not equal throughout all positions. Due to the defensive difficulty of shortstop, for example, there are fewer good hitters capable of meeting the fielding demands of the position. As we move closer to a complete measure of a player's value, we need to understand and correct for the systematic differences between positions.
As a first approximation, we might want to know how much better a hitter a player is, compared not to a league average hitter, but to an average player for his primary position. Once the positional average AVG, OBP, and SLG are known, we can use the MLV framework to estimate how many runs a player contributed with his bat above what an average player at his position would have done by computing two MLV's, and taking their difference. Note that we still need the overall league hitting averages.
POSMLV = MLV(player) - MLV(average position player in same PA%)
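A minimal sketch of POSMLV follows, assuming both MLV's are prorated by the same PA% as described earlier. The function names, the league line, the positional averages, and the sample shortstop are all hypothetical.

```python
def mlv_full(p_avg, p_obp, p_slg, l_avg, l_obp, l_slg, games=162, outs=25.5):
    """Full-season MLV against a lineup of league-average hitters."""
    with_player = (
        (8 * l_obp + p_obp) / 9 / (9 - 8 * l_obp - p_obp)
        * (8 * l_slg * (1 - l_obp) / (1 - l_avg)
           + p_slg * (1 - p_obp) / (1 - p_avg))
    )
    without_player = l_obp * l_slg / (1 - l_avg)
    return games * outs * (with_player - without_player)


def posmlv(player, position_avg, league, pa_share):
    """MLV(player) - MLV(average hitter at his position), same PA%.

    player, position_avg and league are (AVG, OBP, SLG) tuples.
    Both MLV's are still measured against the overall league averages.
    """
    scale = pa_share / (1 / 9)
    return scale * (mlv_full(*player, *league)
                    - mlv_full(*position_avg, *league))


league = (0.270, 0.340, 0.420)          # hypothetical league averages
shortstops = (0.260, 0.320, 0.380)      # hypothetical positional averages
player = (0.270, 0.340, 0.420)          # a league-average-hitting shortstop
print(posmlv(player, shortstops, league, pa_share=0.10))
```

In this example the shortstop has an MLV of zero against the league, yet a positive POSMLV, because the average hitter at his position is below league average.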
One problem with comparing a player to league average is that it assumes that a league average performance has a value of zero. In reality, league average players are quite valuable, as they are somewhat difficult to find. A full discussion of the development of replacement level theory is beyond the scope of this description; however, suffice it to say that baseball talent is not normally distributed at the highest levels of competition. There are far more scrub-level players available than there are average players, just as there are far more average players than there are stars. At some low level of talent, a team's management can find as many players as they wish at minimum cost. A regular player's value is what he brings to the team beyond what they could find cheaply in the minors or on the waiver wire. That low level of performance that is easily available is called replacement level.
Determining replacement level is not straightforward, and there are many competing ideas about where it should be set. This author's idea is to measure the performance of the league with all of the regular players (starters) removed, essentially measuring the aggregate performance of backups, injury replacements, undeveloped talent, and journeymen players. The difference between the backups' performance and the league average determines replacement level. A study done by the author of 1994 league averages suggests that backups perform at a level about 70 points of OPS (OBP+SLG) below league average, implying a replacement level 35 points below both league (or positional) average OBP and SLG (and AVG, for completeness and consistency).
However you choose to set replacement level, in order to use it in an MLV framework, you need to convert it to some level of AVG, OBP, and SLG for the position. Once you've set the replacement's rates of production, computing a player's value above replacement level is similar to computing value above positional level.
RPOSMLV = MLV(player) - MLV(replacement level position player in same PA%)
where RPOSMLV stands for Replacement-level, POSition adjusted, Marginal Lineup Value.
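As a sketch of RPOSMLV under the 35-point rule of thumb described above, the replacement line can be derived from the positional averages and fed through the same MLV machinery. The function names and all sample figures are hypothetical, and any other choice of replacement level can be substituted for the 35-point drop.

```python
REPLACEMENT_DROP = 0.035   # 35 points off AVG, OBP and SLG, per the 1994 study


def mlv_full(p_avg, p_obp, p_slg, l_avg, l_obp, l_slg, games=162, outs=25.5):
    """Full-season MLV against a lineup of league-average hitters."""
    with_player = (
        (8 * l_obp + p_obp) / 9 / (9 - 8 * l_obp - p_obp)
        * (8 * l_slg * (1 - l_obp) / (1 - l_avg)
           + p_slg * (1 - p_obp) / (1 - p_avg))
    )
    return games * outs * (with_player - l_obp * l_slg / (1 - l_avg))


def replacement_line(position_avg, drop=REPLACEMENT_DROP):
    """Replacement-level (AVG, OBP, SLG) for a position."""
    return tuple(rate - drop for rate in position_avg)


def rposmlv(player, position_avg, league, pa_share):
    """MLV(player) - MLV(replacement-level position player), same PA%."""
    scale = pa_share / (1 / 9)
    replacement = replacement_line(position_avg)
    return scale * (mlv_full(*player, *league)
                    - mlv_full(*replacement, *league))


league = (0.270, 0.340, 0.420)       # hypothetical league averages
shortstops = (0.260, 0.320, 0.380)   # hypothetical positional averages
# Value of an exactly average-hitting shortstop playing full time.
print(rposmlv(shortstops, shortstops, league, pa_share=1 / 9))
```

Unlike POSMLV, this gives an average player at the position a positive value, reflecting the point that average players are not freely available.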
Copyright 1997-2001 by Keith Woolner. All included authors retain the copyrights to their original works.