Baseball Toaster was unplugged on February 4, 2009.
The Yankees take the field for their first intersquad game of the spring today when they host the Phillies at Legends Field in Tampa. With that, one of this offseason's burgeoning controversies will come to a head. Or rather it should have, but the key players will be on a plane to Arizona to join the USA's entry into the World Baseball Classic.
Still, despite their absence, now that Joe Torre will once again be filling out line-up cards on a daily basis there is sure to be a great deal of debate over the issue of who should bat lead-off once the season starts, Yankee captain and 2005 lead-off man Derek Jeter or the newly acquired Johnny Damon, who repeatedly described himself as the best lead-off hitter in the game after signing with the Yankees in late December. Given the bearded Boston baggage that comes with Damon and the reverence afforded Jeter, as well as the considerable lead-off skills of both men, the debate could get ugly. I'm here to nip it in the bud.
Choosing which players take the field is the most important job any manager has. Productive players can only produce on the field, while a team's 27 outs can disappear in a hurry when a manager calls the wrong number. Having chosen a starting nine, a manager can further distribute playing time within a given game by calling on pinch-hitters, pinch-runners and defensive replacements. Often overlooked, however, is his ability to distribute plate appearances via the batting order.
While there's a great deal of debate over the significance of batting order, one thing that's undeniable is its effect on playing time. Each successive spot in the order will receive approximately 18 fewer plate appearances over the course of a full season than the spot above it. This adds up to a whopping 144 plate appearances between the top and bottom spots, but the difference is largely insignificant when deciding between two consecutive spots. For example, the difference between a line-up with a .400 on-base percentage in the lead-off spot and a .300 OBP in the two-hole and a line-up with those two batters switched in the order is just 1.8 outs over a full season (.100 OBP points * 18 at-bats).
The difference between a line-up that starts Jeter-Damon and one that starts Damon-Jeter is even smaller. By the most basic logic, a line-up that puts Jeter ahead of Damon is a better line-up because of Jeter's reliably superior on-base percentage. However, based on a projection using Jeter's career OBP of .386 (his 2005 mark was .389) and Damon's road OBP from 2005 of .342, the difference between the two line-ups is a grand total of less than 0.8 outs over the course of 162 games. That's zero-point-eight, or a fraction of one out. Bear that in mind the next time you find yourself getting worked up over the top two spots in Torre's batting order.
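The arithmetic behind both figures fits in a few lines. This is only a sketch: the function name `outs_saved` and the constant `PA_GAP_PER_SPOT` are made up for illustration, and the 18-plate-appearance gap is the approximation quoted above, not a measured value.

```python
# Assumption taken from the post: each spot in the order gets roughly
# 18 fewer plate appearances per season than the spot above it.
PA_GAP_PER_SPOT = 18


def outs_saved(obp_higher: float, obp_lower: float, spots_apart: int = 1) -> float:
    """Season outs saved by batting the higher-OBP hitter `spots_apart` slots earlier."""
    extra_plate_appearances = PA_GAP_PER_SPOT * spots_apart
    return (obp_higher - obp_lower) * extra_plate_appearances


# Hypothetical .400 and .300 hitters in the first two spots.
generic_gap = outs_saved(0.400, 0.300)
# Jeter's career OBP against Damon's 2005 road OBP, as projected in the post.
jeter_damon_gap = outs_saved(0.386, 0.342)

print(f"{generic_gap:.3f} outs")      # 1.800 outs
print(f"{jeter_damon_gap:.3f} outs")  # 0.792 outs
```

The `spots_apart` argument is there so the same estimate can be stretched to hitters who are more than one slot apart.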
A lot of the gab about Damon leading off has harked back to the glory days of the late-90s dynasty which featured Chuck Knoblauch batting lead-off ahead of Jeter. It's actually a very apt comparison, as the following list of on-base percentages shows:
1998:
Knoblauch - .361
Jeter - .384
1999:
Knoblauch - .393
Jeter - .438
2000:
Knoblauch - .366
Jeter - .416
In two of those three Championship seasons, Jeter's on-base percentage surpassed Knoblauch's by more than it does Damon's in my above projection. As with Damon-Jeter, arguing for a Jeter-Knoblauch order over a Knoblauch-Jeter order would have been largely pointless as the difference between the two would also have been less than a single out over the entire season. It is interesting to note, however, just how far the Knoblauch-Damon comparison extends.
When the Knoblauch-Jeter tandem was in its full glory, Knoblauch was praised for the number of pitches he saw per plate appearance. The praise hinged on the fact that it was the lead-off man's job to not only get on base, but to give his teammates a good look at the opposing pitcher's repertoire by forcing him to throw all of his pitches in the first at-bat of the game. Indeed, in Knoblauch's four seasons as the Yankee lead-off man, he saw 3.96 pitches per plate appearance (topping out at 4.09 P/PA in 1998). Jeter saw 3.78 P/PA during those four seasons, with a top mark of 3.85 in 1998. Similarly, since leaving the Royals, Damon has seen 3.94 P/PA over five seasons (topping out at 4.13 P/PA in 2003), while Jeter has seen 3.69 P/PA over the same five seasons (topping out at 3.82 P/PA last year).
Finally, Damon is a superior base stealer to the Yankee version of Knoblauch. In four seasons as a Yankee, Knoblauch stole 112 bases in 149 attempts, a 75 percent success rate at a pace of 33.7 bases per 162 games. In his four seasons in Boston, meanwhile, Damon stole fewer bases (98 total, 26.6 per 162 games) but at a significantly higher success rate (82 percent). Indeed, last year, Damon stole just 18 bags, but was caught just once, making him a fantastic weapon on the bases.
With all of that out of the way, I thought it would be fun to toy with the line-up generating engines that our own Ken Arneson discussed in a pair of recent posts over on Catfish Stew. Ken's initial post was inspired by a post by Cyril Morong over at Beyond the Boxscore which deduced that, despite the accepted logic that a line-up provides a manager with a chance to reduce the number of opportunities that his starters with the lowest on-base percentages have to make outs, there are certain spots in the heart of the order in which slugging percentage actually corresponds better to run production than on-base percentage. After running a regression to determine the relative importance of on-base percentage and slugging percentage--the two key components of every hitter's offensive game--to each spot in the order, Morong posted a table, which Ken then incorporated into a Perl script designed to generate the ideal line-up.
Morong's post makes sense on its face. Slugging is most useful when there are runners on base to be driven home, thus it is an attribute that's more desirable in the line-up positions that are most likely to come up with men on base. That's a central tenet of line-up construction that goes back to the sandlots--load 'em up and drive 'em in--and where we get the term "clean-up hitter" from. Judging by the results of Ken's script, however, Morong's numbers produce rather non-traditional line-ups.
To wit, I had Ken run the three Knoblauch-Jeter Championship clubs through his program (after making some tough choices for each year's left fielder and designated hitter, I used each player's actual OBP and SLG numbers from the given season) as well as some projected numbers for the '06 Yankees. This is what he turned up as the ideal line-up for each of the Knoblauch teams:
1998: Raines Williams O'Neill Jeter Brosius Posada Martinez Davis Knoblauch
1999: Williams Jeter Ledee Davis O'Neill Brosius Martinez Knoblauch Posada
2000: Jeter Justice Williams Posada O'Neill Brosius Martinez Ledee Knoblauch
Getting past the complete rejection of the Knoblauch-Jeter tandem, what this tells us is that the biggest slugger on the team should bat second (Bernie in '98, Jeter in '99, Justice in '00) and, more troublingly, the lowest OBP should go in the six hole (Posada in '98, Brosius the other two years).
Let's see what happened when Ken ran my 2006 projections. I gave him two versions of the 2006 line-up, one with Bernie Williams at DH, one with Andy Phillips at DH. For the most part the OBP and SLG figures I used were career numbers. The exceptions are Williams and Sheffield, declining players whose 2005 numbers I rounded up, and Cano and Phillips, who lack a sufficient major league track record. I estimated Cano at a .310 OBP, assuming some natural correction in his batting average will shave ten points off his 2005 OBP, and a .450 SLG, again rounding down just slightly in anticipation of some correction in batting average after his unexpected rookie campaign. For Phillips, I looked at his triple-A stats for the last two seasons and tried to project his OBP and SLG should he hit .260 in the majors rather than .300 in the minors. The result was a .340 OBP and a .480 SLG (do I need to say it?). Here are all of the numbers:
Pos | Name | OBP | SLG |
---|---|---|---|
C | Posada | .375 | .469 |
1B | Giambi | .413 | .539 |
2B | Cano | .310 | .450 |
3B | Rodriguez | .385 | .577 |
SS | Jeter | .386 | .461 |
RF | Sheffield | .380 | .520 |
CF | Damon | .345 | .440 |
LF | Matsui | .370 | .484 |
DH | Williams | .330 | .380 |
DH | Phillips | .340 | .480 |
And here are the line-ups:
With Bernie: Giambi Rodriguez Sheffield Posada Matsui Cano Damon Jeter Williams
With Andy: Giambi Rodriguez Sheffield Posada Matsui Cano Phillips Jeter Damon
Again, the top slugger pops up in the two-hole with the lowest OBP in the sixth spot. Again note the complete rejection of the Damon-Jeter/Jeter-Damon construction.
Thanks to David Pinto, you can now have more fun with this program over at Baseball Musings, while a more traditional line-up generator (which puts the best hitter--by the wildly overrated OPS--third, the next best slugger fourth, the next two best on-base men first and second, then fills out the bottom five in descending order of slugging) can be found here. The latter sounds unnecessary, except that it allows you to choose a line-up from every player in baseball, then base the order on one of two projection systems, all via clickable menus. As for my two 2006 Yankee line-ups (using my projections), the traditional line-up generator would produce these two results:
Jeter Sheffield Rodriguez Giambi Matsui Phillips Posada Cano Damon
Jeter Sheffield Rodriguez Giambi Matsui Posada Cano Damon Williams
Which, by sheer fact of the excellence of the heart of the Yankee order, which is awash in high OBP sluggers, also puts a big-time slugger in the two-hole, an idea which seemed to be gaining popularity during the 2004 League Championship Series when three of the four second-place hitters were Alex Rodriguez, Carlos Beltran, and Larry Walker, but has since been revealed as somewhere between fluke and fad.
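For the curious, the traditional rule as I described it is simple enough to sketch. What follows is my reading of that description, not the generator's actual code; the function `traditional_lineup` and the helper `take` are hypothetical names, and the OBP and SLG figures are my own projections from the table above.

```python
from typing import Callable, Dict, List, Tuple

Stats = Tuple[float, float]  # (OBP, SLG)

# The eight regulars from the projection table; the DH is added per line-up.
CORE: Dict[str, Stats] = {
    "Posada": (0.375, 0.469), "Giambi": (0.413, 0.539),
    "Cano": (0.310, 0.450), "Rodriguez": (0.385, 0.577),
    "Jeter": (0.386, 0.461), "Sheffield": (0.380, 0.520),
    "Damon": (0.345, 0.440), "Matsui": (0.370, 0.484),
}


def traditional_lineup(players: Dict[str, Stats]) -> List[str]:
    """Order nine hitters by the traditional rule described in the post."""
    pool = dict(players)

    def take(key: Callable[[str], float], count: int = 1) -> List[str]:
        # Pick the top `count` remaining hitters by `key` and remove them.
        picked = sorted(pool, key=key, reverse=True)[:count]
        for name in picked:
            del pool[name]
        return picked

    third = take(lambda n: pool[n][0] + pool[n][1])     # best OPS bats third
    fourth = take(lambda n: pool[n][1])                 # next best slugger
    top_two = take(lambda n: pool[n][0], count=2)       # next two best OBPs
    rest = take(lambda n: pool[n][1], count=len(pool))  # descending SLG
    return top_two + third + fourth + rest


with_phillips = traditional_lineup({**CORE, "Phillips": (0.340, 0.480)})
with_williams = traditional_lineup({**CORE, "Williams": (0.330, 0.380)})
print(" ".join(with_phillips))
print(" ".join(with_williams))
```

Run against my projections, this reproduces the two orders listed above, which at least suggests the reading of the rule is a fair one.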
This all brings us back around to the question of Damon. The difference between Jeter-Damon and Damon-Jeter is an easy-to-figure 0.792 outs over 162 games, but the difference between a line-up that begins with that combination and one that pushes Damon below the murderers' row of Sheffield, Rodriguez, Giambi, Matsui and Posada, all of whom can both out-slug Damon and reach base more often than he can, is something else entirely.
To make the math simple, let's assume Rodriguez and Giambi would hit three and four in either line-up, let's then round the on-base percentages of Sheffield, Matsui and Posada all to .380. What we're doing then, in essence, is promoting one of these .380 on-base percentages from the seventh slot up to the two-hole, then demoting Damon in turn, again for simplicity's sake we'll say from second to seventh. Now instead of an 18-at-bat difference, we're talking about a 90-at-bat difference. Swapping out 35 points of on-base percentage over 90 at-bats is a matter of 3.15 outs, or more than a full inning. True, that's just barely over one inning out of 1458 or more, but I guarantee you there will be at least one game this season in which that inning could have provided the Yankees with the opportunity to win a game they will wind up losing. Given the fact that the Yankees only won the division last year because of a tie-breaker, and actually wound up in a three-way tie with the Red Sox and Angels at season's end, one could be excused for believing that the margin for error the team so long enjoyed has evaporated completely, making every inning crucial to their chances of winning the division. Then again, one could make the same point about the 0.792 outs lost by batting Damon ahead of Jeter rather than the reverse. As Brenda Holloway sang, every little bit hurts.
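Here is that back-of-the-envelope calculation spelled out, including how small a share of the season it represents. It is a sketch under the same simplifying assumptions as the paragraph above (an 18-plate-appearance gap per spot, a rounded .380 OBP, Damon at .345); the variable names are mine.

```python
PA_GAP_PER_SPOT = 18   # plate-appearance gap per spot, as assumed in the post
GAMES = 162
INNINGS_PER_GAME = 9
OUTS_PER_INNING = 3

spots_moved = 7 - 2                           # Damon drops from second to seventh
extra_pa = PA_GAP_PER_SPOT * spots_moved      # 90 plate appearances
outs_lost = (0.380 - 0.345) * extra_pa        # rounded .380 OBP vs. Damon's .345
innings_lost = outs_lost / OUTS_PER_INNING
season_innings = GAMES * INNINGS_PER_GAME     # 1458, ignoring extra innings
share_of_season = innings_lost / season_innings

print(f"{outs_lost:.2f} outs")           # 3.15 outs
print(f"{innings_lost:.2f} innings")     # 1.05 innings
print(f"{share_of_season:.3%} of {season_innings} innings")  # 0.072% of 1458 innings
```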
The truth of the matter, however, is that the Yankees' season will be decided by much larger swings of fortune and misfortune than the less than 1/1000th of their outs lost to less-than-ideal line-up construction. So bicker and banter all you want about who bats where, but bear in mind, given the uniform excellence of the players being discussed, how little is truly at stake.
This isn't actually true given the statistics.
http://www.hardballtimes.com/main/article/constructing-lineups/
For those uninclined to click the link, here is an excerpt:
Here's the most important strategic guideline for lineup construction:
The Book Says:
Your three best hitters should bat somewhere in the #1, #2 and #4 slots. Your fourth- and fifth-best hitters should occupy the #3 and #5 slots. The #1 and #2 slots will have players with more walks than those in the #4 and #5 slots. From slot #6 through #9, put the players in descending order of quality.
I'd add that "The Book" also says there is some value to putting your best basestealer in the #6 hole. A stolen base is more valuable ahead of singles hitters than ahead of sluggers.
So based on that, I'd guess the Yankees lineup by "The Book" would go:
Sheffield
ARod
Jeter
Giambi
Matsui
Damon
Posada
Phillips
Cano
Rilke, I say that 82 percent success on the bases is "significantly" higher than 75 because 75 percent is just about the break-even point at which a player's caught stealings hurt his team exactly as much as his stolen bases help; thus Knoblauch's stealing was of marginal if any use, whereas Damon's stealing is actually quite helpful.
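To put rough numbers on that break-even argument: if 75 percent is the break-even rate, each caught stealing wipes out three stolen bases. The sketch below uses a made-up helper, `net_steals`, which is only an illustration of that logic and not an established metric. It is applied to the totals from the post: Knoblauch's 112 steals in 149 attempts, and Damon's 18 steals against one caught stealing last year.

```python
BREAK_EVEN = 0.75  # break-even success rate cited in the comment above


def net_steals(stolen: float, caught: float, break_even: float = BREAK_EVEN) -> float:
    """Stolen bases left over after paying for the caught stealings.

    At the break-even rate, one caught stealing cancels
    break_even / (1 - break_even) stolen bases (three, at 75 percent).
    """
    cost_per_caught = break_even / (1 - break_even)
    return stolen - caught * cost_per_caught


knoblauch_yankees = net_steals(112, 149 - 112)  # four seasons as a Yankee
damon_2005 = net_steals(18, 1)                  # Damon's 2005 season

print(knoblauch_yankees, damon_2005)  # 1.0 15.0
```

By this crude measure Knoblauch's four years of running netted about one steal, while Damon's single 2005 season netted fifteen.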
Again...great job w/ your analysis.
In real life, players have feelings. If it was obvious that Barry Bonds (it is to me) batting second would help the team win more games then everyone would bat their stud second.
The real question is, how often does this prevent teams from doing well? In the National League, I would guess somewhat often because of their insistence on playing a guy who can handle the bat (read: can't hit, probably doesn't walk) in the 2 hole. There is often a big difference between this player and the best hitter. Not to mention all the Tony Womacks of the league who are ill-placed in the lead-off spot.
I think a lot of us had the idea that the batting order didn't really matter, especially regarding Damon and Jeter. It's fascinating to read the analysis as to why this gut reaction is so.
I think 8. has a good point in that these guys are accustomed to batting in certain slots and that plays a role. Even if it makes very clear sense to bat Damon 6th, to do so would be a big distraction. Ego, convention, and media hunger all play a role in ignoring statistical facts. Oh well, it would be cool to see a team really go with an unconventional lineup for a whole year.
maybe the Royals can do it?
I read Cliff as follows:
A is significantly greater than C (Stolen base = runs scored percentage = ~72%)
B is not significantly greater than C.
A is a significantly better base stealer.
I can buy that logic - even if the numbers aren't there to formally test the variance.
Somehow, I think the season is much more dependent on the health and effectiveness of the starting 5 than upon Torre choosing between Damon, Jeter and Jeter, Damon.
Crud. I'm not going to pay $ to watch freaking preseason games. Dumb move MLB; we are starved for baseball, give us these exhibition games for free. And put them on tv for free, you might have someone wander by and decide to watch and become a fan.
Anyone with a picture (mlb.tv or YES for that matter) have a further update on what's going on in the game? Did Chacon pitch well? (Not that ST stats really mean anything . . .)
Torre has Myers pitching to a WHOLE bunch of righties at the top of the 7th... Guess what happened. He gave up two runs.. Man on 2nd, 2 outs at the moment.
Don't be silly - sample sizes are still important. Damon and CK - that's a fair comparison.
and thanks Justin. Hopefully Torre scratched that Myers itch and has learned. The other hemisphere tells me we're going to see this movie again (and again).
I wasn't distorting anyone's argument, knowingly or otherwise, thank you very much - I take it you still don't get the point that sample size really matters. I did cite bad info on Giambi's baserunning - my bad for misreading the ESPN link I looked at (and not being skeptical since 0 pick-offs, blown hit/runs, whatever is entirely implausible). (They seem to think he's 13/23, probably what you meant to write.) Just substitute Phillips in the argument.
Of course you realize we'll see the Statue of Liberty put down her torch, and do the backstroke to Hoboken before we see Joe Torre submit a lineup that doesn't begin Damon-Jeter, or Jeter-Damon.
Great to see the action on YES today, despite some horrific pitching.
Cliff's guy Phillips went 2-2 with a HR. His campaign is off to a good start.
Damon went 2-2 in his first ABs for the NY Good Guys.
Chacon got off to a nice start.
Cano looks bigger, got dirty making a good grab.
Myers performance might have delayed Al Leiter's flight to Team USA.
Regarding the Myers facing righties issue: The pitching order for games this early in spring training is set up before the game, is it not? That is to say, Myers was coming into the 7th regardless of the hitters' dominant handedness.
It was nice to see JB Cox pitch, by the way. Other than the REALLY high pitch that was hit for a HR, his other stuff was down with lotsa movement.
Say you have an 80% chance of doing something, in the sense that if you try it a zillion times you'll succeed 0.8 zillion times. So now you try 100 times. Will you succeed 80 times? No, you'll succeed 70 times, or 83, or ... Say it's 70. You don't have the luxury of trying a zillion times, and a scout says Cliff's rate is 70%.
Alex has a rate of 65%, but he has a good season by chance and succeeds in 77 attempts out of 100. The scout says Alex is better than you. I pop up and say, no, given the statistics you can only distinguish at around the 20% level. To figure out an estimate for the statistical variation (i.e., the variation not due to Varitek's vs Posada's arm, or better groundskeepers in the Bronx, or Torre sending Jeter in unfavorable counts), you might take the sqrt(n_failures)/n_failures here. Thus Damon's failure rate is known to just 25%, and he may really have a 78% success rate. Knoblauch got thrown out about 36 times, sqrt of which is 6, so his rate may have been 118/149 or 79%. Ok, really one adds the variances in quadrature when comparing the rates but anyway, my point is that 100 steal attempts is not a lot when you're trying to separate 82% from 75%.
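A quick way to check that last point is to add the variances in quadrature, as mentioned. The sketch below swaps the sqrt-of-failures shortcut for the textbook binomial standard error with a normal approximation, which is an assumption on my part; the function names are hypothetical, and the 100 attempts apiece are the round number used above, not either player's real total.

```python
import math


def rate_std_error(rate: float, attempts: int) -> float:
    """Binomial standard error of an observed success rate."""
    return math.sqrt(rate * (1 - rate) / attempts)


def gap_in_sigmas(rate_a: float, n_a: int, rate_b: float, n_b: int) -> float:
    """Difference between two rates, in units of the combined standard error."""
    combined = math.sqrt(
        rate_std_error(rate_a, n_a) ** 2 + rate_std_error(rate_b, n_b) ** 2
    )
    return (rate_a - rate_b) / combined


# 82% vs. 75% with a hypothetical 100 attempts for each runner.
z_100 = gap_in_sigmas(0.82, 100, 0.75, 100)
print(f"{z_100:.2f} standard errors")  # roughly 1.2
```

A gap of about 1.2 standard errors falls short of the usual two-sigma threshold, which is consistent with the claim that 100 attempts cannot cleanly separate the two rates.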
As for Myers, I'm guessing that he's just getting his work in, like everyone else, and we won't see him pitch to righties once the season begins. Having seen what Myers did for the Sox the last 2 years, I'm pretty sure Torre knows how he should be used. After all, spring training would probably be pretty useless for Myers if he only throws when lefties hit. Have faith!
Definitely not giving up on Myers, Shaun, but he was awful today.
Didn't see a snowflake at all yesterday. Sounds like most of the storm was in your neck of the woods, Shaun.
Of course, watching spring training on my laptop at work certainly made it feel just a little warmer.
I know I'm gonna' get jumped on for saying this, but nonsense like these Batting Order generators just serves to prove why relying 100% on statistics is an inferior way to run a ballclub.
I don't care what the "theory of the day" is, it always ends up getting disproved later on. Seven years ago, OPS was the be-all, end-all batting measurement in every stat guy's mind...now the same people tell us it's totally flawed. No duh.
I don't need a statistical analysis to tell me that putting A-Rod in the 2 slot during the playoffs last year was a terrible mistake. I watched the damn games! He saw far fewer good pitches to hit in that spot, and that's why he drew so many walks in key situations -- taking a walk was actually the best thing for him to do when they wouldn't pitch to him.
Without even thinking about this "best hitter in the 2 slot" theory, I can point out flaws in it:
1) Assuming Jeter is batting #1, in 2005 he had 202 hits and 77 walks...279 times he got on base (not counting errors and FCs). 40 times he had a double or triple, and 14 times he stole a base. So at BEST, (because some steals were of third after a double) he was on first base 81% of the time, second or third 19%. By this theory, I should now send up my best RBI man when the leadoff guy isn't even in scoring position? Really? You might convince me of that if the slugger hits a ton of HRs but has a low BA...maybe. But then of course, he'll probably hit into a lot of DPs, and I certainly can't hit and run with him since he probably Ks like 150 times a season.
2) The theory assumes that your lineup decisions occur in a vacuum. If you put your #1 OBP guy first, and your best slugger #2, do you really think I'm going to give him something he can jack with your guy on 1st? With a lesser power / RBI guy coming up in the 3 and 4 slots? If we're talking Yanks or Sox, (with those killer back-to-back RBI guys) you might have a chance. Most teams, you just wasted your RBI man. The opposing manager's strategy will change in response to your move.
3) As someone else pointed out, players have personalities, quirks and egos. After three+ decades we have just gotten to the point where computer AI can pretty consistently beat a Grand Master in chess. Fortunately for the computer, the pieces on the board can be counted on to perform under very exact rules, the exact same way every time. They never slump, and they never get pissed off. Even so the Grand Master beats them a lot. Give the pieces personalities and we'll see how effective the computer is.
Sorry for the long rant, but I think that lineup generator is total BS. LOL