My writings about baseball, with a strong statistical & machine learning slant.
Tuesday, November 16, 2010
Why R makes more sense than R^2 (in looking at correlations in baseball)
Friday, September 10, 2010
Tim Lincecum: home runs, strikeouts and fastball speed
By far the biggest difference between 2008-2009 and 2010 for Lincecum has been his home runs given up. He gave up home runs on 6% of fly balls in 2008-2009, but 10% of fly balls in 2010. Tim's xFIP (FIP with a league-average rate of HR per fly ball) is only slightly up from 2008-2009.
So if someone asked you: "what's wrong with Tim Lincecum in 2010?" you would tell him that his luck with fly balls going out has turned for the worse. Then you would debate how much a pitcher controls his HR/fly ball rates. But other than home runs given up, has anything else changed about Timmy the Freak?
Eric Seidman of Baseball Prospectus recently wrote an article on Lincecum, attempting to break down the differences in his repertoire between 2010 and 2008-2009. It's unclear what changed, other than the drop in fastball speed, and also a decrease in fastball movement. Seidman goes on to suggest that, if Lincecum's fastball is slower and "has less bite," then it may make the rest of his pitches less effective, even if they are the same pitches as before.
There are a lot of moving parts here, so let's just look at his declining fastball speed compared to his declining strikeout rates. Of all the changes, these are the most easily noticeable "cause" and "effect." Numbers from Lincecum's FanGraphs profile.
Year | SO/9 | average FB | % fastballs |
---|---|---|---|
2007 | 9.2 | 94.2 | 69% |
2008 | 10.5 | 94.1 | 66% |
2009 | 10.4 | 92.4 | 56% |
2010 | 9.7 | 91.3 | 55% |
The trend is weak, but it looks like Tim Lincecum's strikeout rates are declining with his fastball slowing down. Lincecum improved as a strikeout pitcher from his 2007 rookie year, but is now probably maximizing his strikeout ability, relative to his physical skills. As the physical skills decline, so will his ability to strike batters out.
Why am I looking at fastball speed in predicting strikeout rates, and not various other pitcher skills or characteristics?
In a series of studies I did last winter, I found that, of all the non-performance pitcher attributes, fastball speed was by far the best predictor of strikeout rates. I can predict a pitcher's strikeout rate (given a reasonable IP cutoff) with R=0.52, given just his average fastball speed. The predictive power only goes up to R=0.59 if I also look at what other pitches he throws, his league, his age, his weight, and whether or not he is left-handed. Of those factors, the league and handedness are by far the most helpful. See my old article here for more details.
The most important findings from my research were that (1) fastball speed predicts strikeout rates to a significant degree, and (2) the relationship is non-linear.
Here are fastball speeds vs strikeout rates for 2002-2009 pitchers. Strikeout rates are fit by a quadratic function (i.e. one that includes the square of the fastball speed). Note the discrepancies by league:
A fastball velocity drop from 95 mph to 90 mph in the NL (in blue) is worth a strikeout rate drop from 9.0 SO/9 to 6.5 SO/9. That's a big difference. For a starting pitcher, that's a drop from being an elite strikeout pitcher to being league average.
Tim Lincecum has dropped from a 94.1 mph pitcher to a 91.5 mph pitcher. According to the graph above, that should be worth a strikeout drop of about 8.5 - 7.0 = 1.5 SO/9. Even for someone who strikes out ten batters per nine innings, a 1.5 SO/9 drop is enough to take him down from elite pitcher to just very good.
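As a rough check on the arithmetic, the two pairs of graph readings quoted above can be turned into an average strikeout cost per mph. This is only a sketch: the four readings are the ones stated in the text (nothing is re-fit from the underlying data), and the helper name `so9_drop_per_mph` is made up for illustration.

```python
# Graph readings quoted in the post for the NL curve (approximate, read by eye).
def so9_drop_per_mph(mph_hi, so9_hi, mph_lo, so9_lo):
    """Average SO/9 lost per 1 mph of fastball speed between two curve readings."""
    return (so9_hi - so9_lo) / (mph_hi - mph_lo)

# Generic NL pitcher: 95 mph -> 90 mph is quoted as 9.0 -> 6.5 SO/9.
generic = so9_drop_per_mph(95.0, 9.0, 90.0, 6.5)

# Lincecum's speeds: 94.1 mph -> 91.5 mph is quoted as 8.5 -> 7.0 SO/9.
lincecum = so9_drop_per_mph(94.1, 8.5, 91.5, 7.0)

print(f"95.0 -> 90.0 mph: {generic:.2f} SO/9 per mph")
print(f"94.1 -> 91.5 mph: {lincecum:.2f} SO/9 per mph")
```

Both readings come out at roughly half a strikeout per nine innings for each mph lost, which is about as much precision as eyeballed graph values can support.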
Tim Lincecum is not a typical pitcher. He over-achieved his fastball-based strikeout projection at 94 mph, and he's over-achieving his strikeout projection at 91.5 mph. But the strikeout drop is still there. If Lincecum's fastball declines further, he won't any longer be a ten strikeout per nine innings pitcher. To record sub-3.00 ERA's, he'll have to bring down his walk rates from around 3.0 BB/9 to more of a Cliff Lee type level, or he will have to get lucky with keeping fly balls in the park at well below the league average.
He is not so old that decreases in fastball speed are inevitable, but I'm guessing that Tim Lincecum is too young to increase his fastball speed, the way that Zack Greinke did 2005-2007. Nor does he have the strikeout rate advantage of being left-handed (worth about +1.5 SO/9 for the same fastball speed, according to my study).
There are many components that go into being a great pitcher. But strikeout rates and walk rates are the two factors that pitchers have the most control over. I'm not saying that Lincecum's 10.0 SO/9 rates from the last couple years were a fluke. But he will have trouble maintaining those into the future as his fastball speed has already declined. Look for him to decrease his walks. If he is not able to do so without further decreasing the strikeout rates, Tim Lincecum can't be considered the best pitcher in baseball going forward.
Tuesday, August 24, 2010
Incomplete thoughts on ground-ball pitchers
I've said it a thousand times, but. . .I don't believe in ground ball pitchers. I don't trust them, I don't want them, and I don't believe one should ever invest money in them. In theory, a ground ball pitcher with a good strikeout rate is the best of both worlds. But the problem is, there just aren't any pitchers like that who are consistently good; they all either get hurt or they lose home plate. The only pitcher like that who has had a great career in the last 30 years was Kevin Brown. The overwhelming majority of the consistently good pitchers are the guys who live off of the high fastball--Clemens, Schilling, the Unit, Pedro, Santana, King Felix, Verlander, Sabathia, etc.
When I left off my baseball research, I left off with a classification of pitchers by the type of pitches that they throw. Dave Allen pointed out that I should look at pitchers who throw two-seam fastballs, as those pitchers have become the subject of much sabermetric discussion. Two-seam fastballs induce ground balls like no other pitch, and the value of ground balls for pitchers has become a hotly debated topic. (By hot, I mean that multiple analysts are competing to show how much value ground balls really have for pitchers.)
I created a new category of pitchers, centered around those that throw a high percentage of two-seam fastballs. Indeed, this category of pitchers had very high ground ball rates (something like 6% higher than average), but also lower strikeout rates (about 0.5 K/9 less than average). I was going to write an article about whether or not this "tradeoff" is worth it.
But Bill James brings up a better point. Who are the great two-seam fastball ground ball pitchers out there? Clearly Brandon Webb has to be the most famous example. But let's consider the others. I only had reliable two-seamed fastball data for 2009, so all examples have to be from last year. Here are the most name-recognizable pitchers who classified as "type 8: two-seam fastball pitcher" by my scheme. All data courtesy of PF/X posted on FanGraphs.
Pitcher | 2009 FT% | 2010 FT% |
---|---|---|
Joel Pineiro | 28% | 49% |
Brian Matusz | 14% | 21% |
Scott Kazmir | 10% | 0% |
Rick Porcello | 22% | 52% |
Francisco Liriano | 12% | 46% |
Fausto Carmona | 9% | 33% |
Chien-Ming Wang | 23% | NA |
Carlos Silva | 43% | 50% |
Ignoring Scott Kazmir, who no longer throws two-seam fastballs, and the hobbled CM Wang, is there anything we can generalize about the two-seam fastball pitchers?
First of all, none of them are backing off the pitch. This is selective, since I chose the most recognizable proponents of the pitch, and PF/X pitch classifications are not consistent year to year. Still, I think this suggests that two-seam fastball pitchers are on the rise. How is it affecting their stats?
All of these pitchers are recording high GB% stats on the season, except for Brian Matusz. Joel Pineiro leads with 56%, and none of these guys except Matusz are below 45% (league average is in the low 40% range). Accounting for randomness, these pitchers are all getting high ground ball rates, in part due to their use of the two-seam fastball. However, none of them except for Liriano is having a top-level season. Here are the 2010 strikeout rates (K/9) for those pitchers:
Pitcher | 2010 K/9 |
---|---|
Joel Pineiro | 5.7 |
Brian Matusz | 6.9 |
Rick Porcello | 4.5 |
Francisco Liriano | 9.8 |
Fausto Carmona | 4.8 |
Carlos Silva | 6.3 |
Not surprisingly, Francisco Liriano has a 3.45 ERA, despite a very unlucky 0.350 BABIP. He is having a "Kevin Brown" season, as Bill James would describe it, with both high GB% and high strikeout rate. However the other pitchers have league-average strikeout rates at best. Fausto Carmona has the stuff (93 mph average fastball) to be a high-strikeout pitcher, but he has never realized that potential (even during his 19-win season in 2007, he was a low-strikeout pitcher). It is very unlikely he will become even a league-average strikeout pitcher at this point in his career. Joel Pineiro was dominant earlier this season when he was getting 70% ground balls, but his ERA and FIP have settled around 4.0 now that his ground ball rate is a more sustainable 56%. Without above-average strikeout rates, a pitcher's long term ceiling might be that 4.0 ERA. Not bad, and worth a couple of WAR, but not in the elite pitcher echelon.
Brian Matusz is an interesting case. He was a high strikeout guy in the minors, and has had a league-average strikeout rate over his first 200 major league innings. He throws a two-seam fastball according to PF/X, but he doesn't get high ground ball rates. I'm not sure what's going on there. Maybe he just doesn't belong in this list.
Overall, though, I think Bill James's point is well taken. You can't be a great pitcher on ground balls alone, at least not over a course of several years. You need to have strikeouts. Francisco Liriano might be the next pitcher to maintain high GB rates with high strikeout totals. But he'll have to prove it over more than one season.
Projecting Liriano in April
I'm happy to see Liriano having a great season. He's endured a few injury setbacks, and I'm happy to see him finally come back to form. Also, my pitcher projections were very favorable for him, and it's always pleasant to be right on something like this.
In the projection I published in April, here is what my system predicted for Francisco Liriano in 2010:
- 159.6 IP; 20.9 VORP; 4.31 ERA; 4.24 FIP
Of course, he has been much better than that. But it was bold of my system to project him for a full season, and to be in the top 50 most valuable pitchers in MLB. In 2009, he was 5-13 with a 5.80 ERA in 136 IP. I think this is a real win for my injury-based projection adjustments.
With the season finishing up, I will go over my predictions more systematically. As you can see from the link above, I was off on quite a few of them. I was probably more pessimistic on Cliff Lee than most (in part due to his injury in camp). I got fooled on Javier Vazquez. And I thought John Lackey would be a workhorse, rather than a dud.
But this is all a topic for a future post. Til then...
Wednesday, July 28, 2010
Andre Dawson, Bert Blyleven, Johnny Damon and Miguel Cabrera
Jorge Posada came up when he was 23, but he didn't become a regular until he was 26. Since then, he's been the top offensive catcher in the AL just about every year. And although I'm sure no one was talking about it in 2000, Jorge Posada compares favorably to the Hall of Fame's current battery of backstops. He hasn't been Johnny Bench or Mike Piazza or Gary Carter, but he was pretty damn good for the past decade and change, and he ain't done yet.
When I checked Jorge's profile on Baseball Reference, I wasn't surprised to see Carlton Fisk among his closest comps, but I was a little surprised that he ranked only 26th in career WAR (wins above replacement) among currently active players. I clicked ahead to see the top 50 active players by career WAR. Man, it's a hell of a list. This got me thinking.
Last year Bill James wrote an article called "The Expansion Time Bomb." [Unfortunately it's behind the paywall on his site.] Bill argued that, as baseball has expanded since 1969, so too has the number of players reaching the levels of "historical achievement" that typically define a Hall of Fame career. In other words, in an expanded league, there will be more players with 3,000 hits, more 500 home run hitters, more 300 game winners, and otherwise more milestones being reached. This seems intuitively true, but it is also very hard to argue, and harder to verify. Still, his main argument is an interesting one (which I paraphrase below):
Most supporters want the Hall of Fame to be an exclusive club. This inherently means restricting membership to a small number of entries per year (or decade, or other time period). As expansion has led to more players with historical levels of achievement, Hall of Fame standards will tighten to levels much more narrow than those used in the past.
We'll come back to this thought in a minute. First, let's look at the current top 50 in baseball by career WAR. How many of them are Hall of Fame players? (By the way, WAR is simply a measurement of career "wins added" above a replacement-level player. The merits of WAR are not important here. It is just a way to place all active players, regardless of position, on a rough universal career ranking.)
For each player, I rate his Hall of Fame chances as "yes," "probably," "maybe," or "no." I assume a conservative estimate for the rest of his career. In other words, will he make the Hall of Fame, based on today's standards, if he doesn't do much for the rest of his career? I'm ignoring steroids and just focusing on performance.
- Alex Rodriguez (34) - Yes
- Albert Pujols (30) - Yes
- Chipper Jones (38) - Yes
- Ken Griffey (40) - Yes
- Derek Jeter (36) - Yes
- Jim Edmonds (40) - Probably
- Jim Thome (39) - Probably
- Manny Ramirez (38) - Yes
- Ivan Rodriguez (38) - Yes
- Scott Rolen (35) - Maybe
- Andruw Jones (33) - No
- Vladimir Guerrero (35) - Yes
- Todd Helton (36) - Probably
- Bobby Abreu (36) - Maybe
- Carlos Beltran (33) - Probably
- Jason Giambi (39) - No
- Ichiro Suzuki (36) - Yes
- Mariano Rivera (40) - Yes
- Roy Halladay (33) - Probably
- Andy Pettitte (38) - Maybe
- Johnny Damon (36) - Maybe
- Mike Cameron (37) - No
- Jamie Moyer (47) - No
- J.D. Drew (34) - No
- Johan Santana (31) - Probably
- Jorge Posada (38) - Maybe
- Lance Berkman (34) - Maybe
- Tim Hudson (34) - No
- Omar Vizquel (43) - Probably
- Roy Oswalt (32) - Probably
- Mark Buehrle (31) - Maybe
- CC Sabathia (29) - Probably
- Adrian Beltre (31) - No
- Miguel Tejada (36) - No
- Javier Vazquez (34) - No
- Jason Kendall (36) - No
- Chase Utley (31) - Probably
- Magglio Ordonez (36) - No
- Joe Mauer (27) - Probably
- Eric Chavez (32) - No
- Mark Teixeira (30) - Maybe
- Troy Glaus (33) - No
- Barry Zito (32) - No
- Placido Polanco (34) - No
- Carlos Zambrano (29) - No
- Tim Wakefield (43) - No
- Rafael Furcal (32) - No
- Edgar Renteria (33) - No
- David Wright (27) - Maybe
- Miguel Cabrera (27) - Probably
I weight each rating as follows:
- Yes = 1.0
- Probably = 0.7
- Maybe = 0.3
- No = 0.0
Out of the fifty players, we get twenty one Hall of Famers, breaking down as follows:
10 * Yes + 12 * Probably + 9 * Maybe + 19 * No = 21.1 Hall of Famers
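The weighted sum above is easy to reproduce. Here is a minimal sketch, using the rating counts tallied from the list of fifty players and the weights given above. The names `WEIGHTS`, `COUNTS` and `expected_hall_of_famers` are just illustrative, and the fifteen-year window is the one this post adopts for the per-year rate.

```python
# Weight attached to each rating, as listed in the post.
WEIGHTS = {"Yes": 1.0, "Probably": 0.7, "Maybe": 0.3, "No": 0.0}

# Number of players given each rating in the top-50 list.
COUNTS = {"Yes": 10, "Probably": 12, "Maybe": 9, "No": 19}


def expected_hall_of_famers(counts, weights):
    """Sum of (players with a rating) * (weight of that rating)."""
    return sum(counts[rating] * weights[rating] for rating in counts)


total_players = sum(COUNTS.values())
expected = expected_hall_of_famers(COUNTS, WEIGHTS)

# Window assumed in the post: ages 27 to 42, i.e. fifteen years.
YEARS_COVERED = 15
per_year = expected / YEARS_COVERED

print(f"players rated: {total_players}")
print(f"expected Hall of Famers: {expected:.1f}")
print(f"expected per year: {per_year:.1f}")
```

Changing a single weight (say, "Maybe" from 0.3 to 0.5) is a quick way to see how sensitive the total is to my ratings.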
A Hall of Fame career is typically 16 to 20 years, so in theory, this list represents 16 to 20 years' worth of Hall of Famers, assuming these are evenly distributed through time. However, the list does not include a single player under 27. Hanley Ramirez, Zack Greinke and Tim Lincecum are not accomplished enough yet to be considered possible Hall of Famers for this discussion. Therefore, let's say that the top fifty players by WAR includes all possible Hall of Famers over fifteen years (ie age 27 to age 42).
If my list is reasonably accurate, this suggests that we will induct twenty one players over every fifteen years, if the future performance is much like the recent past.
To me, that sounds very reasonable. With an average of 1.4 new qualified candidates per year, the Hall of Fame would be electing zero to three players every year. Yes, they will be electing more candidates per year than in the recent past, but not by much. It will be harder for borderline candidates to get in, but there would never be backlogs of qualified recent candidates running ten deep. There will be years with no obvious Hall of Famers on the ballot, and in those years, weaker candidates will still have a chance to be elected.
While I still think that Bill James's argument sounds appealing, I just don't see the glut of Hall of Fame level performers driving up future Hall of Fame standards significantly. Instead, we will see more years with one or two good new candidates, and fewer multiyear stretches where the best candidate on the ballot is Phil Niekro or Jim Rice. But those borderline cases will still get plenty of consideration. When he is up for Cooperstown, Johnny Damon will have more competition on the ballot than did Andre Dawson and Bert Blyleven, but his career will be just as thoroughly vetted as those two's were.
According to Wikipedia, there are 203 former players currently in the Hall of Fame. These represent the achievements in Major League Baseball of the last 100 years, as well as a few achievements from the 19th century. That's somewhere between 1.7 and 2.0 players per year of Major League Baseball, depending on who's counting. By my count, we will have 1.4 players per year in the future, based on conservative projections of today's stars. Even accounting for the Veterans Committee's past indiscretions inflating the 2.0 number, I don't see a tightening of standards that will exclude Johnny Damon, Bobby Abreu, Jorge Posada, Lance Berkman or Todd Helton from being considered as legitimate candidates. According to my estimates, two of those five guys will get in, and I think that's about right.
Wednesday, June 23, 2010
Status Update!
As per Dave Allen's suggestion, I created a category for two-seam fastball (sinking fastball) throwing pitchers. Also I did some analysis trying to figure out whether throwing lots of these pitches (ie qualifying for my new category over all the others) is an effective strategy. I had some game-theory ideas about giving up strikeouts for ground balls, etc. Then I just got busy with several other things. So what was going to be a week long delay turned into a month.
I've been in Vegas, playing a few WSOP events. Also, I've been working on a software project, and doing a lot of drawing.
Moreover, I've found that I'm less nuts about baseball than I was a few months ago. Baseball games are undeniably boring to watch in their entirety, and the season is too long for anyone to truly care about the result of any particular game. I still love baseball, but:
- None of my good friends do. Although they are all big sports fans.
- I never go out of my way to watch a game on TV.
- I don't get excited about seeing an MLB game in a new city when I am travelling.
Maybe I'll get back to baseball soon, maybe I won't. But for now, I would rather spend the afternoon drawing, than spend it writing and revising a baseball article. I'll probably go back and revisit my preseason pitcher projections around the All-Star break. But that would be more for vanity than from an impartial sense of interest.
In other news, I got an invitation to interview for the Diamondbacks statistical analyst job, but I turned it down (despite the fact that I respect their organization, and love the American Southwest). However this has more to do with my software projects than it has to do with my attitude toward baseball. It's a great job, and I hope the DBacks make a great hire. I'm sure they will!
Wednesday, May 5, 2010
Does Dave Duncan hate change-ups?
Type 3 pitchers seem to be Cardinals: even though they're just 4% of pitcher seasons, they made up 12.5% of the Cardinals' 16-man staff last season. Also, while Type 2 pitchers make up just 18% of the MLB population, they made up 31.25% of St. Louis' staff last season. They seem to be doing that by avoiding Type 0, 4 and 7 pitchers. I wonder if this could be a personal preference by pitching coach Dave Duncan. Do you have data that suggests some MLB teams look for certain types of pitchers and/or convince pitchers to use a certain percentage of their stuff?
In other words, does Cardinals pitching coach Dave Duncan encourage his pitchers to become certain types of pitchers, and not other types? Duncan has been lauded on many blogs and baseball news sites over the past couple of years due to his staffs' repeated successes. He seems to have revitalized multiple pitching careers over the past few years, including Joel Pineiro in 2009. Pitch F/X expert Dave Allen pointed out that Duncan's pitchers get more ground balls under his tutelage than they had before.
Is there a secret to Duncan's (perceived) success in reclamation pitchers? Does he turn pitchers into specific types that are more successful, on average, than other pitcher types?
John suggests above that Duncan's pitchers tend to be type 2 and type 3, but not types 0, 4, or 7, as compared to the league average last year. For those confused about the pitcher types, please read my article explaining the pitcher types. The types are derived from what I determine to be a pitcher's core and secondary pitches. All pitchers are assumed to throw the fastball as core pitch (I do not yet distinguish between two-seam and four-seam fastballs; coming soon, Dave). As a quick reference:
- type 0: change-up core; slider secondary
- type 1: cutter core
- type 2: slider core; change-up secondary
- type 3: slider and curve core
- type 4: curve core; change-up secondary
- type 5: change-up core; curve secondary
- type 6: slider core; no secondary
- type 7: splitter core; slider secondary
Team | T_0% | T_1% | T_2% | T_3% | T_4% | T_5% | T_6% |
---|---|---|---|---|---|---|---|
Average (1 STD) | 7-19 | 2-10 | 12-24 | 1-8 | 8-20 | 4-13 | 39-26 |
Giants | 10 | 6 | 17 | 6 | 11 | 7 | 41 |
Yankees | 8 | 14 | 13 | 6 | 15 | 7 | 35 |
Cardinals | 6 | 13 | 22 | 16 | 12 | 1 | 24 |
- Dave Duncan hates change-ups! Type 0 and type 5 are primarily change-up pitchers. Lots of really good pitchers have been type 5 (Greg Maddux & Tom Glavine, for example). However, very few of Duncan's pitchers fit this profile.
- Dave Duncan doesn't care for young flame-throwers (or he reforms them quickly). Type 6 pitchers are the most common type of major league pitcher, by far. Many, if not most, pitchers come up to the majors as hard-throwing type 6 guys, featuring a fastball, a slider, and not much else. There is a dearth of type 6 pitchers on Duncan's staff, although the number is not ridiculously low. They still make up 24% of his staffs (league average is 33%, and the Cubs form the high-water mark at 48%).
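One way to check these observations against the table is to flag every pitcher type where the Cardinals' share falls outside the league's one-standard-deviation band. The sketch below copies the percentages from the table above. The names `BANDS`, `CARDINALS` and `outside_band` are made up for illustration, and each band's endpoints are sorted so the order they were typed in doesn't matter.

```python
# League-wide 1-STD bands (percent of staff) by pitcher type, from the table.
BANDS = {0: (7, 19), 1: (2, 10), 2: (12, 24), 3: (1, 8),
         4: (8, 20), 5: (4, 13), 6: (39, 26)}

# Cardinals' share of staff (percent) by pitcher type, from the table.
CARDINALS = {0: 6, 1: 13, 2: 22, 3: 16, 4: 12, 5: 1, 6: 24}


def outside_band(shares, bands):
    """Return {type: 'below' | 'above'} for shares outside the league band."""
    flags = {}
    for ptype, share in shares.items():
        low, high = sorted(bands[ptype])  # tolerate endpoints in either order
        if share < low:
            flags[ptype] = "below"
        elif share > high:
            flags[ptype] = "above"
    return flags


flags = outside_band(CARDINALS, BANDS)
for ptype, direction in sorted(flags.items()):
    print(f"type {ptype}: {direction} the league band")
```

With these numbers, the change-up types (0 and 5) and type 6 come out below the band, while types 1 and 3 come out above it. Types 2 and 4 stay inside it.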
Monday, May 3, 2010
Starter vs Reliever
trFIP = a + (start %) * b
trFIP = FIP + 0.81 (1 - start %)
trERA = ERA + 1.19 (1 - start %)
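The two adjustments above can be written as one small function. This is a minimal sketch: the coefficients 0.81 and 1.19 are the ones in the formulas, while the function names and the choice to express start % as a fraction between 0 and 1 are my assumptions for the example.

```python
FIP_RELIEF_PENALTY = 0.81  # coefficient from the trFIP formula above
ERA_RELIEF_PENALTY = 1.19  # coefficient from the trERA formula above


def adjust_for_role(value, start_pct, penalty):
    """Apply value + penalty * (1 - start_pct), with start_pct in [0, 1]."""
    if not 0.0 <= start_pct <= 1.0:
        raise ValueError("start_pct must be a fraction between 0 and 1")
    return value + penalty * (1.0 - start_pct)


def tr_fip(fip, start_pct):
    """trFIP = FIP + 0.81 * (1 - start %)."""
    return adjust_for_role(fip, start_pct, FIP_RELIEF_PENALTY)


def tr_era(era, start_pct):
    """trERA = ERA + 1.19 * (1 - start %)."""
    return adjust_for_role(era, start_pct, ERA_RELIEF_PENALTY)


# A full-time starter is unchanged; a pure reliever gets the full penalty.
print(f"{tr_fip(3.50, 1.0):.2f}")
print(f"{tr_fip(3.50, 0.0):.2f}")
print(f"{tr_era(3.00, 0.5):.3f}")
```

A swingman with half his appearances as starts gets half the penalty, which is how the linear form in the first formula behaves.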