My writings about baseball, with a strong statistical & machine learning slant.

Tuesday, March 16, 2010

Excessive exuberance for young pitchers? (II)

In my last post, I claimed that projection systems tend to over-estimate playing time for young pitchers. I went on to claim that a system that ignored minor league stats and treated all rookies equally would give better overall projections for pitchers’ playing time. As it happens, I’ve built such a system.

As I have written about before, I’ve been working on a pitcher season projection system for some time. I started building my projection system with data from Baseball Prospectus, which does not include comprehensive minor league stats. I’ve never bothered to acquire minor league stats elsewhere, so in my system, all I know about a rookie is that he is a rookie, and nothing else. Actually, I know other things (skills ratings, Baseball America ratings, fastball velocity), but here I will demonstrate a simple projection system that uses only a pitcher’s basic MLB stats, rookie status, and age.

I will explain my system later. For now, let’s just say it’s a linear-weights system based on the past four years’ stats such as:
  • IP_Start, IP_Relief, rookie status & missed seasons
  • Wins, losses, saves, holds, VORP (total runs saved), WXR (relief pitching value in expected wins), etc.
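
To make "linear weights" concrete, here is a minimal sketch in Python. It is not my fitted system: the feature names, the weights, and the intercept are all made-up placeholders, and only a handful of the stats listed above are included. The point is only the shape of the calculation, a weighted sum of past-season stats plus a flat rookie term.

```python
# Hypothetical weights for illustration only; the post does not publish the
# fitted values. Feature names such as "ip_start_y1" are also assumptions.
EXAMPLE_WEIGHTS = {
    "ip_start_y1": 0.50,   # innings as a starter, one year ago
    "ip_start_y2": 0.20,   # innings as a starter, two years ago
    "ip_relief_y1": 0.40,  # innings as a reliever, one year ago
    "ip_relief_y2": 0.10,  # innings as a reliever, two years ago
    "is_rookie": 25.0,     # flat term applied to every rookie alike
}
EXAMPLE_INTERCEPT = 10.0


def project_ip(features, weights=EXAMPLE_WEIGHTS, intercept=EXAMPLE_INTERCEPT):
    """Return a linear-weights IP projection, floored at zero.

    Any feature missing from `features` is treated as 0.0.
    """
    total = intercept + sum(w * features.get(name, 0.0) for name, w in weights.items())
    return max(0.0, total)


veteran = {"ip_start_y1": 200.0, "ip_start_y2": 180.0}
rookie = {"is_rookie": 1.0}
print(project_ip(veteran))  # weighted sum of two starter seasons plus intercept
print(project_ip(rookie))   # the same number for every rookie
```

Because the rookie term is a single constant, every rookie gets the same projection no matter what he did in the minors, which is exactly the "treat all rookies equally" idea.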

I trained my system on pitcher seasons from 2005-2009 that either:
  • Recorded at least 1/3 IP in the majors that season, or
  • Recorded at least 1/3 IP in the previous season
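
The selection rule above can be sketched as follows. This is an illustration under assumptions, not my actual data pipeline: the input layout (a dict keyed by pitcher and year), the function name, and the toy pitchers "A", "B" and "C" are all hypothetical, and innings are assumed to be stored as real numbers (so one out is 1/3, not the box-score ".1").

```python
MIN_IP = 1.0 / 3.0  # one out recorded


def training_rows(ip_by_pitcher_year, years):
    """Build (pitcher, year, actual_ip) rows for the given target years.

    A pitcher-season is included if the pitcher threw at least MIN_IP in
    that season OR in the previous one. Pitchers who pitched last year but
    not this year therefore show up with a target of 0.0 IP.
    """
    pitchers = sorted({pitcher for pitcher, _ in ip_by_pitcher_year})
    rows = []
    for year in years:
        for pitcher in pitchers:
            current = ip_by_pitcher_year.get((pitcher, year), 0.0)
            previous = ip_by_pitcher_year.get((pitcher, year - 1), 0.0)
            if current >= MIN_IP or previous >= MIN_IP:
                rows.append((pitcher, year, current))
    return rows


toy = {("A", 2008): 150.0, ("B", 2009): 40.0, ("C", 2007): 10.0}
print(training_rows(toy, [2009]))
```

In the toy data, "A" pitched in 2008 but not 2009, so he enters the 2009 sample with 0.0 IP; "C" last pitched in 2007, so he is left out.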

Therefore, my system takes account of the fact that, each year, 20% of pitchers do not return to the big show.

Perhaps I should remove the voluntary retirement seasons. However, most pitchers retire when they can no longer pitch in the majors, so the current approach is reasonable. In any case, it is better than ignoring the fact that some pitchers who had decent seasons go directly to zero the next year.

After building my system, I commit the sin of training and testing on the same data. However, I could not afford to hold out 2009 for testing only, since I am using very little data as it is (only five years’ worth). I could give you the test results on clean 2004 data, and later I will, but I doubt you’d. Just trust me that the system was built in a way that reduces over-training and keeps things very simple. If you are interested in the details, please read my next post.

In any case, I compared my system's predictions for 2009 IP with last year’s preseason projections from CHONE and PECOTA. I tested on all pitchers for whom my system, CHONE and PECOTA all provided predictions. This consisted of all pitchers who had 2008 or 2009 major league IP, sans about 40 pitchers who were missing from either PECOTA or CHONE.

None of the omissions were notable pitchers. It is worth noting, though, that projection systems that project huge numbers of minor leaguers (mostly with unrealistic stats) still manage to miss some pitchers who make it to the majors. I think this goes to show the point I made in the last post: we really don’t know which minor leaguers will get playing time, so we might as well just play the percentages. But I digress...

It would be in bad form to post PECOTA and CHONE projections here, since I do not own this data. Therefore, I will just summarize the results, and offer a few examples.

For the full set of 2009 pitchers, my average projection error is significantly lower than that of either PECOTA or CHONE. Also, my IP projection beats PECOTA head-to-head 63.8% of the time, and it beats CHONE 66.3% of the time. Most surprisingly, my projection is the best of all three 54.8% of the time!
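
For anyone who wants to run the same comparison on their own projections, here is a rough sketch of how the head-to-head and "best of all three" rates can be computed. The function names and the four-pitcher toy numbers are made up for illustration (they are not PECOTA or CHONE data), and I am assuming that a tie does not count as a win.

```python
def head_to_head_rate(actual, mine, other):
    """Share of pitchers for whom `mine` has a strictly smaller absolute
    IP error than `other`. Ties count against `mine` (an assumption)."""
    wins = sum(abs(m - a) < abs(o - a) for a, m, o in zip(actual, mine, other))
    return wins / len(actual)


def best_of_all_rate(actual, mine, *others):
    """Share of pitchers for whom `mine` strictly beats every other system."""
    wins = 0
    for i, a in enumerate(actual):
        my_error = abs(mine[i] - a)
        if all(my_error < abs(other[i] - a) for other in others):
            wins += 1
    return wins / len(actual)


# Toy numbers, not real projections.
actual = [100.0, 0.0, 50.0, 180.0]
system_a = [90.0, 20.0, 50.0, 150.0]
system_b = [120.0, 60.0, 70.0, 170.0]
system_c = [95.0, 80.0, 40.0, 200.0]

print(head_to_head_rate(actual, system_a, system_b))
print(head_to_head_rate(actual, system_a, system_c))
print(best_of_all_rate(actual, system_a, system_b, system_c))
```

Note that with three systems, a "best of all" rate above one third already beats an even split, which is why a figure above 50% stands out.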

                     My system    PECOTA     CHONE
IP error STDev       43.6 IP      55.5 IP    61.6 IP
Best projection
Mean error (IP)      -0.4 IP      25.8 IP    32.2 IP
Mean IP              54.0 IP      80.1 IP    86.6 IP

The last two rows here show just how much PECOTA and CHONE over-estimate playing time for an average pitcher. If we include 0 IP seasons for pitchers who played in 2008, the average pitcher in 2009 threw 54.4 innings. And yet, PECOTA projected the average pitcher for 80 innings, and CHONE for even more. These are not reasonable baselines (even if we removed the empty seasons).
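
The summary rows fit together with simple arithmetic: the mean error is just the mean projected IP minus the mean actual IP (for my system, 54.0 - 54.4 = -0.4). Here is a small sketch of how such a summary can be computed. The function name and toy numbers are hypothetical, and I am assuming "IP error STDev" means the standard deviation of (projected - actual) over the whole set of pitchers.

```python
from statistics import mean, pstdev


def summarize(actual, projected):
    """Summarize a projection system against actual IP.

    Errors are signed (projected - actual), so a positive mean error
    means the system over-projects playing time on average.
    """
    errors = [p - a for p, a in zip(projected, actual)]
    return {
        "mean_ip": mean(projected),
        "mean_error": mean(errors),
        "error_stdev": pstdev(errors),  # population standard deviation
    }


# Toy numbers, not real projections.
actual_ip = [0.0, 50.0, 100.0]
projected_ip = [30.0, 60.0, 120.0]
summary = summarize(actual_ip, projected_ip)
print(summary)

# The mean error equals mean projected IP minus mean actual IP.
print(summary["mean_ip"] - mean(actual_ip))
```

A system can have a small spread of errors and still be badly biased, which is why I report the mean error alongside the standard deviation.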

You may argue that 0 IP is never a good projection for a major league pitcher. That is true. However, Ben Sheets, Tom Glavine and Kei Igawa all looked to have reasonable shots to pitch in the majors in 2009, and none of them did. I think my system does deserve credit for projecting lower totals for guys who don’t pitch at all.

Actually, my system still easily beats CHONE and PECOTA if we remove all 0 IP seasons. To make the comparison closer, I removed all seasons with <20 IP. With that, the average pitcher IP rises to 87.6, and all of the systems are about equally effective:

                     My system    PECOTA     CHONE
IP error STDev       45.9 IP      43.3 IP    47.0 IP
Best projection
Mean error (IP)      -19.7 IP     2.9 IP     7.1 IP
Mean IP              67.9 IP      90.5 IP    94.7 IP

OK, so my system does better because I project fewer innings for pitchers who retire, right? Not quite. Pitchers that both PECOTA and CHONE projected to throw over 100 IP, but who threw fewer than 20 IP, include:
  • Ian Kennedy
  • Robert Mosebach
  • Mitch Atkins
  • Andrew Carpenter
  • Zach Johnson
  • Brandon Webb
Chances are, you know who 1.5 of those pitchers are, and only if you're a golf fan. No projection system saw Brandon Webb’s season-ending injury coming. However, what were PECOTA and CHONE doing projecting over 100 IP for the misfits?

As I wrote in my last post, I think these systems over-value minor league stats, and give unrealistically high projections to young pitchers, and especially to rookies.

Consider how the projection systems perform if we keep 0 IP seasons, but remove all seasons pitched by rookies (fewer than 50 IP of prior MLB experience):

                     My system    PECOTA     CHONE
IP error STDev       45.2 IP      52.0 IP    59.1 IP
Best projection
Mean error (IP)      3.0 IP       18.0 IP    25.1 IP
Mean IP              67.8 IP      82.7 IP    89.8 IP

As you can see, my system is still best (since CHONE and PECOTA also over-estimate IP for non-rookie pitchers), but not by nearly as much as with rookies included.

Sure, these systems look great when they nail their projections for Jeff Niemann, David Price, Tommy Hanson, Brad Bergesen and Andrew Bailey. However, they miss more often than they hit. If you say that Jeff Niemann and Tommy Hanson are better prospects than Ian Kennedy and Clay Buchholz, it is precisely because they had better 2009s that you think so. Kennedy and Buchholz were also highly rated prospects a couple of years ago, and if they had succeeded by now, we would have said “but of course I saw that coming.” And perhaps you would be right. However, as PECOTA and CHONE show, on average, these prospects struggle more than we think they should.

Ironically, good projection systems that use minor league stats get crushed by a simple system that does not use minor league stats for precisely that reason. I’m not saying that minor league stats have no use. But we should not put too much stock in great minor league stats, since teams clearly don’t.
