My writings about baseball, with a strong statistical & machine learning slant.

Monday, February 15, 2010

Fastballs for dinner?

This post is overdue.

A few months ago, I invented a new statistic to measure the depth of a pitcher's repertoire. With it, we can say which pitches a particular pitcher relies on, and then show what percentage of major league pitchers use each pitch. That is what I am going to show below.

You may ask: is this different from graphing how many pitchers throw a curve at least 10% of the time? Yes, it is. Let me explain repertoire depth with a couple of examples. Or you can just skip ahead to the graphs.

-----

Take Rudy Seanez. In 2007, he threw 76 innings for the Dodgers as a right-handed reliever, posting a 3.79 ERA with 1 save. According to FanGraphs pitch data, his pitch distribution was as follows:

pitch:   % thrown:
FB       53.0
SL       31.6
SF       13.0
CH       2.3

This distribution translates to "2.17 depth from 3 offerings."

Offerings is a sort of upper bound on the pitcher's repertoire, while depth is a lower bound. Intuitively, the number of offerings corresponds to how many pitches a hitter should look for, while depth measures how many core pitches a pitcher has. You can see the details in my original post (I'm looking at the harmonic mean of the pitch percentages). But basically, any pitch that ranks below the number of offerings (like Seanez's change-up, his #4 pitch out of 3 offerings) is something that the pitcher throws too rarely for us (or the hitter) to care about. Depth, meanwhile, gives us a single number for ranking pitchers by how many pitches they truly rely on.

Still confused? Let's take a look at another example. Mariano Rivera is the best one-pitch pitcher in baseball. Here is how his repertoire broke down in 2008:

pitch:   % thrown:
CT       82.0
FB       18.0

That's it. He throws a lot of cutters, and also some fastballs. I have him at 1.18 depth from 2 offerings. The truth is a little bit more complicated, since I am not considering pitches thrown to RHBs and LHBs separately. Dave Allen shows that Mariano actually throws both cutters and four-seam fastballs to RHBs, but only cutters to LHBs. Then again, I don't know where to get pitch data splits for years before 2008. And I would have major small-sample issues.

Contrast Mariano's repertoire with Roy Halladay's from 2009:

pitch:   % thrown:
CT       41.5
FB       31.7
CB       22.2
CH       4.6

According to my system, his repertoire has 2.81 depth from 3 offerings. Somehow that sounds efficient, right? You don't need to know a harmonic mean from a harmonica to see that the talented Mr. Halladay throws three pitches, with roughly equal likelihood. So he's got 3 pitches, each of which is a "core pitch" for him.
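For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes that depth is the number of offerings multiplied by the ratio of the harmonic mean to the arithmetic mean of the pitcher's top pitch percentages. That assumption happens to reproduce the three figures quoted above, but the original post is the authority on the exact definition. The function name `repertoire_depth` is made up for illustration, and the number of offerings is taken as given rather than computed.

```python
def repertoire_depth(percentages, offerings):
    """Depth of a pitch mix under the ASSUMED formula n * HM / AM.

    percentages: pitch usage in percent, in any order (top pitches must be > 0).
    offerings:   how many of the most-used pitches to consider (taken as given).
    """
    top = sorted(percentages, reverse=True)[:offerings]
    n = len(top)
    harmonic_mean = n / sum(1.0 / p for p in top)
    arithmetic_mean = sum(top) / n
    # The ratio HM / AM is 1 when all pitches are thrown equally often,
    # and shrinks toward 0 as one pitch dominates.
    return n * harmonic_mean / arithmetic_mean


examples = [
    ("Seanez 2007", [53.0, 31.6, 13.0, 2.3], 3),
    ("Rivera 2008", [82.0, 18.0], 2),
    ("Halladay 2009", [41.5, 31.7, 22.2, 4.6], 3),
]

for name, mix, offerings in examples:
    depth = repertoire_depth(mix, offerings)
    print(f"{name}: {depth:.2f} depth from {offerings} offerings")
```

Because the ratio of the two means does not depend on scale, the top pitches do not need to be renormalized to sum to 100 first.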

-----

Let's get back to those graphs I promised. For each pitcher, I look at his top X pitches, where X is that pitcher's number of offerings. Then, for each pitch that FanGraphs catalogs for us, I count how many pitchers include that pitch in their offerings. That's simple enough.

I do the same thing for a pitcher's repertoire depth. Roy Halladay has a depth of 2.81. Round that to 3. So we take his top 3 pitches. His three core pitches (cutter, fastball and curve) all count. Rudy Seanez (of 2007) has a depth of 2.17, which rounds down to 2, so we take his fastball and slider as core pitches. His splitter is #3, so it counts for offerings, but not for depth. The breakdown for Mariano is simple: his depth of 1.18 rounds to 1, so only the cutter counts as a core pitch, while both of his pitches count as offerings.
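The selection step described above can be sketched as follows. The helper names `top_pitches` and `offered_and_core` are hypothetical, and rounding a depth that ends in exactly .5 upward is an assumption, since the examples here never hit that case.

```python
def top_pitches(mix, count):
    """Return the `count` most-used pitch labels from a {pitch: percent} dict."""
    ranked = sorted(mix, key=mix.get, reverse=True)
    return ranked[:count]


def offered_and_core(mix, offerings, depth):
    """Split a pitch mix into (offered pitches, core pitches)."""
    core_count = int(depth + 0.5)  # round to nearest; ties go up (assumption)
    return top_pitches(mix, offerings), top_pitches(mix, core_count)


halladay_2009 = {"CT": 41.5, "FB": 31.7, "CB": 22.2, "CH": 4.6}
seanez_2007 = {"FB": 53.0, "SL": 31.6, "SF": 13.0, "CH": 2.3}
rivera_2008 = {"CT": 82.0, "FB": 18.0}

print(offered_and_core(halladay_2009, 3, 2.81))
print(offered_and_core(seanez_2007, 3, 2.17))
print(offered_and_core(rivera_2008, 2, 1.18))
```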

-----

If we perform the computations above for all pitcher seasons from 2002 to 2009, we can get a count of how many pitchers use each pitch, and how many pitchers use each pitch as a "core pitch."

We can also compare how these pitch distributions differ between starters and relievers. We all know that relievers' repertoire depths are smaller than those of starters. But how are they different? Which pitches do relievers rarely throw?
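As a rough illustration of that counting step, here is a sketch that turns per-season pitch lists into percentages. The function name `pitch_usage_rates` is hypothetical, and the input is a toy sample made of the three pitcher seasons discussed above, not the 2002 to 2009 data.

```python
from collections import Counter


def pitch_usage_rates(pitch_lists):
    """Percentage of pitcher seasons whose list includes each pitch.

    pitch_lists: one list of pitch labels per pitcher season
                 (either the offered pitches or the core pitches).
    """
    counts = Counter()
    for pitches in pitch_lists:
        counts.update(set(pitches))  # count each pitch at most once per season
    total = len(pitch_lists)
    return {pitch: 100.0 * n / total for pitch, n in counts.items()}


# Toy input only: Seanez 2007, Rivera 2008, Halladay 2009.
offered = [["FB", "SL", "SF"], ["CT", "FB"], ["CT", "FB", "CB"]]
core = [["FB", "SL"], ["CT"], ["CT", "FB", "CB"]]

print(pitch_usage_rates(offered))
print(pitch_usage_rates(core))
```

Running the same function on a subset of seasons (starters only, lefties only) would give the per-group columns.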

Here are the graphs. Each pitch is shown as the percentage of pitchers who include it in their offerings, and the percentage who include it among their core pitches (depth).





And here are just the lefties (starters & relievers):



For those who love small samples, here are the lefty starters (still 318 pitcher seasons):


Alternatively, I have the same data in table form. First, the pitches by offerings:

pitch:    % of pitchers:   % of starters:   % of lefties:   % of lefty starters:
FB        99.6             99.3             99.9            100.0
SL        72.5             70.8             72.7            63.2
CT        8.1              14.2             7.7             18.2
CB        49.5             66.7             50.7            67.6
CH        59.8             77.6             68.0            89.0
SF        8.1              10.2             1.2             2.5
KN        0.5              1.1              0               0
pitches   2.98             3.40             3.00            3.41

And similarly for depth:

pitch:    % of pitchers:   % of starters:   % of lefties:   % of lefty starters:
FB        99.3             99.0             99.9            100.0
SL        43.8             40.2             43.1            26.4
CT        4.3              7.5              3.9             11.3
CB        22.4             31.3             23.6            31.1
CH        22.2             28.1             32.0            47.8
SF        3.1              4.2              0.3             0.6
KN        0.5              1.1              0               0
pitches   1.94             2.10             2.01            2.19

Well, that's a lot of numbers. You can draw your own conclusions, but a few things stick out to me:
  • Starters do indeed throw a wider variety of pitches than relievers. On average, they have about 0.5 more offerings, and 0.5 more depth. Starters are more likely than relievers to use each of the different pitches. The exception is the slider: relievers are more likely to rely on their slider than starters.
  • All pitchers rely on their fastball.
  • Lefties throw the same pitches as non-lefties, except that lefties are more likely to throw change-ups and almost never throw splitters. Then again, the change-ups could be a classification issue, since lefties tend to throw less hard than righties.
This article is long enough already, so I'll stop here. If you would like to see something else, please let me know.
