Cybermetrics

Friday, April 12, 2024

Some stats on how pitchers have fewer IP than about 10 years ago

I used Stathead for these searches.

From 2011-13, 53 pitchers had 486+ IP & 22 had 600+ (2 had 700+). Verlander led with 707.

From 2021-23, only 25 pitchers had 486+ IP and only one had 600+ (Alcantara, 619). About a 52.8% drop in guys with 486+ IP.

12 pitchers from 2011-13 were above 619. Cole was 2nd from 2021-23 with 591. That would have been 28th from 2011-13.

I used 486 IP since that is 3 times 162, what it takes to qualify for the ERA title for just one season. So I am using 486 as a qualifier for a 3 year period.

For position players I used 1,516 PAs, roughly 3 times 502 (3*502 = 1,506), since 502 is usually the number of PAs it takes to qualify for the batting title for just one year.

From 2011-13, there were 104 guys with 1,516+ PAs. From 2021-23, it was 94. That is a 9.6% drop, far less than what happened for pitchers.

At the 2,000 PA level, it fell from 15 to 10. That is a much smaller drop than it was for pitchers with 600+ IP.

The highest anyone had from 2011-13 was 2,111 (both Alex Gordon and Starlin Castro). Two players from 2021-2023 actually exceeded that: Marcus Semien (2,201) & Freddie Freeman (2,133).

At the 1,600 PA level, there were exactly 82 guys in each time period. So it sure looks like something different has happened with the pitchers.
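The percentage drops can be checked with a few lines of Python. This is only a sketch using the counts quoted in this post; the function name pct_drop and the labels are mine, not anything from Stathead.

```python
def pct_drop(before, after):
    """Percent decline from `before` to `after` (a positive value means a drop)."""
    return (before - after) / before * 100


# (2011-13 count, 2021-23 count) as quoted in the post
counts = {
    "486+ IP": (53, 25),
    "600+ IP": (22, 1),
    "1,516+ PA": (104, 94),
    "2,000+ PA": (15, 10),
    "1,600+ PA": (82, 82),
}

for label, (early, late) in counts.items():
    print(f"{label}: {early} -> {late}, {pct_drop(early, late):.1f}% drop")
```

Putting the pitcher and hitter thresholds side by side this way makes it easier to see how much steeper the decline is for the pitchers.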

Update April 14:

Some data on pitcher games started. Number of guys with 90+ & 75+ starts.

2011-13:

90+) 41
75+) 64

2021-23:

90+) 19
75+) 46

Saturday, March 23, 2024

Norm Cash's 1961 season

He won the AL batting title that year with a .361 AVG. Yet he never hit .300 or higher again and his lifetime avg was just .271 (all data is from Baseball Reference and Stathead).

His OBP that year was .487. His career OBP was .374 and his next best was .402 (in 1960 in only 428 plate appearances).

His SLG was .662. His next best was .531 and it was .488 for his career.

His OPS+ was 201 that year and his next highest was 149. Lifetime it was 139.

I wondered if his flukiness was balanced against both lefties and righties.

This table shows his OPS vs. righties relative to lefties (his OPS vs. righties divided by his OPS vs. lefties) for each of his 14 full or close to full seasons:

1960        1.50
1961        1.57
1962        1.43
1963        1.36
1964        1.54
1965        1.15
1966        0.94
1967        1.31
1968        0.97
1969        1.56
1970        1.21
1971        1.27
1972        2.11
1973        2.39

The 1.57 in 1961 is the highest until late in his career when his performance against lefties went down quite a bit. But it was just a bit higher than 1960, 1964 and 1969. So this does not indicate a great imbalance.

But I also calculated his OPS vs. righties relative to the league average of all left-handed batters vs. righties (and did the same for vs. lefties).

Here is his year-by-year OPS vs. righties relative to the league average of all left-handed batters vs. righties:

1960        1.21
1961        1.62
1962        1.25
1963        1.26
1964        1.19
1965        1.24
1966        1.15
1967        1.22
1968        1.20
1969        1.24
1970        1.13
1971        1.30
1972        1.24
1973        1.18

The ratio in 1961 is by far the highest at 1.62 with the next best being 1.30. So a great year for him vs. righties.

Now for his year-by-year OPS vs. lefties relative to the league average of all left-handed batters vs. lefties:

1960        1.02
1961        1.21
1962        1.04
1963        1.03
1964        0.94
1965        1.30
1966        1.39
1967        1.08
1968        1.44
1969        0.92
1970        1.16
1971        1.23
1972        0.65
1973        0.55

He had 1.21 in 1961, but that is only his 5th highest ratio. He had four that were higher: 1.44, 1.39, 1.30 and 1.23.

So he had, compared to the rest of his career, a fantastic season against righties. But against lefties, it was just good.
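To see where 1961 ranks in each of the two league-relative tables, here is a small Python sketch. The ratios are copied from the tables above; the helper name rank_of is just for illustration.

```python
# Cash's OPS relative to all left-handed batters, by season (from the tables above)
vs_righties = {
    1960: 1.21, 1961: 1.62, 1962: 1.25, 1963: 1.26, 1964: 1.19,
    1965: 1.24, 1966: 1.15, 1967: 1.22, 1968: 1.20, 1969: 1.24,
    1970: 1.13, 1971: 1.30, 1972: 1.24, 1973: 1.18,
}
vs_lefties = {
    1960: 1.02, 1961: 1.21, 1962: 1.04, 1963: 1.03, 1964: 0.94,
    1965: 1.30, 1966: 1.39, 1967: 1.08, 1968: 1.44, 1969: 0.92,
    1970: 1.16, 1971: 1.23, 1972: 0.65, 1973: 0.55,
}


def rank_of(year, ratios):
    """1-based rank of `year` when seasons are sorted from highest ratio to lowest."""
    ordered = sorted(ratios, key=ratios.get, reverse=True)
    return ordered.index(year) + 1


print("1961 rank vs. righties:", rank_of(1961, vs_righties))
print("1961 rank vs. lefties:", rank_of(1961, vs_lefties))
```

The rank is one more than the number of seasons with a higher ratio, so it lines up with counting the seasons that beat 1961 by hand.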

Update March 25: From 1960-73, Cash had an OPS of .918 vs. righties while all left-handed batters had .733. His ratio is 1.25 (.918/.733). So his 1.62 ratio in 1961 was far above this.

Over the same period, his OPS vs. lefties was .696 while all left-handed batters had .625. This ratio is 1.11 (.696/.625). His 1.21 ratio from 1961 was only slightly above this.
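The same comparison can be put in numbers. This sketch divides the 1961 ratio by the 1960-73 ratio for each side, which is one simple way (my choice, not the only one) to show how far 1961 stood above his own norm.

```python
# 1960-73 OPS figures quoted above
cash_vs_rhp, league_vs_rhp = 0.918, 0.733
cash_vs_lhp, league_vs_lhp = 0.696, 0.625

career_ratio_rhp = cash_vs_rhp / league_vs_rhp
career_ratio_lhp = cash_vs_lhp / league_vs_lhp

# 1961 ratios from the year-by-year tables
ratio_1961_rhp, ratio_1961_lhp = 1.62, 1.21

# How far 1961 was above his 1960-73 norm on each side
excess_rhp = ratio_1961_rhp / career_ratio_rhp
excess_lhp = ratio_1961_lhp / career_ratio_lhp

print(f"vs. righties: 1960-73 ratio {career_ratio_rhp:.2f}, 1961 was {excess_rhp:.2f} times that")
print(f"vs. lefties:  1960-73 ratio {career_ratio_lhp:.2f}, 1961 was {excess_lhp:.2f} times that")
```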

I only looked at 1960-73 since he did not get many PAs in 1958, 1959 & 1974.

There can be some idiosyncratic things going on here. For example, 20 of his 162 PAs vs. lefties in 1961 were against Whitey Ford. He had just a .417 OPS vs. him that year.

Cash only had 52 career PAs against Ford. So it is possible that 1961 was an unusually tough year for him in terms of the quality of the lefties he faced. But to conclude that would require looking at each of his seasons to see who he faced. Also, in 1961, Ford seems to be the only good lefty that he faced fairly often.

Tuesday, March 19, 2024

Interesting new article by Bill James: The Competitive Advantage of the Pitcher’s Park

It is in the latest issue of By the Numbers: The Newsletter of the SABR Statistical Analysis Committee, edited by Phil Birnbaum.

Click here to read it.

Here is a synopsis from Phil:

"Bill James finds that teams who play in pitcher's parks have had better records, historically, than teams who play in hitter's parks. He presents the data showing the effect, and then offers a suggestion for why this may be happening."

Also in the issue, Charlie Pavitt reviews several recent studies from the academic literature.

Sunday, March 3, 2024

Factors that might influence the difference between ERA and FIP

My last post mentioned that Aaron Bummer had a 6.79 ERA last year while his FIP ERA was 3.58 for a differential of -3.21 (FIP - ERA). That was the largest absolute differential last year for any pitcher with 50+ IP and it was 0.64 larger than the next largest.

So to look at what might explain why ERA differs from FIP (fielding independent ERA estimated using SOs, BBs and HRs), I ran a regression with FIP - ERA as the dependent variable and the following three independent variables:

SLG Diff (a pitcher's SLG allowed with runners on base minus the SLG they allowed with no runners on)

BAbip (the batting average a pitcher allows on balls in play, so it is dependent on how good his fielders are)

BQS/9 (Bequeathed runners that scored per 9 IP).

Bequeathed runners represents the number of runners left on base by a pitcher when that pitcher leaves the game. Any bequeathed runner who scores an earned run after a pitcher has left the game will be counted against that pitcher's ERA (from mlb.com https://www.mlb.com/glossary/advanced-stats/bequeathed-runners).

I used SLG Diff because some pitchers might have gotten hit pretty hard when they had runners on base, making their ERA higher than what we might otherwise expect based on their overall numbers.

I used BAbip because this is not controlled very much by the pitcher. A guy can have a low FIP but if his fielders can't catch the ball, his ERA will be high.

I used BQS/9 because a pitcher cannot control what happens after he leaves the game. Some pitchers get lucky and their bullpen bails them out. For others, it is the opposite.

I looked at all the guys who had 100+ IP last year. All data came from Baseball Reference and Stathead. There were 127 pitchers.

Here is the regression equation:

FIPERADIFF = .789*BQS/9 + 13.63*BAbip + 2.5*SLGDiff - 4.32
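To get a feel for the size of each coefficient, here is a Python sketch that plugs numbers into the equation exactly as written. The input values are hypothetical, not any real pitcher's line. One caution: all three coefficients are positive, which reads more like a differential that rises when ERA runs above FIP, so the sign convention should be treated as an assumption here.

```python
def predicted_diff(bqs_per_9, babip, slg_diff):
    """Estimated differential from the regression equation as printed in the post."""
    return 0.789 * bqs_per_9 + 13.63 * babip + 2.5 * slg_diff - 4.32


# Hypothetical pitcher (made-up inputs for illustration only)
example = predicted_diff(bqs_per_9=0.3, babip=0.295, slg_diff=0.02)
print(f"Estimated differential: {example:.3f}")

# Effect of 10 extra points of BAbip (.295 -> .305), holding the rest fixed
babip_effect = predicted_diff(0.3, 0.305, 0.02) - predicted_diff(0.3, 0.295, 0.02)
print(f"10 points of BAbip moves the estimate by {babip_effect:.3f}")
```

So with this equation, 10 points of BAbip is worth about .14 of differential, which is a decent chunk given that the average absolute differential was about .44.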

r-squared = .639, so 63.9% of the variance in the dependent variable is explained by the model.

standard error = .33

Here are the t-values for the three independent variables:

BQS/9) 6.7 (The p-value is < .00001)
BAbip) 11.6 (The p-value is < .00001)
SLG Diff) 5.04 (The p-value is < .00001)

The r-squared seems fairly high but it still means that 36.1% of the variation in FIPERADIFF is not explained.

The standard error seems high. I wish it was lower. The average absolute differential was about .44.

The t-values are all pretty high so each independent variable is significant. I used a website that converts t-values into p-values.
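The t-to-p conversion can also be done without a website. The sketch below uses only the Python standard library and integrates the Student's t density numerically. It assumes a two-sided test and 123 degrees of freedom (127 pitchers minus 3 variables minus the intercept), which is my assumption about the setup and not something the website reported.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=20000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; `steps` must be even."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


df = 127 - 3 - 1  # assumed residual degrees of freedom
for name, t in [("BQS/9", 6.7), ("BAbip", 11.6), ("SLG Diff", 5.04)]:
    print(f"{name}: t = {t}, p = {two_sided_p(t, df):.2e}")
```

For very large t-values the result can bottom out at 0 because of floating-point limits, which is fine for checking that p is below .00001.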

There may be some other variable that I should include. Maybe I could find the estimated FIPERADIFF for each guy and look at the 10 or so guys with the biggest differences between the estimated value and the actual (FIP - ERA). Maybe something that would be obvious to include would pop up.
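That idea might look something like this in Python. The three rows are hypothetical placeholders (made-up numbers, not real pitchers); the real version would use all 127 pitchers from the Stathead export.

```python
def predicted_diff(bqs_per_9, babip, slg_diff):
    """Estimated differential from the regression equation in the post."""
    return 0.789 * bqs_per_9 + 13.63 * babip + 2.5 * slg_diff - 4.32


# Hypothetical rows: (name, BQS/9, BAbip, SLG Diff, actual differential)
rows = [
    ("Pitcher A", 0.10, 0.280, -0.030, -0.20),
    ("Pitcher B", 0.45, 0.310, 0.050, 1.10),
    ("Pitcher C", 0.25, 0.300, 0.010, -0.60),
]


def biggest_misses(rows, top_n=10):
    """Return (name, actual minus estimated), largest absolute miss first."""
    scored = []
    for name, bqs, babip, slg, actual in rows:
        estimate = predicted_diff(bqs, babip, slg)
        scored.append((name, round(actual - estimate, 2)))
    return sorted(scored, key=lambda r: abs(r[1]), reverse=True)[:top_n]


for name, miss in biggest_misses(rows):
    print(f"{name}: actual minus estimated = {miss:+.2f}")
```

The pitchers at the top of that list would be the ones to look at by hand for a missing variable.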

Monday, February 19, 2024

Which pitchers had the biggest differences last year between their actual ERA & their FIP ERA? (or Bad Luck is an Aaron Bummer)

FIP ERA means fielding independent ERA and it is an estimated ERA based on what the pitcher controls: walks, strikeouts and HRs. See Baseball By The Numbers—Earned Run Average (ERA) and Fielding Independent Pitching (FIP) by Marilyn Green at "Redbird Rants" for more information and the formula.
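For reference, here is a Python sketch of the commonly published version of the FIP formula, which also counts hit batters along with walks. The stat line and the constant of 3.2 are hypothetical; the actual constant is set each season so that league FIP matches league ERA. The helper also converts the baseball way of writing innings (58.1 means 58 and 1/3) into real innings.

```python
def ip_to_innings(ip):
    """Convert baseball notation ("58.1" = 58 1/3, "56.2" = 56 2/3) to innings."""
    whole, _, frac = str(ip).partition(".")
    outs = int(frac) if frac else 0
    if outs not in (0, 1, 2):
        raise ValueError(f"not valid innings notation: {ip}")
    return int(whole) + outs / 3


def fip(hr, bb, hbp, so, ip, constant=3.2):
    """Common FIP formula; `constant` is an assumed league constant, not 2023's."""
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip_to_innings(ip) + constant


# Hypothetical reliever line (made-up numbers for illustration)
print(f"{ip_to_innings('58.1'):.3f} innings")
print(f"FIP = {fip(hr=5, bb=20, hbp=2, so=60, ip='58.1'):.2f}")
```

Passing the innings as a string avoids any floating-point trouble with values like 58.1.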

I used Stathead from Baseball Reference to call up all the pitchers who had 50+ IP last year. Then I found the difference between their actual ERA and their FIP ERA.

Table 1 shows the guys who had the worst luck, that is, they had the biggest negative differentials for FIP ERA - ERA.

Table 1

Player	IP	ERA	FIP	Diff
Aaron Bummer	58.1	6.79	3.58	-3.21
Shintaro Fujinami	79	7.18	4.61	-2.57
Zach Davies	82.1	7.00	4.58	-2.42
Hogan Harris	63	7.14	5.02	-2.12
Fernando Cruz	66	4.91	2.83	-2.08
Dylan Floro	56.2	4.76	2.96	-1.80
Michael Grove	69	6.13	4.36	-1.77
Connor Seabold	87.1	7.52	5.75	-1.77
Osvaldo Bido	50.2	5.86	4.10	-1.76
Josh Sborz	52.1	5.50	3.75	-1.75

Table 2 shows the guys who had the best luck, that is, they had the biggest positive differentials for FIP ERA - ERA.

Table 2

Player	IP	ERA	FIP	Diff
Wandy Peralta	54	2.83	5.05	2.22
Héctor Neris	68.1	1.71	3.83	2.12
Tom Cosgrove	51.1	1.75	3.70	1.95
Brusdar Graterol	67.1	1.20	3.03	1.83
Kendall Graveman	66.1	3.12	4.88	1.76
Dominic Leone	54	4.67	6.29	1.62
Clayton Kershaw	131.2	2.46	4.03	1.57
Wade Miley	120.1	3.14	4.69	1.55
Bryse Wilson	76.2	2.58	4.13	1.55
Ronel Blanco	52	4.50	5.99	1.49
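The Diff column is just FIP minus ERA, sorted. A short Python sketch with a handful of rows copied from Tables 1 and 2 (not the full 50+ IP list) shows the calculation:

```python
# (player, ERA, FIP) copied from Tables 1 and 2
pitchers = [
    ("Aaron Bummer", 6.79, 3.58),
    ("Shintaro Fujinami", 7.18, 4.61),
    ("Wandy Peralta", 2.83, 5.05),
    ("Héctor Neris", 1.71, 3.83),
    ("Clayton Kershaw", 2.46, 4.03),
]


def fip_minus_era(era, fip):
    """Differential as used in the tables: negative means ERA was worse than FIP."""
    return round(fip - era, 2)


# Most negative (worst luck) first, most positive (best luck) last
ranked = sorted(
    ((name, fip_minus_era(era, fip)) for name, era, fip in pitchers),
    key=lambda row: row[1],
)

for name, diff in ranked:
    print(f"{name}: {diff:+.2f}")

# Gap between the two biggest negative differentials
gap = round(abs(ranked[0][1]) - abs(ranked[1][1]), 2)
print("Gap between the top two:", gap)
```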

Bummer's differential is much greater than anyone else's. I thought that maybe he got hit hard with runners on base, but his splits don't reflect that. Did the White Sox have bad fielding? Maybe. Their team FIP was 4.71 while the team ERA was 4.87. Bad, but not too bad. Certainly not close to what happened to Bummer. And if it was the fielders, we would see high numbers for his AVG & SLG allowed with runners on base, but again, that was not the case (that would not be the whole story but at least part of it). His splits can be found easily at BR.

Maybe the guys who came in after he had put some runners on just did really badly so that he got charged for the runs. I am not sure if there is an easy way to tell that.

Others have looked at the issue of what explains the difference between FIP and ERA before. I might post some of those links when I get a chance.

Update Feb. 22: Here are Bummer's AVG-OBP-SLG for his splits last year:

None on) .240-.365-.365

Runners on) .233-.342-.333

Looks like he had good numbers with runners on. He did allow .340 BAbip while his AVG allowed in all situations was .236, a .104 difference. For all of the AL last year those numbers were .295/.245, a .050 difference. For the NL they were .299/.251, a .048 difference. So Bummer's differential here was about twice the MLB average. That might partly explain why his FIP/ERA differential is so large.
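The BAbip minus AVG comparison works out like this in Python. Averaging the AL and NL gaps with equal weight is my simplification for an MLB figure, not an official number.

```python
# BAbip and AVG allowed, as quoted above
bummer_gap = 0.340 - 0.236
al_gap = 0.295 - 0.245
nl_gap = 0.299 - 0.251

# Simple unweighted average of the two leagues (an approximation of the MLB gap)
mlb_gap = (al_gap + nl_gap) / 2

ratio = bummer_gap / mlb_gap
print(f"Bummer gap: {bummer_gap:.3f}")
print(f"MLB gap (approx.): {mlb_gap:.3f}")
print(f"Bummer's gap was {ratio:.1f} times the MLB gap")
```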

Here are some studies on this issue:

The FIP/ERA Gap: Historically by Glenn DuPaul of Beyond the Box Score.

Evaluating the Gap Between ERA and FIP by Christopher Rinaldi of FanGraphs.