He might have a point, since each of these year-to-year differences was greater than the 1.1 calculated in the last newsletter.
He then went on a “fishing expedition” through the last 10 years of data (was his choice of 10 years arbitrary?). For those of you who are old enough: do you remember the “Excedrin headache #[x]” commercials? I think reading the following qualifies: a “masterful” exercise in explaining probable common cause as special while at times calling it common, then drawing a special cause conclusion (see what I mean?):
- “While [the Mariners’ inconsistency] is an extreme snapshot, it’s far from isolated. Over the last 10 years [DB: 30 teams x 9 year-to-year differences = 270 ranges], there are 30 instances of teams whose relief ERAs changed by at least one run [DB: ERA: lower = better] – with 16 of them representing improvements by at least a run and 14 representing declines of at least one run [DB: half went up, half went down. Sounds average to me – as well as common cause (< 1.1)]. On average, teams saw their bullpen ERA change by 0.52 runs on a season-to-season basis over the last 10 years – meaning that a ‘normal’ ERA adjustment [from 4.24 to 3.71] could give the Sox at least an average bullpen [DB: Huh?], and with the possibility that it would be far from an outlier for the team to improve by, say, a full run, which would in turn suggest a bullpen that had gone from a weakness to a strength.” [DB: I’m reaching for the Excedrin]
Actually, Seattle is isolated -- as you will see, it was the only bullpen with more than one special cause.
"...at least an average bullpen" : the Red Sox already have an average
bullpen.
“...far from an outlier” : He’s right. Changing by a run would be common cause, i.e., not an outlier. But look at his conclusion: he implies that nothing could in essence change, yet it would now be a strength (special cause conclusion)?
If he can fish, I can fish – but more carefully and statistically. I didn’t want to rely solely on the differences between 2014 and 2015. Since he initially looked at the ERA data for 2012 to 2015, I decided to start there.
Optional technicalities. I did a quick and dirty 2-way analysis of variance (ANOVA) to see whether there were any differences by year and/or league, and there weren’t.
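For anyone who wants to try this check themselves, here is a minimal sketch in Python using statsmodels. The file name, table layout, and column names (team, year, league, era) are my assumptions for illustration, not the original worksheet; only the model itself -- ERA against year and league -- comes from the text.

```python
# Minimal sketch of the "quick and dirty" two-way ANOVA by year and league.
# Assumes a long-format file with one bullpen ERA per team per season
# (30 teams x 4 seasons = 120 rows); names are illustrative.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("bullpen_era_2012_2015.csv")  # columns: team, year, league, era

# Additive two-way model: do mean ERAs differ by year and/or league?
model = smf.ols("era ~ C(year) + C(league)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # large p-values: no evidence of differences
```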
Bottom line. I can look at the three year-to-year differences for each team (2012 to 2013, 2013 to 2014, 2014 to 2015), which gives me 30 teams x 3 differences = 90 ranges to work with.
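To make those 90 ranges concrete, here is how they could be built in pandas from the same (assumed) long-format file; the resulting `ranges` series is reused in the sketches below.

```python
# Sketch: build the 90 year-to-year ranges (absolute differences);
# file and column names are the same assumptions as in the ANOVA sketch.
import pandas as pd

df = pd.read_csv("bullpen_era_2012_2015.csv")            # team, year, league, era
wide = df.pivot(index="team", columns="year", values="era")
ranges = wide.diff(axis=1).abs().iloc[:, 1:].stack()     # 30 teams x 3 = 90 ranges
print(len(ranges), ranges.mean())                        # R_avg ~ 0.50
```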
Taking the average of the 90 moving ranges of two consecutive years, R avg ~ 0.50. Note how close this 4-year average is to his 10-year average of 0.52! (Consistent inconsistency?)
What he didn’t realize was that this average range by itself isn’t very useful. It needs to be converted to the maximum difference between two consecutive years that is due just to common cause: R max = 3.268 (from theory -- for use with an average of moving ranges of two) x 0.50 ~ 1.6. Two ranges were much higher than that: Seattle, 2013 to 2014 (-1.99), and Oakland, 2014 to 2015 (+1.72). These need to be taken into account to get a more accurate answer.
[For those of you not interested in the ensuing -- and what could be perceived at times as gory -- details, skip to the Bottom Line below]
Optional technicalities. It is standard practice to omit the special cause ranges and recalculate, repeating until all of the remaining ranges are within common cause. The two flagged ranges:
Team      2012   2013   2014           2015
Seattle   3.39   4.58   2.59 (-1.99)   4.15
Oakland   2.94   3.22   2.91           4.63 (+1.72)
1. Eliminating these two, R avg now equals 0.465 and R max = 1.52, which then flags:
Team      2012   2013   2014           2015
Seattle   3.39   4.58   2.59 (-1.99)   4.15 (+1.56)
Houston   4.46   4.92   4.80           3.27 (-1.53)
Note the similar pattern to the last newsletter, when I used just the 2014/2015 data: Oakland, Seattle, and Houston were flagged on their 2014/2015 differences.
2. Eliminating these, R avg now equals 0.4398 and R max = 1.44, which then flags:

Team        2012   2013           2014   2015
Milwaukee   4.66   3.19 (-1.47)   3.62   3.40
3. Eliminating Milwaukee, R avg now equals 0.4276 and R max = 1.40
Largest remaining:

Team      2012   2013   2014   2015
Atlanta   2.76   2.46   3.31   4.69 (+1.38)
According to this analysis, 1.38 is not a special cause; but a deeper, subsequent confirmatory analysis using ANOVA left little doubt that the 4.69 was a special cause (just as in the last newsletter).
So, omitting this range, I get a final R max of 1.36.
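If you would rather let a computer do the eliminate-and-recalculate grind, the whole procedure is a short loop. Here is a sketch under the assumption that `ranges` holds the 90 absolute year-to-year differences (as built above); note that the pure loop stops at R max = 1.40, and the final ANOVA-assisted removal of Atlanta’s 1.38 remains a judgment call outside it.

```python
# Sketch of the eliminate-and-recalculate procedure on the moving ranges.
# `ranges` is assumed to hold the 90 absolute year-to-year ERA differences.
D4 = 3.268  # control-chart constant for moving ranges of two

def common_cause_max(ranges):
    """Omit special cause ranges and recalculate until every remaining
    range is below R_max = D4 * R_avg; return the final R_max."""
    remaining = list(ranges)
    while True:
        r_avg = sum(remaining) / len(remaining)
        r_max = D4 * r_avg
        keep = [r for r in remaining if r <= r_max]
        if len(keep) == len(remaining):  # nothing flagged: done
            return r_max
        remaining = keep                 # omit flagged ranges and recalculate
```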
Two anomalies of the last newsletter:
- Given the 1.36, San Diego’s 2014 to 2015 difference of 1.29 seems to have been common cause.
- The previous R max of 1.1, based on the 2014/2015 data alone, was probably low and quite variable as an estimate because it used only 25 ranges. This 2012 to 2015 analysis ends up using 84 ranges, which makes it more reliable and accurate.
-- Neat trick to avoid all this eliminating and recalculating: one can alternatively use the median range of the original 90 differences at the outset as a very good initial estimate of what constitutes an outlier. In this case, R med = 0.375 and, from this, R max = 0.375 x 3.865 (from theory -- for use with a median of moving ranges of two) ~ 1.45, which is very close to the final answer using the average range with successive eliminations. This is oftentimes “one stop shopping.” Applying the median range to the final data with the six outliers eliminated matched the R avg result.
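In code, the median shortcut is a one-liner; the 3.865 constant is the theoretical scaling for the median of moving ranges of two, as above. A minimal sketch, again assuming `ranges` holds the 90 differences:

```python
# One-stop shopping: the median range resists outliers, so no
# elimination loop is needed to get a usable R_max.
import statistics

def r_max_from_median(ranges, constant=3.865):
    """R_max from the median moving range (constant for ranges of two)."""
    return constant * statistics.median(ranges)

# On the original 90 ranges: ~3.865 x 0.375, close to the ~1.4 final answer.
```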
-- Using the BoxPlot analysis on the original 90 actual differences, any range greater than ~1.5 is a special cause.
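The boxplot check is equally short. The newsletter doesn’t spell out which fence rule was used, so the standard Tukey fence (Q3 + 1.5 x IQR) below is my assumption:

```python
# Boxplot (Tukey fence) check on the 90 ranges: flag anything
# above Q3 + 1.5 * IQR as a potential special cause.
import statistics

def upper_fence(ranges):
    q1, _, q3 = statistics.quantiles(ranges, n=4)  # quartile cut points
    return q3 + 1.5 * (q3 - q1)

# For the 2012-2015 differences this fence lands near 1.5.
```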
Bottom line. My preference for using several analyses simultaneously to seek convergence paid off: three different simple approaches (with a little help from ANOVA) yield a very similar conclusion:

Two consecutive years’ bullpen ERAs can differ by ~1.4 due just to common cause.