From Davis Balestracci -- People LOVE to Rank Things

Published: Mon, 03/15/10




What Part of "You have NO choice" Doesn't Management Understand?
[~700 words:  Take 3-5 minutes to read it over a break or lunch. For those of you trying to understand the statistics involved (I get a bit technical, which most of you can ignore), it will probably take a little longer.]
 
Please also note that this e-mail contains a Word document attachment.
 
Hi, Folks,
In keeping with my theme that executives and management MUST understand variation, this and the next two newsletters will deal with some very basic concepts, a couple of techniques...and a surprising reaction to this "funny new way" of looking at things. I get a bit "statistical," but it's not so much understanding the technique as understanding the resulting analysis.  However, many of you will see the need for this type of analysis and should be ready if an appropriate opportunity pops up in everyday work.
 
A comparison method I've encountered quite often is to rank groups on several factors, then use the sum of their respective ranks to come to conclusions...after presenting the bar graph, of course.

 

Data from a Don Berwick Talk (2006)
The (real) data below are the sums of rankings for 10 aspects of 21 counties' healthcare systems (Lower sums are better.  Minimum = 10, Maximum = 210, Average = 10 x 11 = 110, where 11 is the average of the ranks 1 through 21).
 
                              Rank Sum    County
                                     42            1
                                     76            2
                                     84            3
                                     87            4
                                     92            5
                                     99            6
                                   101            7
                                   102            8
                                   105            9
                                   105          10
                                   107          11
                                   108          12
                                   112          13
                                   113          14
                                   114          15
                                   121          16
                                   128          17
                                   131          18
                                   145          19
                                   157          20
                                   181          21
          (And, yes, Berwick presented this as a bar graph -- See ATTACHMENT to this e-mail)
 
Statistical Technicalities -- But Bear with Me
There is a statistical technique known as the Friedman test where it is legitimate to perform an analysis of variance (ANOVA) using the combined individual sets of rankings (not shown) as the responses.  The results for this data are shown below
(10 "Item"s, 21 "County"s.  Note:  Because every item is ranked with the same set of ranks 1 through 21, the Sum of Squares (SS) for "Item" is always zero):
 
Analysis of Variance for Ranks
 
Source     DF        SS       MS      F       P
Item        9      0.00     0.00   0.00   1.000
County     20   1702.80    85.14   2.56   0.001
Error     180   5997.20    33.32
Total     209   7700.00
 
For statistical interpretation, Conover (reference below) uses the actual F-test above for "County" (F = 2.56, p = 0.001).
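
For readers who want to check the table, the ANOVA above can be rebuilt from the 21 rank sums alone.  The Python sketch below is only an illustration, not the output of any particular statistics package.  It assumes there are no tied ranks within an item (which is consistent with the Total SS of 7700), and all variable names are made up for this example.

```python
# Rank sums for the 21 counties, copied from the data table.
rank_sums = [42, 76, 84, 87, 92, 99, 101, 102, 105, 105, 107,
             108, 112, 113, 114, 121, 128, 131, 145, 157, 181]

k_items = 10                  # aspects ranked (the "Item" factor)
t_counties = len(rank_sums)   # 21 counties

# Assumption: no ties, so each item's ranks are exactly 1..T.
mean_rank = (t_counties + 1) / 2            # 11
center = k_items * mean_rank                # 110, the average rank sum

# Total SS: every item contributes sum((r - 11)^2) over r = 1..21.
ss_total = k_items * sum((r - mean_rank) ** 2
                         for r in range(1, t_counties + 1))

# County SS: squared deviations of the rank sums from 110, divided by k.
ss_county = sum((s - center) ** 2 for s in rank_sums) / k_items

# Item SS is zero (every item has the same mean rank), so the rest is error.
ss_error = ss_total - ss_county

df_county = t_counties - 1                   # 20
df_error = (k_items - 1) * (t_counties - 1)  # 180

ms_county = ss_county / df_county
ms_error = ss_error / df_error
f_county = ms_county / ms_error

print(f"County  DF={df_county:3d}  SS={ss_county:8.2f}  MS={ms_county:6.2f}  F={f_county:.2f}")
print(f"Error   DF={df_error:3d}  SS={ss_error:8.2f}  MS={ms_error:6.2f}")
print(f"Total   DF={df_county + df_error + k_items - 1:3d}  SS={ss_total:8.2f}")
```

Run as written, this gives the same SS, MS, and F values for "County" and "Error" as the table.  The p-value is left out because it needs an F distribution, which the Python standard library does not provide.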

[Informational aside:  Conover claims this to be superior to the more commonly used Chi-square statistic given in most computer packages, which can be approximated from the ANOVA by multiplying the SS for "County" by the factor "12 / [T x (T+1)]," where T is the number of things being compared (in this case, 21 counties), with (T-1) degrees of freedom.
For this example, the result is a Chi-square of [12/(21 x 22)] x 1702.8 = 44.23 with 20 degrees of freedom (p = 0.0014)].
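
The arithmetic in this aside can be checked with a few lines of Python.  This is a minimal sketch with hypothetical names.  The p-value uses a closed-form shortcut that is valid only when the degrees of freedom are even (true here, with 20).  The last step uses the standard conversion between the Friedman Chi-square and the F statistic, F = (b-1) x Chi-square / [b x (T-1) - Chi-square] with b = 10 items; that formula is not in the newsletter and is included only as a cross-check.

```python
import math

ss_county = 1702.80   # SS for "County" from the ANOVA table
t = 21                # number of things compared (counties)
b = 10                # number of items ranked

# Chi-square = 12 / [T x (T+1)] x SS(County), with T-1 degrees of freedom.
chi_square = 12.0 / (t * (t + 1)) * ss_county
df = t - 1


def chi2_upper_tail_even_df(x, dof):
    """Upper-tail Chi-square probability; this shortcut requires even dof."""
    if dof % 2 != 0:
        raise ValueError("closed form shown here needs an even dof")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i          # builds half**i / i!
        total += term
    return math.exp(-half) * total


p_value = chi2_upper_tail_even_df(chi_square, df)

# Cross-check (assumed formula, see lead-in): recover F from the Chi-square.
f_from_chi = (b - 1) * chi_square / (b * (t - 1) - chi_square)

print(f"Chi-square = {chi_square:.2f} with {df} df, p = {p_value:.4f}")
print(f"F recovered from Chi-square = {f_from_chi:.2f}")
```

The recovered F should agree with the F for "County" in the ANOVA table, which is a handy sanity check that the two statistics are built from the same information.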
 
Technicalities such as this aside, there is little doubt that there is indeed a difference amongst counties.  Now...which ones?
 
Since the F-test is significant -- and only because it is significant -- one could calculate what is called the least significant difference (LSD) between any two summed scores (In this case, ~51) or the more conservative difference based on what is known as the interquartile range (In this case, ~91). Given these, the summed rankings can be "discussed" (with lots of "human" variation) [and please notice that I'm purposely not showing you how to calculate these], or, preferably (Hint, hint)...
 
...one can use Analysis of Means, resulting in the 2nd graph in the ATTACHED Word document [Overall p = 0.05 and p = 0.01 reference lines drawn in (See Ott (below)); alternatively, one could just use "three" standard deviations (a future newsletter), resulting in 110 +/- 55].
 
[All of these calculations are based on the standard deviation of the sum of the 10 rankings, which derives from the MSerror term from the ANOVA.  It equals, given "k" items summed:
 
Square root(k x MSerror) = Square root(10 x 33.32) = 18.25
(Aren't you glad you asked?)]
 
Given this graph (attachment), the statistical interpretation would be that there is one outstanding county (#1), one county indeed "below" average in performance (#21), and the other 19 counties are, based on this data...indistinguishable!
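
The "three" standard deviations shortcut mentioned above is easy to check.  The Python sketch below is an illustration that uses the simple 110 +/- 3 x 18.25 limits, not the Analysis of Means reference lines drawn in the attachment, and its variable names are hypothetical.

```python
import math

# Rank sums keyed by county number, copied from the data table.
values = [42, 76, 84, 87, 92, 99, 101, 102, 105, 105, 107,
          108, 112, 113, 114, 121, 128, 131, 145, 157, 181]
rank_sums = {county: s for county, s in enumerate(values, start=1)}

k_items = 10       # rankings summed per county
ms_error = 33.32   # MSerror from the ANOVA table
center = 110       # average rank sum (10 x 11)

# Standard deviation of a sum of k rankings: Square root(k x MSerror).
sd_sum = math.sqrt(k_items * ms_error)

# Simple "three standard deviation" limits around the average.
lower = center - 3 * sd_sum
upper = center + 3 * sd_sum

# Lower sums are better, so "below the lower limit" means outstanding.
outstanding = [c for c, s in rank_sums.items() if s < lower]
below_average = [c for c, s in rank_sums.items() if s > upper]
indistinguishable = [c for c, s in rank_sums.items() if lower <= s <= upper]

print(f"SD of a rank sum = {sd_sum:.2f}")
print(f"Limits: {lower:.1f} to {upper:.1f}")
print("Outstanding counties:", outstanding)
print("Below-average counties:", below_average)
print("Counties inside the limits:", len(indistinguishable))
```

With these limits only County 1 and County 21 fall outside, and the other 19 sit inside, which matches the interpretation above.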

Previous discussions of this data involved a lot of talk about "quartiles" and "above" and "below" average;  however, when I presented this analysis to the involved executives as an alternative, it was met with...a stunned silence...and defensiveness, which I will address next time in a more "philosophical" newsletter -- once again motivating the need for having NO choice but to look at data this way.
 
Until then...
 
Kind regards,
Davis
 
Conover WJ.  Practical Nonparametric Statistics, 3rd edition.  John Wiley & Sons, 1998.
 
Ott ER, Schilling EG, and Neubauer DV.  Process Quality Control:  Troubleshooting and Interpretation of Data, 4th edition.  ASQ Quality Press, 2005.
 
========================================================
P.S.  For those of you wanting some pointers on "deeper" statistical analysis
========================================================
This example is discussed thoroughly in Appendix 8A of my book, Data Sanity:  A Quantum Leap to Unprecedented Results.  In that Appendix, I also demonstrate other deeper analyses...that are not for the fainthearted.  But, that's OK:  The rest of the book (and my general approach) espouses using simple graphs and trusting your intuition rather than "turn the crank" (mysterious) statistical analyses done by packages!
 
You can order Data Sanity:  A Quantum Leap to Unprecedented Results through the link below (Immediate shipping): 
 
http://www5.mgma.com/ecom/Default.aspx?action=INVProductDetails&args=3785&tabid=138
 
[If you have any problems with this link, please contact either me (davis@dbharmony.com) or my publishing contact, Marilee Aust (maust@mgma.com) -- She's wonderful to deal with!  Some foreign subscribers have had trouble entering their information, so Marilee told me to tell you that she is delighted to help anyone on an individual basis.  She has already dispatched copies to Seattle, Fargo, England, Wales, Australia, and New Zealand.]
 
It is also now available through Amazon.com (with free shipping!).
 
(Thanks, Dr. Steve Tarzynski, Dean Spitzer, and Adam Lennox for your very kind reviews).
 