From Davis Balestracci – What Part of "You have NO choice!" don't Leaders and Management "Get?"

Published: Mon, 03/13/17

From Davis Balestracci – Understanding Variation: What Part of "You have NO choice!" don't Leaders and Management "Get?"

A very important lesson worth 6 to 7 minutes of your time 

Can we please stop the obsession with rankings and percentiles?

Hi, Folks,
Don't tell me you're not tempted to look when you spot a magazine cover saying "How does your state rank in [trendy topic du jour]?"

Many of these alleged analyses rank groups on several factors, then compare the sums of each group's ranks to draw conclusions.

For example, in 2006, I was at a presentation by someone considered a world leader in quality (WLQ) who has been singing Dr. Deming's praises since the late 1980s. He presented the following data as a bar graph from lowest score to highest.
It is the sum of rankings for 10 aspects of 21 counties in a small country's healthcare system (considered on the cutting edge of quality). Lower sums are better:  Minimum = 10, Maximum = 210, Average = 10 x 11 = 110.
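The minimum, maximum, and average quoted above follow directly from the counts (10 aspects, 21 counties). A minimal Python sketch of that arithmetic; the function name `rank_sum_bounds` is made up for illustration:

```python
def rank_sum_bounds(n_aspects, n_counties):
    """Return (minimum, maximum, average) of one county's summed ranks."""
    minimum = n_aspects * 1                      # ranked 1st on every aspect
    maximum = n_aspects * n_counties             # ranked last on every aspect
    average = n_aspects * (n_counties + 1) / 2   # mean of ranks 1..k is (k + 1) / 2
    return minimum, maximum, average

print(rank_sum_bounds(10, 21))   # (10, 210, 110.0)
```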

My antennae went up.  A bar graph?  With absolutely no context of variation for interpretation? And a literal interpretation of the rankings?  

What's wrong with this picture?

I wanted to use one of my favorite techniques, analysis of means (ANOM), to take a systems view of things. When looking at improvement opportunities, the mindset must change from comparisons of individual performances to comparison of individual performances relative to their inherent "system."

I wrote to him for the data and he graciously complied.

There is a statistical technique known as the Friedman test where it is legitimate to perform an analysis of variance (ANOVA) using the combined individual sets of rankings (not shown) as the responses. I won’t bore you with the details, but the p-value of this analysis was < 0.001 – little doubt that there is indeed a difference among counties.  Now...which ones?
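For readers who want a feel for the mechanics, here is a rough sketch of the classical chi-square form of the Friedman statistic. It is related to, but not necessarily identical with, the ANOVA-on-ranks calculation described above. The rank tables are SIMULATED, not the counties' real data, and the function names are hypothetical. With 21 counties the statistic is referred to a chi-square distribution with 20 degrees of freedom, where values above roughly 45 correspond to p < 0.001.

```python
import random

def rank_sums(rank_table):
    """Sum each county's ranks across aspects (rows = aspects, columns = counties)."""
    return [sum(col) for col in zip(*rank_table)]

def friedman_statistic(rank_table):
    """Classical Friedman chi-square statistic for an n-by-k table of untied ranks."""
    n = len(rank_table)       # number of aspects (blocks)
    k = len(rank_table[0])    # number of counties (treatments)
    sums = rank_sums(rank_table)
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in sums) - 3.0 * n * (k + 1)

rng = random.Random(1)
N_ASPECTS, N_COUNTIES = 10, 21

# SIMULATED: every aspect is an independent random ordering,
# i.e. the counties truly do not differ (common cause only).
null_table = [rng.sample(range(1, N_COUNTIES + 1), N_COUNTIES)
              for _ in range(N_ASPECTS)]

# SIMULATED opposite extreme: every aspect ranks the counties identically.
agree_table = [list(range(1, N_COUNTIES + 1)) for _ in range(N_ASPECTS)]

print("no real differences:", round(friedman_statistic(null_table), 1))
print("perfect agreement  :", round(friedman_statistic(agree_table), 1))
```

With perfect agreement the statistic reaches its ceiling of n(k - 1) = 10 x 20 = 200. A table of purely random orderings will typically land near the 20 degrees of freedom.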

From the ANOVA, one can calculate what is called the least significant difference (LSD) between any two summed scores due to common cause, which in this case was 51. Because there are 210 possible pairwise comparisons among 21 counties, one can also calculate a more conservative difference that takes this into account, and it shows that a difference as high as 91 could still be common cause.
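With 21 counties, the number of distinct county-versus-county comparisons can be counted directly. The sketch below also shows why the threshold has to get stricter as comparisons multiply. It ASSUMES a Bonferroni-style adjustment purely for illustration; the adjustment actually behind the figure of 91 is not stated here.

```python
import math

n_counties = 21
n_pairs = math.comb(n_counties, 2)   # distinct county-vs-county comparisons

# ASSUMPTION: a Bonferroni-style split of an overall 5% risk, shown only to
# illustrate why the per-comparison threshold must become stricter.
overall_alpha = 0.05
per_comparison_alpha = overall_alpha / n_pairs

print(n_pairs)                          # 210
print(round(per_comparison_alpha, 6))   # 0.000238
```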

Given the rankings and results of the two calculations above, suppose there was a subsequent meeting to discuss the rankings, possibly revise them, and then decide on how to take action. 

  • Given this information, do you think the "unknown or unknowable" effects of the variation in human perception of variation might affect the discussion and subsequent actions? 

  • Do you see the dangerous potential for treating common cause as special cause? 

  • Do they realize their actions could have serious consequences?

Oh, and those two calculated differences aren't worth very much.

The process-oriented approach:  consider these counties a "system" and use Analysis of Means (ANOM)

The ANOM results are below (overall p = 0.05 and p = 0.01 reference lines drawn in).  Note that the points are not connected and the horizontal axis order has no time element.  I chose to display them from smallest score to largest. 
Dr. Deming hated probability limits and would just use his mentor Walter Shewhart’s recommended limits of "three" standard deviations (as in the red bead experiment comparing workers). In this case, that gives 110 +/- 55 (55 to 165), which neither differs much from the ANOM limits nor changes the conclusions.
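A rough cross-check on the size of those limits, not the calculation used for the chart: ASSUME that with no real differences, a county's rank on any one aspect is equally likely to be anything from 1 to 21, independently across the 10 aspects. Theory then gives a standard deviation for the summed ranks that can be set beside the 110 +/- 55 quoted above.

```python
import math

n_aspects, n_counties = 10, 21

center = n_aspects * (n_counties + 1) / 2        # 110.0

# ASSUMPTION: one rank is uniform on 1..k, whose variance is (k^2 - 1) / 12,
# and the 10 aspects are ranked independently of one another.
var_one_rank = (n_counties ** 2 - 1) / 12
sd_sum = math.sqrt(n_aspects * var_one_rank)     # SD of a 10-aspect rank sum

lower, upper = center - 3 * sd_sum, center + 3 * sd_sum
print(round(sd_sum, 1), round(lower, 1), round(upper, 1))   # 19.1 52.6 167.4
```

Under these assumptions the three-standard-deviation band comes out at about 110 +/- 57, the same ballpark as the limits quoted above, which may well have been computed differently.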

The statistical interpretation:

  • There is one outstanding county (#1),

  • One county indeed "below" average in performance (#21),

  • The other 19 counties are, based on this data, indistinguishable! 

In The New Economics, Dr. Deming shows a similar chart and comments about the performance equivalent of counties 2 to 20: These cannot be ranked! 

I once analyzed a similar state ranking. There were two states truly above average, two below average – and states three through 48 were indistinguishable.

Discussions on data like this involve a lot of talk about quartiles, top or bottom 10 percent, and above- and below-average performances. Sound familiar?  (Healthcare folks:  Press-Ganey reports?)

(These special cause strategies are fruitless.  But, perhaps clusters resulting from using a common cause strategy of color coding by geography might be useful?)

I'll let you decide:  Did he "get it?"
When I shared this analysis with WLQ, I was shocked at his response. Our verbatim e-mail correspondence follows.

World Leader in Quality:  "A subtle issue you did not tackle is the political-managerial issue of communicating such insights to [the two special cause counties] and the counties that thought they were 'different,' but, statistically, aren't. I wonder what framework one could use to approach that psychological challenge."

Davis:  "As I say to my audiences, 'Hey...I'm just the statistician, Man!'

"I'm going to be very hard on you here, but I think the issue is how people and leaders like you are going to facilitate these difficult conversations...which will be profoundly different...and productive!  This is the leadership that quality gurus keep alluding to...and seems to be in very short supply.

"My job is to keep you all out of the 'data swamp'; however, I would be a very willing participant.  I have a saying, 'I'm the statistician, I know nothing.  You're the [leaders], you know too much.  That makes us a great team!'

"And I would love to pilot some of these types of analyses with you or other leaders.  We need to figure out what this process should be. This is potentially very exciting and could quantum leap the quality improvement movement.

"My point is that this 'language' needs to be a fundamental piece of any improvement process...and led by leaders who understand it and are now promoted into positions of leadership only if they understand it.  If this could become culturally inculcated, then the ongoing daily defensiveness reacting to data stops...PERIOD!

"The discussion will then focus, as it should, on process.

"I am seeing far too much concern about 'hurting people's feelings.' This would change that as well as result in having conversations leading to appropriate action.

"That's what I've been saying the last few years:  we need new conversations...and this could be a key catalyst."

WLQ:  "Nope. I don't buy it. Yes, I am a leader and need to carry the message.  But I know you too well to let you off the hook. I'd love to see you try to lead these conversations and experiment with approaches. You're a leader, too."

Davis:  "Give me an opportunity and I will do my best to lead that conversation (and feel that we could begin by co-facilitating it). Have you fathomed the potential of this?"

That last e-mail has never been answered. Here it is, 10 years later, and several follow-up gentle e-mail reminders have been ignored. I'm still waiting for that promised exciting opportunity, but I've given up any hope.  And I've had no more luck persuading any other leader to give it a try.

At his insistence, I sent the analysis with explanation to the original executive group who collected and summarized the data.  No reply.

A serious consequence for healthcare
I've done several grand rounds for various groups of doctors. When I explain ANOM and "plot the dots," just about every audience has said, "This makes sense!  If data were presented this way, we would take care of it ourselves."

Doctors and hospitals especially are currently being victimized left and right with inappropriate analyses and rankings by statistical "hacks" (Dr. Deming's term). Many of these have major influence on reimbursement. One common criterion is to penalize anyone falling into the bottom quartile of performance!  Given a set of numbers, aren't 25 percent of them naturally in the bottom quartile?
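The bottom-quartile point can be shown with a few lines of Python. The scores below are SIMULATED for 40 hypothetical providers who all come from the very same process (same average, same spread); the numbers 80 and 5 are arbitrary.

```python
import random

def bottom_quartile(scores):
    """Return the indices of the lowest 25% of scores."""
    n_flagged = len(scores) // 4
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:n_flagged]

rng = random.Random(2017)

# SIMULATED: 40 hypothetical providers drawn from ONE common-cause process.
scores = [rng.gauss(80, 5) for _ in range(40)]

flagged = bottom_quartile(scores)
print(len(flagged), "of", len(scores), "providers penalized")   # 10 of 40 providers penalized
```

Whatever seed is used, 10 of the 40 are "penalized," even though by construction no provider differs from any other.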

I've even seen criteria using just one standard deviation (usually calculated incorrectly) to find "those outliers."  In the data above, the resulting range (~92 to 128) would add three additional counties each to the already declared "above" and "below" average counties.
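The ~92 to 128 range is simply one third of the three-standard-deviation half-width of 55 quoted earlier. The sketch below reproduces it and, ASSUMING the summed ranks are roughly normal under common cause alone, estimates how many of 21 perfectly ordinary counties a one-standard-deviation rule would flag anyway.

```python
import math

center = 110.0
three_sigma = 55.0                 # half-width quoted earlier in the newsletter
sigma = three_sigma / 3            # about 18.3

lower, upper = center - sigma, center + sigma
print(round(lower), round(upper))                # 92 128

# ASSUMPTION: rank sums are roughly normal when only common cause is present.
share_outside = 1 - math.erf(1 / math.sqrt(2))   # P(|Z| > 1)
print(round(share_outside, 3))                   # 0.317
print(round(21 * share_outside, 1))              # 6.7
```

On that assumption, nearly a third of the counties (six or seven of 21) would be labeled "outliers" by chance alone.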

Is it any wonder why physicians are so angry?

How much variation would be reduced if ANOM could be standardized as an analysis?  A side benefit: rather than focusing on just rank, the exposure of variation in performance could result in non-defensive conversations to reduce inappropriate and unintended variation.

[By the way, WLQ is a physician]

A BASIC Deming Principle Still Fiercely Resisted

Many of this example's statistical principles are what Dr. Deming demonstrated in his seminars (and, yes, the red bead experiment is an ANOM!).  After 30+ years of trying to teach similar things, I am still amazed at the abject cowardice (Yes, cowardice!) and resulting resistant bluster I see in (alleged) leaders abdicating their responsibility to comprehend the transforming, liberating power of a simple, basic understanding of variation.  Deming had zero tolerance for such ignorance (or is it arrogance?).

Is that too much to ask of someone making a six- or seven-figure salary whose actions affect the five-figure salary folks?  

Amusing note:  My own state of Maine had a panicked headline in the newspaper a couple of weeks ago:  "Maine's ranking drops from 13th to 17th" in something or other, and the explanations and excuses started flying. I wonder on whom the blame finally fell? 

In how many meetings does this nonsense go on, with its accompanying, staggering "unknown or unknowable" costs?

Quite frankly, many people who think they "get" Dr. Deming's message don't. To deeply understand the message and its power has taken me over 30 years...and I don't do red bead experiment demonstrations  (but WLQ still does...).

Kind regards,
This is one of 10 everyday examples in Chapter 2 of my book Data Sanity that use real data routinely encountered by leaders and managers to give an overview of the awesome power of a basic understanding of variation.

Chapter 7 thoroughly covers Analysis of Means and is one of the few available resources to do so.
Data Sanity: A Quantum Leap to Unprecedented Results is a unique synthesis of the sane use of data, culture change, and leadership principles to create a road map for excellence.

One of its major goals is to create a common organizational language for healthier dialogue about reducing ongoing confusion, conflict, complexity, and chaos.

  • Amazon offers free shipping within U.S.

  • NEW:  e-book format for all e-readers, including iBook, Nook and Kindle (includes downloadable .pdf)

  • GREAT NEWS for UK readers who want a hard copy.  It is now available on Amazon UK for £69 with free shipping.

My publisher has informed me that there will also be an option to print on demand in Europe, Canada and Australia. He has also lowered the price a bit in these countries. [Any problems or questions, please e-mail Craig Wiberg at:]

Click here or visit my LinkedIn profile for a copy of its Preface and chapter summaries.

Please know that I always have time for you and am never too busy to answer a question, discuss opportunities for a leadership or staff retreat, webinar, mentoring, or public speaking --  or just about any other reason!  Don't hesitate to e-mail or phone me.

Please visit my LinkedIn page to listen to a 10-minute podcast or watch a 10-minute video interview where I talk about data sanity.
