From Davis Balestracci -- Can you prove anything with statistics? Maybe...using PARC

Published: Mon, 11/09/15

"First, what an impressive class.  Your strategies will save countless hours of reporting on and explaining meaningless data and allow us to spend time zeroing in on what matters. Thanks so much!" -- all-day seminar participant (and MBA) at the MGMA national meeting on 10 October 2015.

[~990 words -- take 3 to 4 minutes to read over a break or lunch]

"It is impossible to tell how widespread data torturing is. Like other forms of torture it leaves no incriminating marks when done skillfully. And like other forms of torture, it may be difficult to prove even when there is incriminating evidence." -- JL Mills*

*October 14, 1993 edition of the New England Journal of Medicine

Hi, Folks,
When will academics, Six Sigma belts and consultants wake up and realize that, despite their best efforts, most people in their audiences will not correctly use the statistics they’ve been taught – including many of the aforementioned teachers?

Sometimes I wonder if they are exacting revenge on their captive audiences for being beaten up on the playground 25 years ago.

The clinical publication world is a particular hotbed of inappropriate statistics.  Many people are guilty of looking for the most dramatic, positive findings in their data, and who can blame them? If study data are manipulated enough, they can be made to appear to prove whatever the investigator wants to prove.  When this process goes beyond reasonable interpretation of the facts, it becomes data torturing.

Two Types of Torture

1. Opportunistic – (a) poring over data not collected specifically for the current purpose until an alleged statistically significant association is found between variables and then (b) devising a plausible hypothesis to fit the association.

One can easily find significant results where none exist simply by making multiple comparisons.  Using the widely accepted p-value of 0.05 (i.e., willingness to take a 5 percent risk of declaring something is significant when it isn't), more comparisons mean more opportunities for random events to be declared significant due just to chance.  For two tests, the probability that at least one "significant" difference could be declared randomly is 10 percent (1 – (0.95 x 0.95)). For 20 tests, it is 64 percent (1 – 0.95^20).
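This arithmetic is easy to verify. A minimal Python sketch (the function name is mine, for illustration only):

```python
def family_wise_error_rate(k, alpha=0.05):
    """Probability that at least one of k independent tests is
    declared 'significant' purely by chance, each at level alpha."""
    return 1 - (1 - alpha) ** k

# Two tests: roughly a 10 percent chance of a spurious "finding"
print(family_wise_error_rate(2))   # ~0.10
# Twenty tests: roughly 64 percent
print(family_wise_error_rate(20))  # ~0.64
```

Note how quickly the risk grows: by 14 comparisons, a chance "discovery" is already more likely than not.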

If one is on a “fishing expedition” with such a data set – once again, I emphasize that this term applies only to data (usually a tabulation) that weren’t collected specifically for the current purpose – one should at least adjust the decision criterion so that the overall risk remains 5 percent.  The per-comparison significance threshold therefore depends on the number of possible comparisons.  There are several ways to do this; the simplest (the Bonferroni correction) divides 0.05 by the number of comparisons, so for two potential individual comparisons the threshold to declare significance should each be p < 0.025, and for 20 comparisons, p < 0.0025.
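Two standard versions of this adjustment are the Bonferroni correction (divide the overall alpha by the number of comparisons) and the Šidák correction (the exact version for independent comparisons). A short sketch, with function names of my own choosing:

```python
def bonferroni_threshold(k, alpha=0.05):
    """Per-comparison p-value threshold that keeps the overall
    (family-wise) risk at alpha across k comparisons."""
    return alpha / k

def sidak_threshold(k, alpha=0.05):
    """Exact per-comparison threshold for k independent comparisons:
    solves 1 - (1 - t)**k = alpha for t."""
    return 1 - (1 - alpha) ** (1 / k)

print(bonferroni_threshold(2))        # 0.025
print(bonferroni_threshold(20))       # ~0.0025
print(round(sidak_threshold(20), 5))  # ~0.00256
```

The two agree closely for small alpha; Bonferroni is slightly conservative, which is usually acceptable insurance on a fishing expedition.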

Further, if the fishing expedition catches a boot, the fishermen should throw it back and not claim that they were fishing for boots. The honest investigator will limit the study to focused questions, all of which make sense in the given context -- which can then be subsequently tested with an appropriately designed study.  The data torturer will act as if every positive result confirmed a major hypothesis.

Unfortunately, when this type of data torturing is done well, it may be impossible for readers to tell that the positive association did not spring from an a priori hypothesis.

2. Procrustean - deciding on the hypothesis to be proved, then making the data fit the hypothesis.

This requires selective reporting; one of the most common forms is the intentional suppression of contradictory data.  It is more difficult to carry out than opportunistic data torturing, but its results are often more believable if one starts with a popular hypothesis that appears to have been “proven.”

One should suspect data torturing whenever subjects are dropped without a clear reason, or when a large proportion of subjects are excluded for any reason. Ask:  Is the rationale for the subgroup analyses convincing?

In the case of medicine, remember that two sexes, multiple age groups, and different clinical features such as stages of disease make it possible for the investigators to examine the data in many different ways.

If a drug is reported as working only in women over 60 years of age, the savvy reader should at least suspect a chance finding.

Do a PARC Analysis and you get...

The delightful applied-science statistician J. Stuart Hunter invented the term PARC to characterize a lot of what is being taught and practiced: “practical accumulated records compilation” on which one does a “passive analysis by regressions and correlations” and, to get it published, one must now do the “planning after the research is already completed.” With the current plethora of friendly computer packages that are designed to "delight" their customers, I have also coined the characterization “profound analysis relying on computers.”

Here is an enlightening quote by Walter A. Shewhart from his classic book Economic Control of Quality of Manufactured Product:

“You go to your tailor for a suit of clothes and the first thing that he does is to make some measurements; you go to your physician because you are ill and the first thing he does is to make some measurements. The objects of making measurements in these two cases are different. They typify the two general objects of making measurements. They are:
  • "To obtain quantitative information" [only]
  • "To obtain a causal explanation of observed phenomena.”

These are two entirely different purposes. For example, when I’m being fitted for a suit, I don’t expect my tailor to take my waist measurement, then ask, “I need to know whether your mother has or had Type II diabetes.” The tailor doesn’t care about the genetic process that produced my body—he or she just measures it (once), then makes my suit.

I vividly remember a newspaper article that appeared when I lived in Minnesota more than 20 years ago:  “Whites May Sway TV Ratings.” It read:

“… [An] associate professor and Chicago-based economist reviewed TV ratings of 259 basketball games...They attempted to factor out all other variables such as the win-loss records of teams and the times games were aired  [DB emphasis]…. The economists concluded that every additional 10 minutes of playing time by a white player increases a team’s local ratings by, on average, 5,800 homes.”

Hence, Minnesotans are bigots!  What do you think?

Isn't the objective of TV ratings solely to find out how many people watched a particular show (i.e., “making a suit”)…period?  Is the data collecting agency trying to determine racial viewing patterns during basketball games (i.e., causal explanation)?   Hardly!

When “data for a suit” (most tabulated statistics) are used to make a causal inference, that’s asking for trouble.   This is why a lot of published research is, in essence, PARC spelled backwards -- which was Hunter’s ultimate point.  People are doing PARC analyses on data that are the "continuous recording of administrative procedures."

Speaking of data torturing, when are teachers of statistics going to stop torturing their students as well?

Until next time…

Kind regards,

P.S. Chapters 2, 6 and 7 of Data Sanity demonstrate elegantly simple, insightful alternatives to data torturing.
Feedback like that obtained at the beginning of today's newsletter has me firmly convinced that a one- to two-day leadership retreat with safe dialogue using the content of Chapters 1 to 4 & 9 is key to getting cultures "unstuck" in their quest for excellence.

Do you need a plenary speaker for an internal or professional conference, or some mentoring to help you "quantum leap" to a new level of eye-opening effectiveness?

As always, I welcome contact from my readers with comments or to answer any questions.

Was this forwarded to you?  Would you like to sign up?
If so, please visit my web site -- -- and fill out the box in the left margin on the home page, then click on the link in the confirmation e-mail you will immediately receive.

Want a concise summary of Data Sanity in my own words?
Listen to my 10-minute podcast. Go to the bottom left of this page.