From Davis Balestracci -- "90 percent of design of experiments is half planning." *

Published: Mon, 04/11/16

* Based on Yogi Berra's famous: "Baseball is 90 percent mental. The other half is physical."

[~ 1300 words but a breezy read. Take 5-1/2 to 6-1/2 minutes to read over a break or lunch.]


Don't just teach people statistics. 
Teach them to solve their problems.

 
Hi, Folks,
Since my last newsletter mentioned design of experiments (DOE) as one of the few things worth salvaging from typical statistical training, I thought I’d talk a bit about it over the next couple of newsletters.

The discipline needed for a good design is similar to that needed for rapid-cycle PDSA

Doing a search on the current state of DOE in improvement education, I observed that curricula don't seem to have changed much in the last 10 years or so; they still favor factorial designs and/or orthogonal arrays as a panacea.

The main topics for many basic courses remain:

  • Full and fractional factorial designs
  • Screening designs
  • Residual analysis and normal probability plots
  • Hypothesis testing and analysis of variance (ANOVA)

The main topics for advanced DOE courses usually include:

  • Taguchi signal-to-noise ratio
  • Taguchi approach to experimental design
  • Response-surface designs
  • Hill climbing
  • Mixture designs

No doubt these are all very interesting. But what is the 20 percent of this material that will solve 80 percent of people's problems? Some of the topics above are very specialized and rarely used, and they can be understood only after people have gained a practical working knowledge of the more basic material by actually using it.

Many trainers also fall into the trap of thinking that hypothesis testing and ANOVA should be taught as separate topics.  A well-respected statistical colleague says it so well [my emphasis]:


"I get [questions about degrees of freedom (DOF)] all the time (ANOVA tables in particular seem to terrorize people)...but I wish people were asking better questions about the problem they're trying to understand/solve, the quality of the data they're collecting/crunching, and what on earth they're actually going to do with the results and their conclusions. In a well-meaning attempt not to turn away ANY statistical questions, my own painful attempts to explain DOF have only served to distract the people who are asking from what they really should be thinking about."

A basic knowledge of full and fractional factorial designs, screening designs, and their analysis/diagnostics is a good place to start. This knowledge, though necessary and useful when one is at a low state of knowledge about one's process, is not sufficient. It usually needs to be supplemented by some basic, extremely useful designs from response surface methodology (RSM).
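
If the vocabulary is unfamiliar, the core object is simple. Here is a minimal sketch in Python (my own illustration, not material from any of the courses above) of the eight runs of a 2^3 full factorial in coded units, plus a half-fraction:

```python
from itertools import product

# All eight runs of a 2^3 full factorial in coded units:
# every combination of low (-1) and high (+1) for factors A, B, C.
full_factorial = list(product([-1, +1], repeat=3))

# A 2^(3-1) half-fraction: keep only the runs where C = A*B
# (defining relation I = ABC) -- four runs instead of eight,
# at the price of confounding C with the AB interaction.
half_fraction = [(a, b, a * b) for a, b in product([-1, +1], repeat=2)]

print(full_factorial)
print(half_fraction)
```

That confounding is the whole trade-off of fractional designs: fewer runs, but some effects can no longer be separated from each other.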

[There is no finer reference for a process-oriented approach to DOE than Moen, Nolan, and Provost's Quality Improvement through Planned Experimentation. RSM is not covered.]

When possible, get a "road map"


In my industrial career, many of my clients ultimately found much more value in obtaining a process road map – called a contour plot – which is accomplished through RSM. In its basic form it is hardly an advanced technique, but it does go a bit beyond factorial designs. Many times RSM can even build on factorial designs in a nice, efficient sequential strategy as one evolves to a higher state of knowledge, which leads to much more effective optimization and process control.

A typical contour plot is below (the scenario is explained shortly). It shows how the predictive model from the design analysis can be turned into a road map of the process studied. Temperature (x-axis) and an ingredient's concentration (y-axis) were varied over the ranges on their respective axes. For any combination of those two variables, one can read the predicted value of the response being studied (the objective in this case is to minimize it).
However, this map is never fully known and can only be approximated. The question becomes, what is your best shot at doing this in as few experiments as possible?  First, some background…
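
For anyone who wants to see the mechanics, here is a minimal sketch of how such a road map gets drawn once an analysis has produced a predictive model. The quadratic model and its coefficients below are invented purely for illustration -- they are not the tar study's equation:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical second-order model for illustration ONLY --
# invented coefficients, not the actual study's equation.
def predicted_response(T, C):
    return 5 + 0.06 * (T - 62)**2 + 0.5 * (C - 29)**2 + 0.02 * (T - 62) * (C - 29)

# Evaluate the model over a grid spanning the experimental ranges.
T, C = np.meshgrid(np.linspace(50, 70, 100), np.linspace(20, 35, 100))
contours = plt.contour(T, C, predicted_response(T, C), levels=10)
plt.clabel(contours)                  # label each contour with its predicted value
plt.xlabel("Temperature (°C)")
plt.ylabel("Concentration (percent)")
plt.title("Predicted-response road map (illustrative)")
plt.show()
```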
 

People HATE Variation -- but that doesn't make it go away


The contour plot above maps a real production process where the desired product immediately decomposes into a pesky, tarry byproduct that is difficult and expensive to remove.  The process is currently averaging approximately 15 percent tar, and each achievable percent reduction equates to $1 million (1970 dollars) in annual savings.

Process history has determined three variables to be crucial for process control:  temperature, copper sulfate concentration, and excess nitrite. Any combination of these three variables within the ranges of  temperature, 55-65° C;  copper sulfate concentration, 26-31 percent;  and excess nitrite, 0-12 percent would represent a safe and economical operating condition. The current operating condition is the midpoint of these ranges.

For purposes of experimentation only, the equipment is capable of operating in the following ranges if necessary: temperature, 50-70° C; copper sulfate concentration, 20-35 percent; and excess nitrite, 0-20 percent.
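
One practical detail for anyone building such designs: they are almost always constructed in coded units, with the midpoint of each experimental range mapped to 0 and the extremes to -1 and +1. A minimal sketch, using the experimental ranges just given:

```python
# Convert an engineering-unit setting to coded design units:
# midpoint of the range -> 0, low end -> -1, high end -> +1.
def code(value, low, high):
    center = (low + high) / 2.0
    half_range = (high - low) / 2.0
    return (value - center) / half_range

# The current operating condition against the experimental ranges above:
print(code(60.0, 50, 70))    # temperature: 0.0 (dead center)
print(code(28.5, 20, 35))    # copper sulfate: ~0.13
print(code(6.0, 0, 20))      # excess nitrite: -0.4
```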

Suppose you had a budget of 25 (expensive) experiments with which to answer these questions:

  • Where should the three variables be set to minimize tar production?

  • What percent tar would be expected?

  • What's the best estimate of the process variation (i.e., tar ± x percent)?  (A sketch of that arithmetic follows this list.)
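
That third question can only be answered honestly from replicated runs. Here is a minimal sketch of the arithmetic, using invented numbers rather than data from this process:

```python
import statistics

# Hypothetical replicate runs at one fixed set of conditions
# (invented numbers -- not data from the tar process).
replicates = [14.2, 18.9, 11.7, 16.4]

mean = statistics.mean(replicates)
s = statistics.stdev(replicates)     # sample standard deviation

# One common convention quotes "tar +/- x percent" as mean +/- 2 standard deviations.
print(f"tar ~ {mean:.1f} +/- {2 * s:.1f} percent")
```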

This is the scenario I use to introduce my experimental design seminars. I form groups of three to four people and give them a process simulator where they can enter any condition and get the resulting percent tar. 

It almost never fails:  I get as many different answers for optimum settings, resulting tar, and variation as there are groups in the room – and just as many strategies (and number of experiments run) for reaching their conclusions.   Human variation!

I have each group present its results to me, and I act like the many mercilessly tough managers to whom I have made similar presentations.

General observations:

  • Most try holding two of the variables constant while varying the third and then try to further optimize by varying the other two around their best result.

  • Each experiment seems to be run based only on the previous result.

  • Some look at me smugly and run the cube of a 3-variable factorial design (many times getting the worst answers in the room).

  • Some run more than the allotted 25 experiments.

  • Some go outside of the established variable safe ranges.

  • Most find a good result and then try a finer and finer grid to further optimize.

  • There is always one group that claims to have optimized in fewer than 10 experiments, and they (and everyone else) look at me like I'm nuts when I tell them:
    • They should repeat their alleged optimum (which uses up an experiment).
    • Repeating any condition -- not just the optimum -- uses up an experiment.

  • I'm accused of horrible things when the repeated condition gets a different answer (sometimes differing by as much as 11-14 percent tar).  I simply ask, "If you run a process at the same conditions on two different days, do you get the same results?"
 
What usually happens as a result:  

  • I'm often told the "process is out of control," so there's no use experimenting.

  • Most estimates of process variability are naively low.

  • Groups have no idea how to present results in a way that would sell them to a tough manager.

  • The suggested optimal excess nitrite settings are all over the 0-12 percent range -- even though this variable is modeled to have no effect and should simply be set to zero.

My simulator generates the true number from the actual process map above along with random, normally distributed variation that has a standard deviation of four (the actual process had a standard deviation of eight!). Looking at the contour plot, tar is minimized at 65° C and approximately 28.8 percent CuSO4, resulting in six to eight percent tar ± ~8-10 for any production run.
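
For the curious, the skeleton of such a simulator takes only a few lines. The response surface below is a stand-in with its minimum in the stated neighborhood -- it is not the actual process map -- but the noise structure (normal, standard deviation 4) matches the description above:

```python
import random

def true_tar(temp_c, cuso4_pct, nitrite_pct):
    # Stand-in quadratic surface with its minimum near 65° C and
    # 28.8 percent CuSO4 -- NOT the actual process map.  Excess
    # nitrite is given no effect, matching the scenario.
    return 6 + 0.05 * (temp_c - 65)**2 + 0.8 * (cuso4_pct - 28.8)**2 + 0.0 * nitrite_pct

def run_experiment(temp_c, cuso4_pct, nitrite_pct):
    # True value from the map plus normal noise, standard deviation 4.
    return true_tar(temp_c, cuso4_pct, nitrite_pct) + random.gauss(0, 4)

# The same conditions on two different "days" rarely agree:
print(run_experiment(60, 28.5, 6))
print(run_experiment(60, 28.5, 6))
```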

In 1983, I heard the wonderfully practical C. D. Hendrix say, "People tend to invest too many experiments in the wrong place!"

As it turns out, by the end of the class, human variation is minimized when every group independently agrees on the same 15-experiment strategy (a few choose an alternative, equally effective 20-experiment strategy). When they see quite different numerical results from each individual design, they are initially leery, but then pleasantly surprised when they all get pretty close to the real answer.

Reduced human variation = higher quality, more consistent results in only 15-20 experiments!  They are now in "the right place" and have 5-10 more experiments to refine their optimum.
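
For context on what such a strategy can look like: one classic 15-run RSM design for three variables is the face-centered central composite -- eight factorial corners, six face centers, and a center point. Whether or not it's the one my groups settle on, it shows the flavor. A sketch in coded units:

```python
from itertools import product

# A face-centered central composite design for three factors, in coded
# units: 8 factorial corners + 6 face centers + 1 center point = 15 runs.
corners = list(product([-1, +1], repeat=3))
faces = [p for p in product([-1, 0, +1], repeat=3)
         if sum(abs(v) for v in p) == 1]   # exactly one factor off-center
center = [(0, 0, 0)]

design = corners + faces + center
print(len(design))          # 15
for run in design:
    print(run)
```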

More next time...

Kind regards,
Davis
=============================================================================
Data Sanity doesn't cover DOE specifically, but its data philosophy will certainly help you PLAN better!
=============================================================================
Data Sanity: A Quantum Leap to Unprecedented Results is a unique book that synthesizes the sane use of data, culture change, and leadership principles into a different type of road map than the one shown in today's newsletter -- one with the destination of excellence because of reduced human variation.  Click here for ordering information [Note:  an e-edition is available] or here for a copy of its Preface and chapter summaries (fill out the form on the page).

[UK and other international readers who want a hard copy:  ordering through U.S. Amazon is your best bet]

Listen to a 10-minute podcast or watch a 10-minute video interview at the bottom of my home page, where I talk about data sanity: www.davisdatasanity.com.



=================================================================
"No one calls Davis any more.  He's too busy."
(adapting the famous Yogi-ism: "Nobody goes there any more. It's too crowded.")
=================================================================
It amuses me how many people call me and are shocked when I actually answer the phone. Then they say, "I hate bothering you because I know how busy you are." Really? How do they know that? Besides, I like what I do too much -- in 35 years, only on the rarest of occasions have I ever declared myself "too busy." I give callers as much time as they need.

Please know that I always have time for you and am never too busy to answer a question or to discuss opportunities for a leadership or staff retreat, webinar, mentoring, or public speaking -- or just about anything else!

=========================================================
Was this forwarded to you?  Would you like to sign up?
=========================================================
If so, please visit my web site -- www.davisdatasanity.com -- and fill out the box in the left margin on the home page, then click on the link in the confirmation e-mail you will immediately receive.