Tag Archives: William Hunter

I Just Finished Statistics for Experimenters and I Cannot Praise it Enough

Guest post by Michael Betancourt.

I just finished Box, Hunter, and Hunter (Statistics for Experimenters) and I cannot praise it enough. There were multiple passages where I literally giggled. In fact, I may have been so enthusiastic about tagging quotes beyond “all models are wrong but some are useful” that I can’t share them all.

I wish someone had shared this with me when I was first learning statistics instead of the usual statistics textbooks that treat model development as an irrelevant detail. So many of the elements that make up this book are extremely relevant to statistics today. Some examples:

The perspective of learning from data only through the lens of the statistical model.
The emphasis on sequential modeling, using previous fits to direct better models, and sequential experiments, using past fits to direct better targeted experiments.
The fixation on checking model assumptions, especially with interpretable visual diagnostics that capture not only residuals but also meaningful scales of deviation. Proto visual predictive checks as I use them today.
The distinction between empirical models and mechanistic models, and the treatment of empirical linear models as Taylor expansions of mechanistic models with covariates as _deviations_ around some nominal value. Those who have taken my course know how important I think this is.
The emphasis that every model, even a mechanistic model, is an approximation and should be treated as such.
The reframing of frequentist statistical tests as measures of signal to noise ratios.
The importance of process drift and autocorrelation in data when experimental configurations are not or cannot be arbitrarily randomized.
The diversity of examples and exercises using real data from real applications with detailed contexts, including units everywhere.

Really the only reason why I wouldn’t recommend this as an absolute must read is that the focus on linear models and use of frequentist methods does limit the relevance of the text to contemporary Bayesian applications a bit.

Texts like these make me even more frustrated by the desire to frame movements like data science as revolutions that give people the justification to ignore the accumulated knowledge of applied statisticians.

Academic statistics has no doubt largely withdrawn into theory with increasingly smaller overlap with applications, but there is so much relevant wisdom in older applied statistics texts like these that doesn’t need to be rediscovered, just reframed in a contemporary context.

Oh, I forgot perhaps the best part! BHH continuously emphasizes the importance of working with domain experts in the design and through the entire analysis with lots of anecdotal examples demonstrating how powerful that collaboration can be.

I felt so much less alone every time they talked about experimental designs not being implemented properly and the subtle effects that can have in the data, and serious effects in the resulting inferences, if not taken into account.

Michael Betancourt, PhD, Applied Statistician – long story short, I am a once and future physicist currently masquerading as a statistician in order to expose the secrets of inference that statisticians have long kept from scientists. More seriously, my research focuses on the development of robust statistical workflows, computational tools, and pedagogical resources that bridge statistical theory and practice and enable scientists to make the most out of their data.
Website: betanalpha
Patreon: Michael Betancourt

Mabel Mercer sings Experiment by Cole Porter

Mabel Mercer sings Experiment by Cole Porter:

[ Video removed 🙁 ]

Lyrics for Experiment:

Before you leave these portals to meet less fortunate mortals,
There’s just one final message I would give to you.
You all have learned reliance on the sacred teachings of science
So I hope through life you never will decline in spite of philistine defiance
To do what all good scientists do.
Experiment.
Make it your motto day and night.
Experiment and it will lead you to the light.
The apple on the top of the tree is never too high to achieve,
So take an example from Eve, experiment.
Be curious, though interfering friends may frown,
Get furious at each attempt to hold you down.
If this advice you only employ, the future can offer you infinite joy
And merriment.
Experiment and you’ll see.

The lyrics were included in the book by George Box, my father and Stu Hunter: Statistics for Experimenters.

Introduction to Fractional Factorial Designed Experiments

Scientific inquiry is aided by sensible application of statistical tools. I grew up around the best minds in applied statistics. My father was an eminent applied statistician, and George Box (the person in the video) was often around our house (or we were at his). Together they wrote Statistics for Experimenters (along with Stu Hunter, not related to me), the bible for design of experiments (George holds up the 1st edition in the video).

The video may be a bit confusing without at least a basic idea of factorial designed experiments. These introductory videos, by Stu Hunter, on Using Design of Experiments to Improve Results may help get you up to speed.

This video looks at using fractional factorials to reduce the number of experiments needed when doing a multifactor experiment. I grew up understanding that the best way to experiment is by varying multiple factors at the same time. You learn much more quickly than with One Factor At a Time (OFAT), and you learn about interactions (which are mainly lost in OFAT). I am amazed to still hear scientists and engineers talk about OFAT as a sensible method or even as the required method, but I know many do think that way.

To capture the interactions, a full factorial requires a number of experimental runs that grows rapidly with the number of factors. With each factor tested at two levels, assessing 4 factors requires 16 runs, 6 would require 64 and 8 would require 256. This can be expensive and time consuming. Obviously one method is to reduce the number of factors to experiment with. That is done (by having those knowledgeable about the process include only those factors worth the effort), but if you still have, for example, 8 very important factors, using a fractional factorial design can be very helpful.
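The run counts above can be checked with a minimal sketch. It assumes two-level factors coded as -1 and +1, and it builds a half fraction by setting the last factor to the product of the others, which is one common construction and not necessarily the one used in the video. The function names full_factorial and half_fraction are made up for this illustration.

```python
from itertools import product


def full_factorial(k):
    """Return all 2**k runs of a two-level design, coded as -1/+1."""
    return [list(run) for run in product((-1, 1), repeat=k)]


def half_fraction(k):
    """Return a 2**(k-1) run half fraction of a two-level design.

    Assumption for this sketch: the first k-1 factors form a full
    factorial and the last factor is set to the product of their levels.
    """
    runs = []
    for base in product((-1, 1), repeat=k - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(list(base) + [last])
    return runs


# Compare run counts for the factor counts mentioned in the text.
for k in (4, 6, 8):
    print(k, "factors:", len(full_factorial(k)), "full runs,",
          len(half_fraction(k)), "half-fraction runs")
```

A half fraction only halves the work. With 8 factors a smaller fraction is often more practical, at the cost of confounding more of the interactions with each other.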

And as George Box says “What you will often find is that there will be redundant factors… and don’t forget about those redundant factors. Knowing that something doesn’t matter is almost as important as knowing what does.” If you learn a factor isn’t having an effect you may be able to save money. And you can eliminate varying that factor in future experiments.

William G. Hunter Award 2008: Ronald Does

The recipient of the 2008 William G. Hunter Award is Ronald Does. The Statistics Division of the American Society for Quality (ASQ) uses the attributes that characterize Bill Hunter’s (my father – John Hunter) career – consultant, educator for practitioners, communicator, and integrator of statistical thinking into other disciplines – to decide the recipient. In his acceptance speech Ronald Does said:

The first advice I received from my new colleagues was to read the book by Box, Hunter and Hunter. The reason was clear. Because I was not familiar with industrial statistics I had to learn this from the authors who were really practicing statisticians. It took them years to write this landmark book.
…
For the past 15 years I have been the managing director of the Institute for Business and Industrial Statistics. This is a consultancy firm owned by the University of Amsterdam. The interaction between scientific research and the application of quality technology via our consultancy work is the core operating principle of the institute. This is reflected in the type of people that work for the institute, all of whom are young professionals having strong ambitions in both the academic world and in business and industry.
…
The kickoff conference attracted approximately 80 statisticians and statistical practitioners from all over Europe. ENBIS was officially founded in June 2001 as “an autonomous Society having as its objective the development and improvement of statistical methods, and their application, throughout Europe, all this in the widest sense of the words.” Since the first meeting membership has grown to about 1300 from nearly all European countries.

Playing Dice and Children’s Numeracy

My father, William Hunter, a professor of statistics and of Chemical Engineering at the University of Wisconsin, was a guest speaker for my second grade class (I think it was 2nd) to teach us about numbers – using dice. He gave every kid a die. I remember he asked all the kids what number they thought would show up when they rolled the die. 6 was the answer from about 80% of them (which I knew was wrong – so I was feeling very smart).

Then he had the kids roll the die and he stood up at the front to create a frequency distribution of what was actually rolled. He was all ready for them to see how wrong they were and learn it was just as likely for any of the numbers on the die to be rolled. But as he asked each kid about what they rolled, something like 5 out of the first 6 said they rolled a 6. He then modified the exercise a bit and had each kid come up to the front and roll the die on the teacher’s desk. Then my Dad read the number off the die and wrote it on the chart 🙂
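The classroom exercise can be imitated with a small simulation. This is only a sketch, assuming a fair six-sided die and using Python's pseudo-random generator in place of real rolls. The function name roll_frequencies and the seed value are made up for this illustration.

```python
import random
from collections import Counter


def roll_frequencies(n_rolls, seed=None):
    """Simulate n_rolls of a fair die and return the share of each face."""
    rng = random.Random(seed)  # seeded so the run can be repeated
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    # Include every face, even one that never came up.
    return {face: counts.get(face, 0) / n_rolls for face in range(1, 7)}


# Print a simple text frequency chart, one row per face.
freqs = roll_frequencies(6000, seed=1)
for face, share in freqs.items():
    print(face, "#" * round(share * 100), f"{share:.3f}")
```

With only a classroom's worth of rolls the bars will look uneven. It takes many rolls before each face settles near one sixth.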

This nice blog post reminded me of that story: Kids’ misconceptions about numbers — and how they fix them

in the real study, conducted by John Opfer and Robert Siegler, the kids used lines with just 0 and 1000 labeled. They were then given numbers within that range and asked to draw a vertical line through the number line where each number fell (they used a new, blank number line each time). The figure above represents (in red) the average results for a few of the numbers used in the study. As you can see, the second graders are way off, especially for lower numbers. They typically placed the number 150 almost halfway across the number line! Fourth graders perform nearly as well as adults on the task, putting all the numbers in just about the right spot.

But there’s a pattern to the second-graders’ responses. Nearly all the kids (93 were tested) understood that 750 was a larger number than 366; they just squeezed too many large numbers on the far-right side of the number line. In fact, their results show more of a logarithmic pattern than the proper linear pattern.
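To see what a logarithmic pattern does to a 0 to 1000 number line, here is a rough sketch comparing a linear placement with an idealized logarithmic one. The pure log curve, anchored so that 1 maps to the left end and 1000 to the right end, is an assumption made for illustration. It is not the model fitted in the study and is not meant to reproduce the study's figures. The function names are also made up.

```python
import math

LINE_MAX = 1000  # right-hand label of the number line in the quoted study


def linear_position(n):
    """Fraction of the way across the line under correct linear placement."""
    return n / LINE_MAX


def log_position(n):
    """Fraction across the line under an idealized log curve (assumption)."""
    return math.log(n) / math.log(LINE_MAX)


# Numbers mentioned in the quoted post, plus the right end of the line.
for n in (150, 366, 750, 1000):
    print(n, f"linear={linear_position(n):.2f}", f"log={log_position(n):.2f}")
```

Under the log curve the order of the numbers is preserved, which matches the kids knowing that 750 is larger than 366, but small numbers are pushed far to the right and the large ones end up squeezed together.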