Posts about statistics

I Just Finished Statistics for Experimenters and I Cannot Praise it Enough

Guest post by Michael Betancourt.

I just finished Box, Hunter, and Hunter (Statistics for Experimenters) and I cannot praise it enough. There were multiple passages where I literally giggled. In fact I may have been so enthusiastic about tagging quotes beyond “all models are wrong but some are useful” that I can’t share them all.

photo of Statistics for Experimenters with many blue bookmarks shown

I wish someone had shared this with me when I was first learning statistics instead of the usual statistics textbooks that treat model development as an irrelevant detail. So many of the elements that make up this book are extremely relevant to statistics today. Some examples:

  • The perspective of learning from data only through the lens of the statistical model. The emphasis on sequential modeling, using previous fits to direct better models, and sequential experiments, using past fits to direct better targeted experiments.
  • The fixation on checking model assumptions, especially with interpretable visual diagnostics that capture not only residuals but also meaningful scales of deviation. Proto visual predictive checks as I use them today.
  • The distinction between empirical models and mechanistic models, and the treatment of empirical linear models as Taylor expansions of mechanistic models with covariates as _deviations_ around some nominal value. Those who have taken my course know how important I think this is.
  • The emphasis that every model, even a mechanistic model, is an approximation and should be treated as such.
  • The reframing of frequentist statistical tests as measures of signal to noise ratios.
  • The importance of process drift and autocorrelation in data when experimental configurations are not or cannot be arbitrarily randomized.
  • The diversity of examples and exercises using real data from real applications with detailed contexts, including units everywhere.
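
The Taylor-expansion view of empirical linear models mentioned in the list above can be written out for a single covariate. This is only a sketch: the symbols $f$ (the mechanistic model), $x_0$ (the nominal value), and $\beta_0, \beta_1$ are notation chosen here for illustration, not taken from the book.

```latex
% First-order expansion of a mechanistic model f around a nominal value x_0
y = f(x) \approx f(x_0)
    + \left. \frac{\partial f}{\partial x} \right|_{x = x_0} (x - x_0)
  = \beta_0 + \beta_1 \, (x - x_0),
\qquad
\beta_0 = f(x_0), \quad
\beta_1 = \left. \frac{\partial f}{\partial x} \right|_{x = x_0}
```

Read this way, the intercept is the mechanistic prediction at the nominal value, the slope is the local sensitivity, and the covariate enters as the deviation $x - x_0$. The approximation is only trustworthy while that deviation stays small.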

Really the only reason why I wouldn’t recommend this as an absolute must read is that the focus on linear models and use of frequentist methods does limit the relevance of the text to contemporary Bayesian applications a bit.

Texts like these make me even more frustrated by the desire to frame movements like data science as revolutions that give people the justification to ignore the accumulated knowledge of applied statisticians.

Academic statistics has no doubt largely withdrawn into theory with an increasingly small overlap with applications, but there is so much relevant wisdom in older applied statistics texts like these that doesn’t need to be rediscovered, just reframed in a contemporary context.

Oh, I forgot perhaps the best part! BHH continuously emphasizes the importance of working with domain experts in the design and through the entire analysis with lots of anecdotal examples demonstrating how powerful that collaboration can be.

I felt so much less alone every time they talked about experimental designs not being implemented properly and the subtle effects that can have in the data, and serious effects in the resulting inferences, if not taken into account.

Michael Betancourt, PhD, Applied Statistician – long story short, I am a once and future physicist currently masquerading as a statistician in order to expose the secrets of inference that statisticians have long kept from scientists. More seriously, my research focuses on the development of robust statistical workflows, computational tools, and pedagogical resources that bridge statistical theory and practice and enable scientists to make the most out of their data.
Twitter: @betanalpha
Website: betanalpha
Patreon: Michael Betancourt

Related: Statistics for Experimenters, Second Edition; Statistics for Experimenters in Spanish; Statistics for Experimenters Review; Correlation is Not Causation

Medical Study Findings too Often Fail to Provide Us Useful Knowledge

There are big problems with medical research, as we have posted about many times in the past. A very significant part of the problem is that health care research is very hard. There are all sorts of interactions that make conclusive results much more difficult to reach than in other areas.

But failures in our practices also play a big role. Poor statistical literacy alone is part of the problem (especially related to things like interactions, variability, correlation that isn’t evidence of causation…). Large incentives that encourage biased research results are a huge problem.

Lies, Damned Lies, and Medical Science

He discovered that the range of errors being committed was astonishing: from what questions researchers posed, to how they set up the studies, to which patients they recruited for the studies, to which measurements they took, to how they analyzed the data, to how they presented their results, to how particular studies came to be published in medical journals. The systemic failure to do adequate long term studies once we approve drugs, practices and devices is also a big problem.

This array suggested a bigger, underlying dysfunction, and Ioannidis thought he knew what it was. “The studies were biased,” he says. “Sometimes they were overtly biased. Sometimes it was difficult to see the bias, but it was there.” Researchers headed into their studies wanting certain results—and, lo and behold, they were getting them. We think of the scientific process as being objective, rigorous, and even ruthless in separating out what is true from what we merely wish to be true, but in fact it’s easy to manipulate results, even unintentionally or unconsciously. “At every step in the process, there is room to distort results, a way to make a stronger claim or to select what is going to be concluded,” says Ioannidis. “There is an intellectual conflict of interest that pressures researchers to find whatever it is that is most likely to get them funded.”

Another problem is that medical research often doesn’t get the normal scientific inquiry check of confirmation research by other scientists.

Most journal editors don’t even claim to protect against the problems that plague these studies. University and government research overseers rarely step in to directly enforce research quality, and when they do, the science community goes ballistic over the outside interference. The ultimate protection against research error and bias is supposed to come from the way scientists constantly retest each other’s results—except they don’t. Only the most prominent findings are likely to be put to the test, because there’s likely to be publication payoff in firming up the proof, or contradicting it.

Related: Statistical Errors in Medical Studies; Medical Study Integrity (or Lack Thereof); Contradictory Medical Studies (2007); Does Diet Soda Result in Weight Gain?

Mabel Mercer sings Experiment by Cole Porter

Mabel Mercer sings Experiment by Cole Porter:

[ Video removed 🙁 ]

Lyrics for Experiment:

Before you leave these portals to meet less fortunate mortals,
There’s just one final message I would give to you.
You all have learned reliance on the sacred teachings of science
So I hope through life you never will decline in spite of philistine defiance
To do what all good scientists do.
Experiment.
Make it your motto day and night.
Experiment and it will lead you to the light.
The apple on the top of the tree is never too high to achieve,
So take an example from Eve, experiment.
Be curious, though interfering friends may frown,
Get furious at each attempt to hold you down.
If this advice you only employ, the future can offer you infinite joy
And merriment.
Experiment and you’ll see.

The lyrics were included in the book by George Box, my father and Stu Hunter: Statistics for Experimenters.

Related: Scientists Singing About Science; Here Comes Science by They Might Be Giants; They Will Know We are Christians By Our Love; Cambrian Explosion Song

Introduction to Fractional Factorial Designed Experiments

Scientific inquiry is aided by sensible application of statistical tools. I grew up around the best minds in applied statistics. My father was an eminent applied statistician, and George Box (the person in the video) was often around our house (or we were at his). Together they wrote Statistics for Experimenters (along with Stu Hunter, not related to me), the bible for design of experiments (George holds up the 1st edition in the video).

The video may be a bit confusing without at least a basic idea of factorial designed experiments. These introductory videos, by Stu Hunter, on Using Design of Experiments to Improve Results may help get you up to speed.

[the video has been removed from the internet]

This video looks at using fractional factorials to reduce the number of experiments needed when doing a multifactor experiment. I grew up understanding that the best way to experiment is by varying multiple factors at the same time. You learn much more quickly than with One Factor At a Time (OFAT), and you learn about interactions (which are mainly lost in OFAT). I am amazed to still hear scientists and engineers talk about OFAT as a sensible method or even as the required method, but I know many do think that way.

To capture the interactions, a full factorial requires an ever larger number of experimental runs to be complete. With each factor tested at two levels, assessing 4 factors requires 16 runs, 6 would require 64 and 8 would require 256. This can be expensive and time consuming. One obvious method is to reduce the number of factors to experiment with. That is done (by having those knowledgeable about the process include only those factors worth the effort), but if you still have, for example, 8 very important factors, using a fractional factorial design can be very helpful.
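
The run counts above, and the saving from a fractional design, can be sketched in a few lines of Python. This is a minimal illustration and not the design discussed in the video. It assumes two-level factors coded as -1 and +1, and it builds the simplest fraction (a half fraction) by setting the last factor equal to the product of the others. The function names are hypothetical.

```python
from itertools import product


def full_factorial(k):
    """Return all 2**k runs of a two-level design, with levels coded -1/+1."""
    return list(product((-1, 1), repeat=k))


def half_fraction(k):
    """Return a 2**(k-1) run half fraction of a two-level design.

    Assumption for illustration: the first k-1 factors form a full
    factorial and the last factor is set to the product of their levels.
    """
    runs = []
    for base in full_factorial(k - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs


for k in (4, 6, 8):
    print(k, "factors:", len(full_factorial(k)), "full runs,",
          len(half_fraction(k)), "half-fraction runs")
```

The saving is not free. In this half fraction the main effect of the last factor cannot be separated from the highest-order interaction of the other factors, so the design relies on that interaction being negligible. Smaller fractions cut the run count further at the cost of more such confounding.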

And as George Box says, “What you will often find is that there will be redundant factors… and don’t forget about those redundant factors. Knowing that something doesn’t matter is almost as important as knowing what does.” If you learn a factor isn’t having an effect you may be able to save money. And you can eliminate varying that factor in future experiments.


George Box 1919 to 2013 – A Great Friend, Scientist and Statistician

Reposted from my management blog.

I would most likely not exist if it were not for George Box. My father took a course from George while my father was a student at Princeton. George agreed to start the Statistics Department at the University of Wisconsin – Madison, and my father followed him to Madison, to be the first PhD student. Dad graduated, and the next year was a professor there, where he and George remained for the rest of their careers.

George died today; he was born in 1919. He recently completed An Accidental Statistician: The Life and Memories of George E. P. Box, which is an excellent book that captures his great ability to tell stories. It is a wonderful read for anyone interested in statistics and management improvement or just great stories of an interesting life.

photo of George EP Box

George Box by Brent Nicastro.

George Box was a fantastic statistician. I am not the person to judge, but from what I have read he was one of the handful of most important applied statisticians of the last 100 years. His contributions are enormous. Several well-known statistical methods are known by his name.

George was elected a member of the American Academy of Arts and Sciences in 1974 and a Fellow of the Royal Society in 1979. He also served as president of the American Statistical Association in 1978. George is also an honorary member of ASQ.

George was a very kind, caring and fun person. He was a gifted storyteller and writer. He had the ability to present ideas so they were easy to comprehend and appreciate. While his writing was great, seeing him in person added so much more. Growing up I was able to enjoy his stories often, at our house or his. The last time I was in Madison, my brother and I visited with him and again listened to his marvelous stories about Karl Pearson, Ronald Fisher and so much more. He was one of those special people who made you very happy whenever you were near him.

George Box, Stuart Hunter and Bill Hunter (my father) wrote what has become a classic text for experimenters in scientific and business circles, Statistics for Experimenters. I am biased but I think this is acknowledged as one of (if not the) most important books on design of experiments.

George also wrote other classic books: Time series analysis: Forecasting and control (1979, with Gwilym Jenkins) and Bayesian inference in statistical analysis (1973, with George C. Tiao).

George Box and Bill Hunter co-founded the Center for Quality and Productivity Improvement at the University of Wisconsin-Madison in 1984. The Center develops, advances and communicates quality improvement methods and ideas.

The Box Medal for Outstanding Contributions to Industrial Statistics, named in his honor, recognizes the development and application of statistical methods in European business and industry.

“All models are wrong but some are useful” is likely his most famous quote. More quotes by George Box

A few selected articles and reports by George Box

Related: It is not about proving a theorem it is about being curious about things; Box on Quality; Soren Bisgaard; Learning Design of Experiments with Paper Helicopters; Peter Scholtes

Cancer Risks From Our Food

comic showing the dangers of drawing false conclusion based on statistical significance

Randall Munroe illustrates RA Fisher’s point that you must think to draw reasonable conclusions from data. Click the image to see the full xkcd comic.
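
The danger of leaning on statistical significance alone can be made concrete with a little arithmetic. This is a minimal sketch. It assumes independent tests of effects that do not exist, each run at a significance level of 0.05. The number of tests and the level are illustrative assumptions, not figures taken from the comic or from the article below.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n_tests independent tests of a
    true null hypothesis comes out "significant" at level alpha."""
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 5, 20, 100):
    print(n_tests, "tests:", round(chance_of_false_positive(n_tests), 3))
```

Under these assumptions, testing 20 unrelated foods for a link to some disease is more likely than not to turn up at least one "significant" result by chance alone.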

Pretty much everything you eat is associated with cancer. Don’t worry about it. by Sarah Kliff

The changes in cancer risk were all over the map: 39 percent found an increased risk, 33 percent found a decreased risk and 23 percent showed no clear evidence either way.

The vast majority of those studies, Schoenfeld and Ioannidis found, showed really weak associations between the ingredient at hand and cancer risk. A full 80 percent of the studies had shown statistical relationships that were “weak or nominally significant,” as measured by the study’s P-values. Seventy-five percent of the studies purporting to show a higher cancer risk fell into this category, as did 76 percent of those showing a lower cancer risk.

Sadly the evidence is often not very compelling but creates uncertainty in the public. Poorly communicated results and scientific illiteracy (both from publishers and the public) lead to more confusion than is necessary. Even with well done studies, good communication and a scientifically literate population, conclusions about nutrition and human health are more often questionable than they are clear.

Related: Researchers Find Switch That Allows Cancer Cells to Spread; Global Cancer Deaths to Double by 2030; Physical Inactivity Leads to 5.3 Million Early Deaths a Year

Medical Studies Showing Largest Benefits Often Prove to be False

There is another study showing that the results of health studies are often proven false: Medical studies with striking results often prove false

If a medical study seems too good to be true, it probably is, according to a new analysis.

In a statistical analysis of nearly 230,000 trials compiled from a variety of disciplines, study results that claimed a “very large effect” rarely held up when other research teams tried to replicate them.

The report should remind patients, physicians and policymakers not to give too much credence to small, early studies that show huge treatment effects, Ioannidis said.
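
Why small studies with striking results tend not to hold up can be illustrated with a simulation. This is a minimal sketch. It assumes many small studies of the same treatment, with estimates that scatter normally around one modest true effect. The effect size, standard error, cutoff and function name are illustrative assumptions, not figures from the analysis described above.

```python
import random
import statistics


def simulate_striking_results(true_effect=0.2, std_error=0.3,
                              n_studies=10_000, seed=1):
    """Simulate noisy study estimates and compare all of them with the
    subset that would look "striking" (estimate / std_error > 1.96)."""
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_studies)]
    striking = [e for e in estimates if e / std_error > 1.96]
    return statistics.mean(estimates), statistics.mean(striking), len(striking)


mean_all, mean_striking, n_striking = simulate_striking_results()
print("true effect:                 0.2")
print("mean of all estimates:      ", round(mean_all, 2))
print("mean of striking estimates: ", round(mean_striking, 2))
print("number of striking studies: ", n_striking)
```

Every study that clears the cutoff must report an estimate well above the assumed true effect, so the striking subset overstates the benefit by construction. A replication, which is not selected this way, would be expected to land closer to the true effect.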

The Stanford professor chose to publish this paper in a closed science publication. But previously he published openly on: Why Most Published Research Findings Are False.

Related: Majority of Clinical Trials Don’t Provide Meaningful Evidence; Statistical Errors in Medical Studies; Mistakes in Experimental Design and Interpretation; How to Deal with False Research Findings

Today, Most Deaths Caused by Lifetime of Action or Inaction

Chart of the Leading Causes of Death in 1900 and 2010

Our instincts lead us to fear the unknown and immediate threats (probably so we can be ready to run – or maybe fight). But today the biggest risks of an untimely death are not lions, other people out to get us, or even just random infection. We have to adapt to the new risks by taking action to eat healthfully and exercise, in the same way we have evolved to avoid becoming a meal for a hungry beast.

Today the largest causes of death are heart disease and cancer (which account for more than 60% of the deaths caused by the top 10 leading causes of death). The next leading causes are non-infectious airways diseases, cerebrovascular diseases and accidents. Alzheimer’s, diabetes, nephropathies, pneumonia or influenza and suicide make up the rest of the top 10 leading causes.

In 1900 pneumonia or influenza and tuberculosis took as many lives (per 100,000 people) as cancer and heart disease take today. We have done well decreasing the incidence of death (fewer deaths per 100,000) by greatly reducing and nearly eliminating some causes of death (the 2 leading causes from 1900 are good examples).


Majority of Clinical Trials Don’t Provide Meaningful Evidence

The largest comprehensive analysis of ClinicalTrials.gov finds that clinical trials are falling short of producing high-quality evidence needed to guide medical decision-making.

The analysis, published today in the Journal of the American Medical Association, found the majority of clinical trials is small, and there are significant differences among methodical approaches, including randomizing, blinding and the use of data monitoring committees.

This is a critical issue, as medical studies continue to leave quite a bit to be desired. Even more importantly, we fail to systematically study and share evidence of effectiveness once treatments are authorized. And the consequences are serious. If we make mistakes in, for example, how we date fossils, it matters, but it is unlikely to cost people their lives or health. Failure to adequately manage and analyze health care experiments may very well cost people their health or lives.

“Our analysis raises questions about the best methods for generating evidence, as well as the capacity of the clinical trials enterprise to supply sufficient amounts of high quality evidence to ensure confidence in guideline recommendations,” said Robert Califf, MD, first author of the paper, vice chancellor for clinical research at Duke University Medical Center, and director of the Duke Translational Medicine Institute.

The analysis was conducted by the Clinical Trials Transformation Initiative (CTTI), a public-private partnership founded by the Food and Drug Administration (FDA) and Duke. It extends the usability of the data in ClinicalTrials.gov for research by placing the data through September 27, 2010 into a database structured to facilitate aggregate analysis.

Related: Statistical Errors in Medical Studies; How to Deal with False Research Findings; Medical Study Integrity (or Lack Thereof)


Numeracy: The Educational Gift That Keeps on Giving

I like numbers. I always have. This is just luck, I think. I see how helpful it is to have a good understanding of numbers. Failing to develop a facility with numbers results in many bad decisions, it seems to me.

A new article, sadly published in a closed, anti-science way (so no link), examines how people who are numerate (like literate, but for numbers) process information differently so that they ultimately make more informed decisions. Cancer risks. Investment alternatives. Calories. Numbers are everywhere in daily life, and they figure into all sorts of decisions.

People who are numerate are more comfortable thinking about numbers and are less influenced by other information, says Ellen Peters of Ohio State University (sadly Ohio State allows research by staff paid by them to be unavailable to the public – sad), the author of the new paper. For example, in one of Peters’s studies, students were asked to rate undergraduates who received what looked like different test scores. Numerate people were more likely to see a person who got 74% correct and a person who got 26% incorrect as equivalent, while people who were less numerate thought people were doing better if their score was given in terms of a percent correct.

People make decisions based on this sort of information all the time. For example, “A lot of people take medications,” Peters says. Every drug has benefits and potential risks, and those can be presented in different ways. “You can talk about the 10 percent of the population that gets the side effect or the 90 percent that does not.” How you talk about it will influence how dangerous the drug seems to be, particularly among people who are less numerate.

Other research has shown that only less numerate people respond differently to something that has a 1 in 100 chance of happening than something that has a 1 percent chance of happening. The less numerate see more risk in the 1 in 100 chance—even though these numbers are exactly the same.

“In general, people who are numerate are better able to bring consistent meaning to numbers and to make better decisions,” Peters says. “It suggests that courses in math and statistics may be the educational gift that keeps on giving.”

Related: full press release; Bigger Impact: 15 to 18 mpg or 50 to 100 mpg?; Data Doesn’t Lie, But People Can be Fooled; Understanding Data: Simpson’s Paradox; applied statistics is not about proving a theorem, it’s about being curious about things; Encouraging Curiosity in Kids; Dangers of Forgetting the Proxy Nature of Data; Compounding is the Most Powerful Force in the Universe

Google Prediction API

This looks very cool.

The Prediction API enables access to Google’s machine learning algorithms to analyze your historic data and predict likely future outcomes. Upload your data to Google Storage for Developers, then use the Prediction API to make real-time decisions in your applications. The Prediction API implements supervised learning algorithms as a RESTful web service to let you leverage patterns in your data, providing more relevant information to your users. Run your predictions on Google’s infrastructure and scale effortlessly as your data grows in size and complexity.

Accessible from many platforms: Google App Engine, Apps Script (Google Spreadsheets), web & desktop apps, and command line.

The Prediction API supports CSV formatted training data, up to 100M in size. Numeric or unstructured text can be sent as input features, and discrete categories (up to a few hundred different ones) can be provided as output labels.

Uses:
Language identification
Customer sentiment analysis
Product recommendations & upsell opportunities
Diagnostics
Document and email classification

Related: The Second 5,000 Days of the Web; Robot Independently Applies the Scientific Method; Controlled Experiments for Software Solutions; Statistical Learning as the Ultimate Agile Development Tool by Peter Norvig