Controlled Experiments for Software Solutions

Posted on August 17, 2009  Comments (3)

by Justin Hunter

Jeff Fry linked to a great webcast in his post Controlled Experiments To Test For Bugs In Our Mental Models.

I firmly believe that applied statistics-based experiments are under-appreciated by businesses (and, for that matter, business schools). Few people who understand them are as articulate and concise as Kohavi. Admittedly, I could be accused of being biased as: (a) I am the son of a prominent applied statistician and (b) I am the founder of a software testing tools company that uses applied statistics-based methods and algorithms to make our tool work.

Summary of the webcast, on Practical Guide to Controlled Experiments on the Web: Listen to Your Customers not to the HiPPO – a presentation by Ron Kohavi with Microsoft Research.

1:00 Amazon: in 2000, Greg Linden wanted to add recommendations to shopping carts during the checkout process. The “HiPPO” (meaning the Highest Paid Person’s Opinion) was against it on the grounds that recommendations would confuse and/or distract people. Amazon, a company with a good culture of experimentation, decided to run a small experiment anyway, “just to get the data” – it was wildly successful and is in widespread use today at Amazon and other firms.

3:00 Dr. Footcare example: Including a coupon code above the total price to be paid had a dramatic impact on abandonment rates.

4:00 “Was this answer useful?” Dramatic differences occur when Y/N is replaced with 5 stars, and when an empty text box is shown up front rather than only after a user clicks to give an initial response.

6:00 Sewing machines: experimenting with a sales promotion strategy led to an extremely counter-intuitive pricing choice.

7:00 “We are really, really bad at understanding what is going to work with customers…”

7:30 “DATA TRUMPS INTUITION” {especially on novel ideas}. Get valuable data through quick, cheap experimentation. “The less the data, the stronger the opinions.”

8:00 Overall Evaluation Criteria: “OEC” What will you measure? What are you trying to optimize? (Optimizing for the “customer lifetime value”)

9:00 Analyzing data / looking under the hood is often useful to get meaningful answers as to what really happened and why

10:30 A/B tests are good; more sophisticated multi-variate testing methods are often better
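The basic A/B comparison Kohavi describes can be sketched as a two-proportion z-test. The function name and the example numbers below are illustrative, not from the talk – a minimal sketch of how one might check whether an observed difference between two variants is larger than chance would explain:

```python
import math

def ab_test_z(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-proportion z-test for a simple A/B experiment (illustrative)."""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    # Pooled conversion rate under the null hypothesis of no difference
    p = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = math.sqrt(p * (1 - p) * (1 / visitors_a + 1 / visitors_b))
    return (p_b - p_a) / se

# 50/50 split: version A converts 200/10,000 visitors, version B 260/10,000
z = ab_test_z(200, 10_000, 260, 10_000)
print(round(z, 2))  # |z| > 1.96 is roughly significant at the 5% level
```

Multivariate methods generalize this idea to several page elements at once, which is why Kohavi considers them often better than one-factor-at-a-time A/B tests.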

12:00 Some problems: OEC is hard culturally. People won’t agree. If there are 10 changes per page, you will need to break things down into smaller experiments.

14:00 Many people fear multiple experiments [e.g., multi-variate experiments] more than they should.

16:00 People do a very bad job at understanding natural variation and are often too quick to jump to conclusions.
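A quick simulation makes this point concrete: even when two variants have an identical true conversion rate, repeated samples produce noticeably different observed rates. The 2% rate and sample sizes below are made up for illustration:

```python
import random

random.seed(42)

# Simulate 10 "experiments" where the TRUE conversion rate is always 2%.
# Each experiment observes 1,000 visitors; observed rates still vary.
true_rate = 0.02
observed = []
for _ in range(10):
    conversions = sum(random.random() < true_rate for _ in range(1000))
    observed.append(conversions / 1000)

print(min(observed), max(observed))
```

The spread between the smallest and largest observed rates is pure sampling noise – exactly the kind of natural variation people are too quick to interpret as a real effect.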

17:00 eBay does A/B testing and makes the control group ~1%. Ron Kohavi, the presenter, suggests starting small then quickly ramping up to 50/50 (e.g., 50% of viewers will see version A, 50% will see version B).

19:00 Beware of launching features that merely “do not hurt”; there are feature maintenance costs.

20:00 Drive to a data-driven culture. “It makes a huge difference. People who have worked in a data-driven culture really, really love it… At Amazon… we built an optimization system that replaced all the debates that used to happen on Fridays about what gets on the home page with something that is automated.”

21:00 Microsoft will be releasing its controlled experiments on the web platform at some point in the future, probably not in the next year

21:00 Summary
Listen to your customers, because our intuition at assessing new ideas is poor.
Don’t let the HiPPO drive decisions; they are likely to be wrong. Let the customer data do it.
Experiment often; create a trustworthy system to accelerate innovation.

Justin Hunter
Founder and CEO
Hexawise
More coverage. Fewer tests.

Related: Statistics for Experimenters, articles on design of experiments, Designed Experiments, Productivity Gains in Software Engineering

3 Responses to “Controlled Experiments for Software Solutions”

  1. Curious Cat Science and Engineering Blog » Statistics Insights for Scientists and Engineers
    December 5th, 2009 @ 2:44 pm

    To me the key trait for applied statistics is to help experimenters learn quickly: it is an aid in the discovery process…

  2. Kieron
    May 12th, 2011 @ 5:01 am

    Found this a wee late but you could not be more right especially: “We are really, really bad at understanding what is going to work with customers…”

