Rescuing a medical treatment from failure in a clinical trial by using Post Hoc Bayesian Analysis

How can researchers maximize learning from experiments, especially from very expensive experiments such as clinical trials? This article shows how a Bayesian analysis of the data would have been much more informative, and likely would have saved a useful new technique for dealing with ARDS.

I am a big supporter of Bayesian methods, which will become even more important/useful with machine learning. But a colleague, Dr. Nick Eubank, pointed out that the data could also have been re-analyzed using frequentist statistics. The problem with the original analysis was not primarily that they used frequentist statistics. Rather, it was that they set a fixed (and rather large) threshold for defining success. This threshold was probably unattainable. But the clinical trial could still have been “saved,” even by conventional statistics.

Source: Extracorporeal Membrane Oxygenation for Severe Acute Respiratory Distress Syndrome and Posterior Probability of Mortality Benefit in a Post Hoc Bayesian Analysis of a Randomized Clinical Trial. JAMA.

Here is a draft of a letter to the editor on this subject. Apologies for the very academic tone – that’s what we do for academic journals!

The study analyzed in their article was shut down prematurely because it was judged unlikely to attain the target level of performance. Their paper shows that this might have been avoided, and the technique shown to have benefit, if their analysis had been performed before terminating the trial. A related analysis could usefully have been done within the frequentist statistical framework. According to their Table 2, a frequentist analysis (equivalent to an uninformative prior) would have suggested a 96% chance that the treatment was beneficial, and an 85% chance that it had RR < 0.9.

The reason the original study appeared to be failing was not solely that it was analyzed with frequentist methods. It also failed because the threshold for “success” was set at a demanding level, namely RR < 0.67. Thus, although the full Bayesian analysis of the article was more informative, even frequentist statistics can be useful to investigate the implications of different definitions of success.
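To illustrate how much the verdict depends on the chosen threshold, here is a minimal Python sketch. It assumes a normal approximation to the log relative risk, which under a flat prior also serves as the posterior. The function name `prob_rr_below` and the event counts are hypothetical; they are not taken from the trial or from Table 2.

```python
import math
from statistics import NormalDist


def prob_rr_below(threshold, events_treat, n_treat, events_ctrl, n_ctrl):
    """Approximate probability that the true relative risk is below `threshold`.

    Assumes a normal approximation on the log-RR scale with a flat prior,
    so the posterior is centered on the observed log RR.
    """
    log_rr = math.log((events_treat / n_treat) / (events_ctrl / n_ctrl))
    # Standard large-sample standard error of the log relative risk.
    se = math.sqrt(1 / events_treat - 1 / n_treat + 1 / events_ctrl - 1 / n_ctrl)
    return NormalDist(mu=log_rr, sigma=se).cdf(math.log(threshold))


# Hypothetical counts for illustration only (deaths / patients in each arm).
counts = dict(events_treat=40, n_treat=120, events_ctrl=55, n_ctrl=120)

for threshold in (1.0, 0.9, 0.67):
    p = prob_rr_below(threshold, **counts)
    print(f"P(RR < {threshold}) is about {p:.2f}")
```

With counts like these, the probability of some benefit is high while the probability of clearing RR < 0.67 is well under one half, which is the pattern the letter describes.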

Credit for this observation goes to Nick. I will ask him for permission to include one of his emails to me on this subject.

Some of my harder to find papers

Accelerated Learning by Experimentation

Abstract

In most technologies and most industries, experiments play a central role in organizational learning as a source of knowledge and as a check before changes are implemented. There are four primary types of experiments: controlled, natural, ad-hoc, and evolutionary operation. This paper discusses factors that affect learning by experimentation and how they influence learning rates. In some cases, new ways of experimenting can create an order of magnitude improvement in the rate of learning. On the other hand, some situations are inherently hard to run experiments on, and therefore learning remains slow until basic obstacles are solved. Examples of experimentation are discussed in four domains: product development, manufacturing, consumer marketing, and medical trials.

Keywords: Learning, Experimentation

Full published version: Bohn, Accelerated Learning by Experimentation, in Learning Curves: Theory, Models, and Applications, edited by Mohamad Y. Jaber, CRC Press, 2011.
Preprint version, through SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1640767.

This is a rewritten version of my 1987 paper about manufacturing. Although I never published that paper, it was very influential in the 1990s, including becoming the (plagiarized) framework of a book and several articles by Stefan Thomke.  This book chapter expands and extends the original concepts, including new material from Michael Lapré.

One Day, a Machine Will Smell Whether You’re Sick – The New York Times

Sniffing disease markers is a fundamentally promising concept. We know that dogs have very good smell, so that is an existence proof that something interesting can be detected in the air. (In my family’s experience, human smell can also become amazingly good, at least for pregnant women!) In fact, if B.F. Skinner were still alive, I wonder if he would be training pigeons to sniff out disease?

But although air is feasible, it does seem like blood is a better choice because it is likely to have stronger signals and lower noise. Air-based sensors would be non-invasive, so perhaps that is why some groups are pursuing air.

…a team of researchers from the ..Monell Chemical Senses Center and the University of Pennsylvania [are working] on a prototype odor sensor that detects ovarian cancer in samples of blood plasma.

The team chose plasma because it is somewhat less likely than breath or urine to be corrupted by confounding factors like diet or environmental chemicals, including cleaning products or pollution. Instead of ligands, their sensors rely on snippets of single-strand DNA to do the work of latching onto odor particles.

“We are trying to make the device work the way we understand mammalian olfaction works,” … “DNA gives unique characteristics for this process.”

Judging by research at UCSD and elsewhere, I envision tests like this eventually being run as add-on modules to smartphones. Buy a module for $100 (single molecule, home use) up to $5000 (multiple molecules, ambulance use), and plug it into your phone. Above $5000, you will probably use a dedicated electronics package. But that package might be based on Android OS.

This is also another example of Big Data science. It could be done before, but it will be a lot easier now. Blood collected for other purposes from “known sick” patients could be used to create a 50,000 person training set. (The biggest problem might be getting informed consent.)

 

Art to Science: in making whiskey barrels

Whiskey is aged in oak barrels, and oak wood is highly variable. But barrel-making can still become much more scientific.

“Twenty-five years ago, it was more art than science. Now we have a healthy dose of science in with the art.” Larry Combs, the general manager for Jack Daniel’s

Recently, the two companies completed the decade-long Single Oak Project, in which they made 192 barrels, each using the wood from a single log, to find what constituted the “perfect” bourbon. (Among other things, they found that wood from the bottom of a tree made for the best aging.) Computers track each stave as it moves through assembly, while sensors analyze staves for density and moisture content. Instead of guessing how much to toast a barrel, operators use lasers and infrared cameras to monitor the temperature of the wood and the precise chemical signature that the heat coaxes to the surface — all subject to the customer’s desired flavor profile. “They’ve developed technologies so that if we say we want coconut flavors, they can apply this or that process” — like applying precise amounts of heat to different parts of the wood to tease out certain flavors — “and we’ll have it,” said Charles de Pottere, the director of production and planning at Jackson Family Wines…

… Black Swan makes barrels with a honeycomb design etched on the inside, which increases surface area and reduces a whiskey’s aging time.

Their approach: learn by experimentation, and use the new knowledge for tight process control. Same approach as machining, aviation, …. And this is a 400+ year old industry. Now I just need a word that’s better than “science” to describe this approach. (See my previous post.)

Last comment: according to the article, one of the main forces driving willingness to learn was competition from superior French barrels.

Source: Packing Technology Into the Timeless Barrel – The New York Times

Drones Hunt Down Poachers in South Africa | Flying Magazine

The Lindbergh Foundation’s Air Shepherd initiative uses drones to catch poachers in South Africa.

My comment: Flying at night, up to 40km away, is technically difficult. But smart autopilots, using GPS and accelerometers, mean that the operators (pilots) don’t have to do hands-on flying except landing and takeoff.  Probably every component in the system except the ground vehicles is hobbyist level, although some of the specialized long-range radio gear might need to be hand built.  Nothing from aerospace companies.  Battery powered, so essentially noiseless. Also, the aircraft itself is the cheapest part of the system.

The article mentions flights of “up to 4 hours.” That is a very long duration, and would require lots of batteries. 2 hours or even less sounds more realistic. Efficient cruising speed is probably around 40 kph (25 mph). If anyone finds other discussions of this project, please let me know.
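A quick arithmetic check on those numbers, as a minimal sketch: it assumes a straight out-and-back flight at constant speed with no wind, and the function name `loiter_hours` is made up for this example.

```python
def loiter_hours(endurance_h, radius_km, cruise_kph):
    """Hours left on station after flying out to `radius_km` and back."""
    transit_h = 2 * radius_km / cruise_kph  # out and back
    return max(0.0, endurance_h - transit_h)


# Figures from the text: 40 km radius, roughly 40 kph cruise.
for endurance in (2, 4):
    print(endurance, "h endurance:", loiter_hours(endurance, 40, 40), "h on station")
```

Under these assumptions, a 2-hour endurance at the full 40 km radius is used up entirely by transit, so shorter patrol radii would be needed to leave time over the target.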

Source: Drones Hunt Down Poachers in South Africa | Flying Magazine

Self-driving cars may take decades to prove safety: Not so.

Proving self-driving cars are safe could take up to hundreds of years under the current testing regime, a new Rand Corporation study claims.

Source: Self-driving cars may not be proven safe for decades: report

The statistical analysis in this paper looks fine, but the problem is even worse for aircraft, since they are far safer per mile than autos. Yet new aircraft are sold after approximately 3 years of testing, and less than 1 million miles flown. How?

From the report:

we will show that fully autonomous vehicles would have to be driven hundreds of millions of miles and sometimes hundreds of billions of miles to demonstrate their reliability in terms of fatalities and injuries. Under even aggressive testing assumptions, existing fleets would take tens and sometimes hundreds of years to drive these miles.
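The order of magnitude in that quote can be reproduced with a standard zero-failure calculation. This is a sketch, not the report's own code: it assumes fatalities arrive as a Poisson process, and the benchmark rate of about 1.09 fatalities per 100 million miles for human drivers is an assumption supplied here, not a figure from the text above.

```python
import math


def failure_free_miles(max_rate_per_mile, confidence=0.95):
    """Miles that must be driven with zero failures to claim, at the given
    confidence, that the true failure rate is below `max_rate_per_mile`."""
    # With zero events in n miles, P(no events) = exp(-rate * n) <= 1 - confidence.
    return -math.log(1 - confidence) / max_rate_per_mile


# Assumed benchmark: roughly 1.09 fatalities per 100 million vehicle miles.
human_fatality_rate = 1.09 / 100_000_000

miles = failure_free_miles(human_fatality_rate)
print(f"{miles / 1e6:.0f} million failure-free miles")
```

Demonstrating a modest improvement over human drivers, rather than a rate merely below the benchmark, requires many more miles than this.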

How does the airline industry get around the analogous statistics? By understanding how aircraft fail, and designing/testing for those specific issues, with carefully calculated specification limits. They don’t just fly around, waiting for the autopilot to fail!
