Parachutes are not an argument against RCTs for medical treatments!

Another doctor has recently used parachutes as an example of why some medical treatments don’t need to be tested before using them on patients. That historical claim is wrong.

Arguing for “The search for perfect evidence may be the enemy of good policy,” Greenhalgh, a physician and expert in health care delivery at the University of Oxford, fumed in the Boston Review. “As with parachutes for jumping out of airplanes, it is time to act without waiting for randomized controlled trial evidence.” [emphasis added]….

COVID-19, she argues, has revealed the limits of evidence-based medicine—masks being a potent case in point.

The United Kingdom’s mask crusader
Ellen Ruppel Shell
Science 16 Oct 2020: Vol. 370, Issue 6514, pp. 276-277
DOI: 10.1126/science.370.6514.276

A 2003 article in the British Medical Journal claimed, after a literature search, that “No randomised controlled trials of parachute use have been undertaken,[sic]” and went on to claim that “Individuals who insist that all interventions need to be validated by a randomised controlled trial need to come down to earth with a bump.” This is nonsense. Parachutes were heavily tested by the British air force late in WW I, for example. The issue was controversial at the time because German pilots already had parachutes, and the British military was slow to adopt them, perhaps because of NIH (Not Invented Here). Continued trials delayed deployment until after the war was over.

Jet ejection seats, a “super-parachute” invented in the 1940s, received comprehensive engineering tests as various designs were experimented with. Tests ultimately included multiple human trials with volunteers. Despite that, many pilots at the time were hesitant to trust them, but field experience (and the lack of alternatives when you were about to crash) led to still-reluctant acceptance. The reluctance stemmed from the dangers of ejection – severe injuries were common, due to high accelerations, collisions with pieces of the aircraft, and so forth. Continued experimentation at many levels (simulations, scale models, dummy pilots, etc.) has led to many improvements over the early designs, and most pilots who eject now are not permanently injured.

Test of a 0/0 ejection by Major Jim Hall, 1965

So parachutes have been, and new designs continue to be, heavily tested. Perhaps the 2003 authors missed these tests because they did not search obscure engineering and in-house journals written decades before the Internet. What about the “controlled” part of Randomized Controlled Trials? RCTs had not even been invented in 1918; R.A. Fisher’s seminal work on experimental statistics was done in the 1920s and 30s.

More important, engineering trials have something better than randomization: deliberate “corner tests.” With humans and diseases we don’t know all the variables that affect treatment effectiveness, and even if we knew them, we couldn’t measure many of them. But with engineered systems we can figure out most key variables ahead of time. So trials can be run with:

  • Low pilot weight / high pilot weight
  • Low airspeed/high airspeed
  • Low, intermediate, and high altitudes
  • Aircraft at 0 pitch and yaw, all the way to aircraft inverted.
  • Delayed or early seat ejection.

Testing prototypes (and now, finite element simulations) can tell us which conditions are most extreme, so not all corners need full-scale tests.
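The corner-testing idea is easy to sketch in code. Below is a toy Python enumeration of a test matrix over the variables listed above; the specific values are made up for illustration, not real ejection-seat specifications.

```python
import itertools

# Hypothetical corner values for each test variable (illustrative numbers,
# not real ejection-seat specifications).
corners = {
    "pilot_weight_kg": [55, 110],        # low / high pilot weight
    "airspeed_knots": [0, 600],          # low / high airspeed
    "altitude_ft": [0, 20000, 50000],    # low / intermediate / high
    "attitude": ["level", "inverted"],   # 0 pitch and yaw ... inverted
    "ejection_timing": ["early", "delayed"],
}

# Every combination of extreme values is one corner test.
test_matrix = [dict(zip(corners, combo))
               for combo in itertools.product(*corners.values())]
print(len(test_matrix), "corner tests")  # 2 * 2 * 3 * 2 * 2 = 48
```

With prototype or simulation results in hand, engineers can prune this matrix to the corners known to be worst, which is exactly what you cannot do when the key variables of a system are unknown or unmeasurable.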

Of course some of these tests will “fail”; e.g., early ejection seats did not work at low altitude and airspeed. Those limits are then written into pilots’ manuals. That is considerably better than we do with many RCTs, which deliberately choose trial subjects who are healthier than the patients who will ultimately take the medicine.

So let’s stop using this analogy. Parachutes were never adopted without (the equivalent of) RCTs. There are many reasons to adopt masks without years of testing, but this is not one of them.

(I have written more about this in my book draft about the evolution of flying from an art to a science.)

Don’t expect Level 5 Autonomous cars for decades

Why I don’t expect fully autonomous city driving in my lifetime (approximately 25 years).

Paraphrase: The strange and crazy things that people do… a ball bouncing in front of your car, a child falling down, a car running a red light, a head-down pedestrian. A level-5 car has to handle all of these cases, reliably.

These situations require 1) a giant set of learning data, 2) very rapid computing, and 3) severe braking. Autonomous cars today are very slow and very cautious in order to allow more time for decisions and for braking.
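A back-of-envelope sketch of why “slow and cautious” buys safety margin: stopping distance grows with the square of speed, plus the distance covered during sensing and decision latency. The latency and deceleration numbers below are illustrative assumptions, not measurements from any real system.

```python
def stopping_distance_m(speed_mps, reaction_s, decel_mps2):
    """Distance covered while deciding, plus distance to brake to a stop."""
    return speed_mps * reaction_s + speed_mps**2 / (2 * decel_mps2)

# Illustrative numbers: 0.5 s of sensing/compute latency, 7 m/s^2 hard braking.
for kph in (25, 50):
    v = kph / 3.6  # convert km/h to m/s
    d = stopping_distance_m(v, reaction_s=0.5, decel_mps2=7.0)
    print(f"{kph} km/h -> {d:.1f} m to stop")
```

Under these assumptions, doubling the speed roughly triples the stopping distance, which is why cautious low speeds make the rapid-computing and severe-braking problems so much easier.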

My view:

There is no magic bullet that can solve these three problems, except keeping autonomous cars off of city streets. And all three get worse in bad weather, including fog, much less snow.

Also, there are lots of behavioral issues, such as “knowing” the behavior of pedestrians in different cities. Uber discovered that frequent braking/accelerating makes riders carsick – so they re-tuned their safety margins, and their car killed a pedestrian.

A counter-argument (partly from Don Norman, jnd1er): Human drivers are not good at these situations either, and occasionally hit people. Therefore, we should not wait for perfection, but instead deploy systems that are, on balance, better than human drivers. As distracted driving gets worse, the tradeoff will shift in favor of autonomous cars.

But there is another approach to distracted driving. Treat it like drunk driving: make it socially and legally unacceptable. Drunk driving used to be treated as an accident, with very light penalties even in fatal cases.

Finally, I’m not sure that any amount of real-life driving will be good enough to develop training datasets for the rarest edge cases. Developers will need supplemental methods to handle them, including simulated accidents and some causal modeling. For example, the probabilities of different events change by location and time of day. Good drivers know this, and adjust. Perhaps cars will need adjustable parameters that shift their algorithms’ tuning in different circumstances.

Source of the quotation: Experts at the Table: The challenges to build a single chip to handle future autonomous functions of a vehicle span many areas across the design process.

Source: Semiconductor Engineering – Challenges To Building Level 5 Automotive Chips

Semiconductors get old, and eventually die. It’s getting worse.

I once assumed that semiconductors lasted effectively forever. But even electronic devices wear out. How do semiconductor companies plan for aging?

There has never been a really good solution, and according to this article, the problems are getting worse. For example, electronics in cars continue to get more complex (and more safety-critical). But cars are used in very different ways after being sold, and in very different climates. This makes it impossible to predict how fast a particular car will age.

Electromigration
Electromigration is one form of aging. Credit: JoupYoup – Own work, CC BY-SA 4.0

When a device is used constantly in a heavy load model for aging, particular stress patterns exaggerate things. An Uber-like vehicle, whether fully automated or not, has a completely different use model than the standard family car that actually stays parked in a particular state a lot of the time, even though the electronics are always somewhat alive. There’s a completely different aging model and you can’t guard-band both cases correctly.
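To see how strongly the use model matters, here is a first-order sketch using Black’s equation, a standard model for electromigration lifetime (MTTF ∝ J^-n · exp(Ea/kT)). The activation energy, current-density exponent, and junction temperatures below are illustrative assumptions, not figures from the article.

```python
import math

K_B = 8.617e-5  # Boltzmann constant, eV/K

def blacks_mttf(j_rel, temp_k, n=2.0, ea_ev=0.7):
    """Relative median time to electromigration failure (Black's equation).
    j_rel is current density relative to a reference; constants illustrative."""
    return j_rel ** (-n) * math.exp(ea_ev / (K_B * temp_k))

# Illustrative comparison: an always-on robotaxi running hot vs. a commuter
# car that mostly sits parked with its electronics only "somewhat alive".
taxi = blacks_mttf(j_rel=1.0, temp_k=368)      # ~95 C junction, continuous duty
commuter = blacks_mttf(j_rel=1.0, temp_k=318)  # ~45 C effective average
print(f"Commuter/taxi relative lifetime: ~{commuter / taxi:.0f}x")
```

A modest 50-degree difference in effective operating temperature changes the predicted lifetime by more than an order of magnitude under these assumptions, which is why a single guard-band cannot cover both use models.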

Would ‘explainable AI’ force companies to give away too much? Not really.

Here is an argument for allowing companies to maintain a lot of secrecy about how their data mining (AI) models work. The claim is that revealing information would put companies at a competitive disadvantage. Sorry, that is not enough of a reason. And it’s not actually true, as far as I can tell.

The first consideration when discussing transparency in AI should be data, the fuel that powers the algorithms. Because data is the foundation for all AI, it is valid to want to know where the data…

Source: The problem with ‘explainable AI’ | TechCrunch

Here is my response.

Your questions are good ones. But you seem to think that explainability cannot be achieved except by giving away all the work that led to the AI system. That is a straw man. Take deep learning systems, for example. The IP includes:
1) The training set of data
2) The core architecture of the network (number of layers etc)
3) The training procedures over time, including all the testing and tuning that went on.
4) The resulting system (weights, filters, transforms, etc).
5) Higher-level “explanations,” whatever those may be. (For me, these might be a reduced-form model that is approximately linear, and can be interpreted.)

Revealing even #4 would be somewhat useful to competitors, but not decisive. The original developers will be able to update and refine their model, while people with only #4 will not. The same for any of the other elements.
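As one concrete, purely illustrative version of item 5: query the black-box system and fit an interpretable linear surrogate to its outputs. The “black box” below is a toy stand-in function, and the closed-form least-squares fit is ordinary textbook regression, not anyone’s proprietary method.

```python
# A toy version of item 5: approximate a "black box" with an interpretable
# linear model, using only its inputs and outputs.
def black_box(x):
    return 3.0 * x + 0.5 * x * x  # internals unknown; queried only via outputs

xs = [i / 10 for i in range(-10, 11)]   # probe points in [-1, 1]
ys = [black_box(x) for x in xs]

# Ordinary least squares for slope and intercept (closed form).
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

print(f"Surrogate: y ~ {slope:.2f}*x + {intercept:.2f}")
```

The surrogate’s coefficients can be published and debated without revealing the training set, the architecture, or the weights themselves.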

I suspect the main fear about revealing this, at least among for-profit companies, is that it opens them up to second-guessing. For example, what do you want to bet that the systems now being used to predict recidivism have bugs? Someone with enough expertise and money might be able to make intelligent guesses about the bugs, although I don’t see how they could prove them.
Sure, such criticism would make companies more cautious, and cost them money. And big companies might be better able to hide behind layers of lawyers and obfuscation. But those hypothetical problems are quite a distance in the future. Society deserves to know, and should do more to figure out, where these systems have problems. Let’s allow some experiments, and even different laws in different jurisdictions, to go forward for a few years. Preventing this amounts to trusting the self-appointed experts to act in everyone else’s best interests. We know how poorly that works!

Tesla employees say Gigafactory problems worse than known

By now, Tesla’s manufacturing problems are completely predictable. See my explanation, after the break. At least Wall St. is starting to catch on.
Also in this article: Tesla’s gigafactory for batteries has very similar problems. That surprises me; I thought they had competent allies helping with batteries.

But one engineer who works there cautioned that the automated lines still can’t run at full capacity. “There’s no redundancy, so when one thing goes wrong, everything shuts down. And what’s really concerning are the quality issues.”

Source: Tesla employees say Gigafactory problems worse than known


It will be very tricky to test and regulate safety of self-driving cars

My friend Don Norman wrote an op-ed this weekend calling for an FDA-like testing program before autonomous cars are put on the roads in the US. Clearly, some level of government approval is important. But I see lots of problems with using drug testing (FDA = Food and Drug Administration) as a model.

Here is an excerpt from a recent article about testing problems with Uber cars, which were the ones in the recent fatal accident. After the break, my assessment of how to test such cars before they are allowed on American roads.

Waymo, formerly the self-driving car project of Google, said that in tests on roads in California last year, its cars went an average of nearly 5,600 miles before the driver had to take control from the computer to steer out of trouble. As of March, Uber was struggling to meet its target of 13 miles per “intervention” in Arizona, according to 100 pages of company documents obtained by The New York Times and two people familiar with the company’s operations in the Phoenix area but not permitted to speak publicly about it. Yet Uber’s test drivers were being asked to do more — going on solo runs when they had worked in pairs. And there also was pressure to live up to a goal to offer a driverless car service by the end of the year and to impress top executives.

So Uber’s cars performed hundreds of times worse than Waymo’s?!
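The arithmetic, using the figures quoted above:

```python
# Figures from the quoted New York Times reporting.
waymo_miles_per_intervention = 5600  # "nearly 5,600 miles"
uber_miles_per_intervention = 13     # Uber's target, which it struggled to meet

ratio = waymo_miles_per_intervention / uber_miles_per_intervention
print(f"Waymo went ~{ratio:.0f}x farther per intervention")  # ~431x
```

And that understates the gap, since 13 miles was Uber’s target, not its actual performance.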


Hollywood as a model for academic research

Academia has a problem: the value, necessity, and practices of collaboration are increasing, but the system of giving credit is inadequate. In most fields, there are only 4 levels of credit:

  • None at all
  • “Our thanks to Jill for sharing her data.” (a note of thanks)
  • First Authorship (This is ambiguous: it may be alphabetical.)
  • Listed as another author

In contrast to this paucity, modern empirical paper writing has many roles. Here are a dozen roles. Not all of them are important on a single paper, but each of them is important in some papers.

  • Intellectual leadership
    • Source of the original idea
  • Doing the writing
    • Writing various parts, e.g. the literature review
    • Doing the grunt work on the statistical analysis (writing and running the R code)
    • Doing the grunt work of finalizing for publication (much easier than it used to be!)
    • Dealing with revisions, exchanges with editors, etc.
  • Source of the data
    • Funder of the data
  • Raising the funding
    • Running the lab where the authors are employed
    • Source of the money: usually an agency or foundation, but sometimes the contracting author is listed as a coauthor.
