Under-reporting of clinical trials has been a problem for decades, if not longer. Only in the last few years has the medical community realized the pernicious effects this has on our knowledge about “what works” in medicine. If “bad” results don’t get published, all kinds of problems ensue, such as overly optimistic views of new drugs, repetition of expensive and potentially dangerous research, and general waste of money. Since the NIH is such a big funder of medical research, this affects taxpayers too!
In any case, the NIH continues its slow (but steady?) crackdown on this issue. They are even threatening to cut off funding for researchers who don’t make their results available! (Of course a lot of research is funded by pharmaceutical companies, so this is hardly a comprehensive threat.)
I track this kind of thing because of my interest in “How societies learn” about technology. Forgetting and ignoring are powerful forces in retarding learning.
JAMA. Published online November 19, 2014. doi:10.1001/jama.2014.10716
Machine-Learning Maestro Michael Jordan on the Delusions of Big Data and Other Huge Engineering Efforts – IEEE Spectrum.
I agree 100% with the following discussion of big data learning methods, which is excerpted from an interview. Big Data is still in the ascending phase of the hype cycle, and its abilities are being way over-promised. In addition, there is a great shortage of expertise. Even people who take my course on the subject are only learning “enough to be dangerous.” It will take them months more of applied work to begin to develop reasonable instincts, and appropriate skepticism.
As we are now realizing, standard econometrics/regression analysis has many of the same problems, such as publication bias and excessive re-use of data. And one can argue that its effects, e.g. in health care, have also been overblown to the point of being dangerous. (In particular, the randomized controlled trials approach to evaluating pharmaceuticals is much too optimistic about evaluating side effects. I’ve posted messages about this before.) The important difference is that now the popular press has adopted Big Data as its miracle du jour.
One result is excess credulity. On the NPR Marketplace program recently, they had a breathless story about The Weather Channel, and its ability to forecast amazing things using big data. The specific example was that certain weather conditions in Miami in January predict raspberry sales. What nonsense. How many Januaries of raspberry sales can they be basing that relationship on? 3? 10?
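A back-of-the-envelope simulation makes the point. The numbers below are hypothetical, not The Weather Channel’s data: with only five Januaries of sales, and many candidate weather variables to screen, pure random noise will almost always yield at least one “strong” weather-sales correlation.

```python
import random

random.seed(0)

def pearson(x, y):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Five "Januaries" of raspberry sales -- pure noise, no real signal.
sales = [random.gauss(0, 1) for _ in range(5)]

# Screen 100 unrelated "weather variables" against that one sales series
# and keep the strongest correlation found.
best = max(
    abs(pearson([random.gauss(0, 1) for _ in range(5)], sales))
    for _ in range(100)
)
# With n=5 and 100 candidates, the best |r| is almost certainly > 0.8
# despite there being zero real relationship.
print(f"Strongest 'correlation' found in pure noise: {best:.2f}")
```

The forecast sounds impressive until you ask how many variables were screened to find it.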
Why Big Data Could Be a Big Fail [this is the headline that the interviewee objected to – see below]
Spectrum: If we could turn now to the subject of big data, a theme that runs through your remarks is that there is a certain fool’s gold element to our current obsession with it. For example, you’ve predicted that society is about to experience an epidemic of false positives coming out of big-data projects.
Michael Jordan: When you have large amounts of data, your appetite for hypotheses tends to get even larger. And if it’s growing faster than the statistical strength of the data, then many of your inferences are likely to be false. They are likely to be white noise.
Spectrum: How so?
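Jordan’s point about hypotheses outgrowing the statistical strength of the data is easy to demonstrate. In this sketch (my own illustration, not from the interview), we test 1,000 hypotheses that are all false by construction; at the conventional 5% significance threshold, dozens of them will look like discoveries.

```python
import random

random.seed(1)

# 1,000 true-null "hypotheses": for each, compare the means of two
# pure-noise samples and declare "significant" if they differ by more
# than 1.96 standard errors (the usual two-sided 5% test).
n, trials, hits = 50, 1000, 0
se = (2 / n) ** 0.5  # standard error of the difference of two means
for _ in range(trials):
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    if abs(sum(a) / n - sum(b) / n) > 1.96 * se:
        hits += 1

# Roughly 5% of null hypotheses pass -- about 50 spurious "findings."
print(f"{hits} of {trials} null hypotheses look 'significant'")
```

Multiply that by thousands of big-data projects, each screening far more than 1,000 hypotheses, and the epidemic of false positives follows.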
Time to debunk another widely covered press story about wonderful new inventions coming from a tech giant. Ars Technica had one of many articles about Google’s “announcement” of a blood glucose sensor in a contact lens. The discussion after the article is good, as often happens with Ars. Here’s my quick explanation of why the concept will fail. Unfortunately.
Non-invasive glucose testing is the perennial “pot of gold at the end of the rainbow.” Google is not the first to try using tears; the others have failed, and they will too. They say it is “5 years away,” which is equivalent to saying “We have not yet tested it on real diabetics.”
The problem is basically that tears won’t track blood glucose levels closely. Tears are secreted by the lacrimal gland. I’ve never studied it, but the composition of its secretion is sure to depend on a multitude of variables. (Think: sweat, saliva, etc.) Even if a relationship exists and can be quantified “on average,” there will be lags.
It’s possible that a device like this could supplement other measurement systems. But nothing will be as good as actual blood measurements. Therefore finger sticks will always be needed for calibration. The best realistic case is that a contact lens device could serve as an early warning; but finger sticks will still be needed for validation before taking any action.
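The lag problem can be sketched numerically. The model below is a hypothetical illustration of my argument, not measured physiology: if tear glucose equilibrates slowly toward blood glucose (here, 30% per reading, an assumed constant), a tear sensor can still read “safe” while blood glucose has already crashed.

```python
# Hypothetical: blood glucose dropping rapidly toward hypoglycemia (mg/dL).
blood = [180, 160, 130, 95, 70, 55, 50]

# Model tear glucose as a lagged, smoothed proxy of blood glucose.
# alpha is an assumed per-reading equilibration fraction, not a
# measured physiological constant.
alpha = 0.3
tear = [blood[0]]
for g in blood[1:]:
    tear.append(tear[-1] + alpha * (g - tear[-1]))

# The tear-based estimate lags far behind the true value:
print(f"blood={blood[-1]} mg/dL, tear-based estimate={tear[-1]:.0f} mg/dL")
# -> blood=50 mg/dL, tear-based estimate=85 mg/dL
```

At the final reading the blood value is dangerously low while the lagged proxy still looks nearly normal, which is why a finger-stick confirmation would remain necessary before acting.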
via Google introduces smart contact lens project to measure glucose levels | Ars Technica.
This column in Scientific American from a 30-year veteran of science journalism has some good perspective on the ongoing controversy about non-replicability of so many scientific results. I wish I knew a system solution.
Discussing his findings in Scientific American two years ago, Ioannidis writes: “False positives and exaggerated results in peer-reviewed scientific studies have reached epidemic proportions in recent years. The problem is rampant in economics, the social sciences and even the natural sciences, but it is particularly egregious in biomedicine.”
A Dig Through Old Files Reminds Me Why I’m So Critical of Science — blogs.scientificamerican.com
Software, Design Defects Cripple Health-Care Website – WSJ.com.
Poor software design is still common. I notice the developer was Experian, a private company. Outsourcing the web system for the Affordable Care Act was the right idea, but it looks like they picked a weak company.
It will be interesting to get a post-mortem in a year or two. I hope someone writes it up for the New Yorker. It should make a good case study on software product development.
System is down…
This is the “entry page” for my paper on the slow adoption of better flying methods in WW 2. Please link to this page, rather than to the actual PDF, which I will be updating. Here is the paper itself. (July 19 version)
In the late 1930s, US military aviators in the American Army and Navy began using aviation checklists. Checklists became part of a new paradigm for how to fly, which I call Standard Procedure Flying, colloquially known as “flying by the book.” It consisted of elaborate standardized procedures for many activities, checklists to ensure the key steps had been done, and quantitative tables and formulas that specified the best settings, under different conditions, for speed, engine RPM, gasoline/air mixture, engine cooling, and many other parameters. This new paradigm had a major influence on reducing aviation accidents and increasing military effectiveness during World War II, particularly because of the rapidly increasing complexity of military aircraft, and the huge number of new pilots.
B-17 Throttles (Photo credit: rkbentley)
On Sunday I gave a capstone talk at the Production & Operations Society meeting in Denver. I oriented my talk toward a comparison of health care now, with aviation’s transition to Standard Procedure Flying in the 1940s and 50s. BOHN POMS Standard procedure flying 2013e
As in medicine now, experienced expert flyers who did not use standard procedures were still better than newly trained pilots who did. And there was resistance to the changes. But aviation had a couple of advantages in making the transition: new pilots who did not learn SPF died quickly, usually in accidents. And the old experts either got rotated out of combat positions (in the United States Army Air Force) or eventually got shot down no matter how good they were (in Germany).