This short interview has some good explanations.
LeCun: Actually, I think the basics of machine learning are quite simple to understand….
A pattern recognition system is like a black box with a camera at one end, a green light and a red light on top, and a whole bunch of knobs on the front. The learning algorithm tries to adjust the knobs so that when, say, a dog is in front of the camera, the red light turns on, and when a car is put in front of the camera, the green light turns on. You show a dog to the machine. If the red light is bright, don’t do anything. If it’s dim, tweak the knobs so that the light gets brighter. If the green light turns on, tweak the knobs so that it gets dimmer. Then show a car, and tweak the knobs so that the red light gets dimmer and the green light gets brighter. If you show many examples of the cars and dogs, and you keep adjusting the knobs just a little bit each time, eventually the machine will get the right answer every time.
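The knob-tweaking loop in that analogy can be sketched in a few lines of Python. This is only a toy illustration, not anything from the interview: the two features ("furriness" and "shininess"), their values, the step size, and the number of passes are all invented, and the "box" is a single logistic unit rather than a deep network.

```python
import math

# Toy examples: (furriness, shininess) -> 1.0 for "dog" (red light), 0.0 for "car" (green light).
# The two features and their values are invented for illustration.
EXAMPLES = [
    ((1.0, 0.2), 1.0),
    ((0.2, 0.9), 0.0),
    ((0.9, 0.1), 1.0),
    ((0.1, 1.0), 0.0),
    ((0.8, 0.3), 1.0),
    ((0.3, 0.8), 0.0),
]


def red_brightness(knobs, bias, features):
    """Brightness of the red light, between 0 and 1; green is 1 minus this."""
    score = bias + sum(k * f for k, f in zip(knobs, features))
    return 1.0 / (1.0 + math.exp(-score))


def train(examples, passes=200, step=0.1):
    """Show every example many times, nudging each knob a little each time."""
    knobs, bias = [0.0, 0.0], 0.0
    for _ in range(passes):
        for features, target in examples:
            # Positive error: red is too dim. Negative error: red is too bright.
            error = target - red_brightness(knobs, bias, features)
            knobs = [k + step * error * f for k, f in zip(knobs, features)]
            bias += step * error
    return knobs, bias


knobs, bias = train(EXAMPLES)
for features, target in EXAMPLES:
    label = "dog" if target == 1.0 else "car"
    print(label, round(red_brightness(knobs, bias, features), 2))
```

In this toy setup the red light ends up brighter than the green one for every dog and dimmer for every car, but only because the invented examples are cleanly separable. Real images are nowhere near that tidy, which is why real systems need far more knobs and far more examples.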
Why unsupervised learning is critical in the long run, but does not yet work:
The type of learning that we use in actual Deep Learning systems is very restricted. What works in practice in Deep Learning is “supervised” learning. You show a picture to the system, and you tell it it’s a car, and it adjusts its parameters to say “car” next time around. Then you show it a chair. Then a person. And after a few million examples, and after several days or weeks of computing time, depending on the size of the system, it figures it out.
Now, humans and animals don’t learn this way. You’re not told the name of every object you look at when you’re a baby. And yet the notion of objects, the notion that the world is three-dimensional, the notion that when I put an object behind another one, the object is still there—you actually learn those. You’re not born with these concepts; you learn them. We call that type of learning “unsupervised” learning.
Facebook AI Director Yann LeCun on His Quest to Unleash Deep Learning and Make Machines Smarter – IEEE Spectrum.
When the doctor’s away, the patient is more likely to survive | Ars Technica.
Very surprising. When cardiologists are away from the hospital, deaths after heart failure or cardiac arrest decline. I’ll probably use this in my course this Spring. (Or perhaps in both courses: Big Data, and Operations Quality in Healthcare.)
No, a study did not link GM crops to 22 diseases.
And a candidate for worst graph of the year, appearing to show that deaths from a certain class of diseases grew in parallel with some farming trends. (Figure 16 in the article, which is at http://www.organic-systems.org/journal/92/JOS_Volume-9_Number-2_Nov_2014-Swanson-et-al.pdf ). Any two steadily increasing time series can be plotted so that they lie approximately on top of each other, if you distort the scales enough. Other “causes” they could have plotted, with approximately the same results: cell phones per capita, percentage of cars on the road with ABS brakes, and (for all I know) average campaign spending per Congressional race.
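The point about trending series is easy to check numerically. The sketch below uses two made-up series that are generated independently of each other (a straight-line trend plus random noise each; the slopes, noise levels, and seed are arbitrary assumptions), then compares the correlation of the raw levels with the correlation of the year-over-year changes.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def changes(series):
    """Year-over-year differences, which remove a steady trend."""
    return [b - a for a, b in zip(series, series[1:])]


rng = random.Random(0)  # fixed seed so the run is repeatable
years = range(20)
# Two made-up series generated independently: each is a straight-line trend plus noise.
series_a = [2.0 * t + rng.gauss(0, 1) for t in years]
series_b = [50.0 + 5.0 * t + rng.gauss(0, 3) for t in years]

r_levels = pearson(series_a, series_b)
r_changes = pearson(changes(series_a), changes(series_b))
print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

The levels correlate very strongly simply because both series go up over time. Once the trend is differenced away, there is no reason for the two to move together, which is a reasonable first sanity check before reading anything causal into a pair of rising lines.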
Under-reporting of clinical trials has been a problem for decades (if not more). Only in the last few years has the medical community realized the pernicious effects this has on our knowledge about “what works” in medicine. If “bad” results don’t get published, all kinds of problems ensue, such as overly optimistic views of new drugs, repetition of expensive and potentially dangerous research, and general waste of money. Since the NIH is such a big funder of medical research, this affects taxpayers too!
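A small simulation shows how selective reporting produces overly optimistic views. Everything here is an assumption chosen for illustration: a drug with a small true effect, many independent trials that each measure it with noise, and a "publication" rule that keeps only trials whose estimate is positive and roughly significant at the 5% level.

```python
import random


def simulate_trials(true_effect=0.1, std_error=0.2, n_trials=2000, seed=2):
    """Each trial reports the true effect plus measurement noise (all values hypothetical)."""
    rng = random.Random(seed)
    return [rng.gauss(true_effect, std_error) for _ in range(n_trials)]


def mean(values):
    return sum(values) / len(values)


TRUE_EFFECT = 0.1
STD_ERROR = 0.2

estimates = simulate_trials(TRUE_EFFECT, STD_ERROR)
# Assumed publication rule: only positive, "statistically significant" results appear in print.
published = [e for e in estimates if e > 1.96 * STD_ERROR]

print(f"true effect:              {TRUE_EFFECT:.2f}")
print(f"mean of all trials:       {mean(estimates):.2f}")
print(f"mean of published trials: {mean(published):.2f}")
print(f"share published:          {len(published) / len(estimates):.0%}")
```

Averaged over all trials, the estimate sits close to the true effect. Averaged over the published ones alone, it has to come out several times larger, because under this rule every published estimate exceeds 1.96 standard errors. Anyone reading only the literature would badly overrate the drug.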
In any case, the NIH continues its slow (but steady?) crackdown on this issue. They are even threatening to cut off funding for researchers who don’t make their results available! (Of course a lot of research is funded by pharmaceutical companies, so this is hardly a comprehensive threat.)
I track this kind of thing because of my interest in “How societies learn” about technology. Forgetting and ignoring are powerful forces in retarding learning.
JAMA. Published online November 19, 2014. doi:10.1001/jama.2014.10716
Machine-Learning Maestro Michael Jordan on the Delusions of Big Data and Other Huge Engineering Efforts – IEEE Spectrum.
I agree 100% with the following discussion of big data learning methods, which is excerpted from an interview. Big Data is still in the ascending phase of the hype cycle, and its abilities are being way over-promised. In addition, there is a great shortage of expertise. Even people who take my course on the subject are only learning “enough to be dangerous.” It will take them months more of applied work to begin to develop reasonable instincts, and appropriate skepticism.
As we are now realizing, standard econometrics/regression analysis has many of the same problems, such as publication biases and excess re-use of data. And one can argue that its effects, e.g. in health care, have also been overblown to the point of being dangerous. (In particular, the randomized controlled trials approach to evaluating pharmaceuticals is much too optimistic about evaluating side effects. I’ve posted messages about this before.) The important difference is that now the popular press has adopted Big Data as its miracle du jour.
One result is excess credulity. On the NPR Marketplace program recently, they had a breathless story about The Weather Channel, and its ability to forecast amazing things using big data. The specific example was that certain weather conditions in Miami in January predict raspberry sales. What nonsense. How many Januaries of raspberry sales can they be basing that relationship on? 3? 10?
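To see why ten Januaries is so little to go on, here is a rough simulation. It assumes (hypothetically) that someone screens 1,000 candidate weather variables against 10 years of sales, and that every one of them is pure noise with no relationship to sales at all. The cutoff of 0.632 is the standard two-sided 5% critical value for a correlation coefficient with 10 observations.

```python
import random

CRITICAL_R = 0.632  # |r| needed for p < 0.05 (two-sided) with 10 observations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def count_chance_hits(n_years=10, n_candidates=1000, seed=1):
    """Count noise-only predictors that look 'significant' against noise-only sales."""
    rng = random.Random(seed)
    sales = [rng.gauss(0, 1) for _ in range(n_years)]
    hits = 0
    for _ in range(n_candidates):
        predictor = [rng.gauss(0, 1) for _ in range(n_years)]
        if abs(pearson(predictor, sales)) > CRITICAL_R:
            hits += 1
    return hits


hits = count_chance_hits()
print(f"{hits} of 1000 meaningless predictors passed the 5% significance test")
```

Roughly one in twenty of the meaningless predictors should pass, so a broad enough search is all but guaranteed to turn up a few dozen "amazing" relationships. I have no idea how many variables The Weather Channel actually screened; the 1,000 is just a stand-in.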
Why Big Data Could Be a Big Fail [this is the headline that the interviewee objected to – see below]
Spectrum: If we could turn now to the subject of big data, a theme that runs through your remarks is that there is a certain fool’s gold element to our current obsession with it. For example, you’ve predicted that society is about to experience an epidemic of false positives coming out of big-data projects.
Michael Jordan: When you have large amounts of data, your appetite for hypotheses tends to get even larger. And if it’s growing faster than the statistical strength of the data, then many of your inferences are likely to be false. They are likely to be white noise.
Spectrum: How so?
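(My own illustration, not part of the interview.) One simple way to put numbers on Jordan's point: if each hypothesis is tested at the usual 5% level, and we assume for simplicity that the tests are independent and that none of the hypotheses is actually true, the chance of at least one false positive climbs quickly with the number of hypotheses tried.

```python
def chance_of_any_false_positive(n_hypotheses, alpha=0.05):
    """P(at least one false positive) across independent tests of true-null hypotheses."""
    return 1.0 - (1.0 - alpha) ** n_hypotheses


for m in (1, 10, 100, 1000):
    print(m, round(chance_of_any_false_positive(m), 3))
```

Real hypotheses are rarely independent, so treat this as a back-of-the-envelope bound on intuition rather than a precise rate. The qualitative message survives: unless the evidence threshold tightens as the list of hypotheses grows, a false discovery becomes close to certain.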
The Ph.D. Student’s Ticking Clock – Graduate Students – The Chronicle of Higher Education.
Many of my former students come back years later and ask my advice about getting a PhD. I generally tell them that a PhD program is like a monastery – you have to love the pursuit of knowledge, for its own sake, to make it bearable. If you are doing it only in pursuit of a post-graduation goal, it is too hard a life.
This article includes a startling graph on time-to-graduation. I graduated from MIT in 1982, after 4 years. According to the graph, the average time in social sciences then was 8 years?! I had a lot of breaks (NSF Fellowship, stipend from one of my thesis advisors, pregnant wife to provide emotional support and incentive!) but 4 to 5 years seemed like the norm in my program.
In any case, the second half of the article has some realistic advice about the stresses of protracted graduate programs, and about the importance of your particular advisor’s style.
This post is for students who want to take my course, Technology and Operations Management, IRGN438, but have not been able to register. Here is the syllabus. Take a careful look, and realize that it involves a considerable amount of work. If you want permission to take the course, please send me an email with: