If you are looking for information about my upcoming Big Data course, which starts on April 2, 2018, it is in a different blog. Please go here to learn about the textbooks, and to see how the course worked last year.
Here is an argument for allowing companies to maintain a lot of secrecy about how their data mining (AI) models work. The claim is that revealing information will put companies at a competitive disadvantage. Sorry, that is not enough of a reason. And it’s not actually true, as far as I can tell.
The first consideration when discussing transparency in AI should be data, the fuel that powers the algorithms. Because data is the foundation for all AI, it is valid to want to know where the data…
Here is my response.
Your questions are good ones. But you seem to think that explainability cannot be achieved except by giving away all the work that led to the AI system. That is a straw man. Take deep systems, for example. The IP includes:
1) The training set of data
2) The core architecture of the network (number of layers etc)
3) The training procedures over time, including all the testing and tuning that went on.
4) The resulting system (weights, filters, transforms, etc).
5) Higher-level “explanations,” whatever those may be. (For me, these might be a reduced-form model that is approximately linear, and can be interpreted.)
Revealing even #4 would be somewhat useful to competitors, but not decisive. The original developers will be able to update and refine their model, while people with only #4 will not. The same for any of the other elements.
I suspect the main fear about revealing this, at least among for-profit companies, is that it opens them up to second-guessing. For example, what do you want to bet that the systems now being used to predict recidivism have bugs? Someone with enough expertise and $ might be able to make intelligent guesses about bugs, although I don’t see how they could prove them.
Sure, such criticism would make companies more cautious, and cost them money. And big companies might be better able to hide behind layers of lawyers and obfuscation. But those hypothetical problems are quite a distance in the future. Society deserves to, and should, do more to figure out where these systems have problems. Let’s allow some experiments, and even some different laws in different jurisdictions, to go forward for a few years. To prevent this is just trusting the self-appointed experts to do what is in everyone else’s best interests. We know that works poorly!
It sounds like what we used to call a “bug” to me. I guess bugs are now promoted to “algorithm failures”.
Nearly half a million elderly women in the United Kingdom missed mammography exams because of a scheduling error caused by one incorrect computer algorithm, and several hundred of those women may have died early as a result. Last week, the U.K. Health Minister Jeremy Hunt announced that an independent inquiry had been launched to determine how a “computer algorithm failure” stretching back to 2009 caused some 450,000 patients in England between the ages of 68 to 71 to not be invited for their final breast cancer screenings.
The errant algorithm was in the National Health System’s (NHS) breast cancer screening scheduling software, and remained undiscovered for nine years.
“Tragically, there are likely to be some people in this group who would have been alive today if the failure had not happened,” Hunt went on to tell Parliament. He added that based on statistical modeling, the number who may have died prematurely as a result was estimated to be between 135 and 270 women.
There is a lot of concern about AI potentially causing massive unemployment. The question of whether “this time will be different” is still open. But another insidious effect is gaining speed: putting tools in the hands of large companies that make it more expensive and more oppressive to run into financial trouble. In essence, harder to live on the edges of “The System.”
- Cars with even one late payment can be spotted, and repossessed, faster. “Business has more than doubled since 2014….” This is during a period of ostensible economic growth.
- “Even with the rising deployment of remote engine cutoffs and GPS locators in cars, repo agencies remain dominant. … Agents are finding repos they never would have a few years ago.”
- “So much of America is just a heartbeat away from a repossession — even good people, decent people who aren’t deadbeats,” said Patrick Altes, a veteran agent in Daytona Beach, Fla. “It seems like a different environment than it’s ever been.”
- “The company’s goal is to capture every plate in Ohio and use that information to reveal patterns. A plate shot outside an apartment at 5 a.m. tells you that’s probably where the driver spends the night, no matter their listed home address. So when a repo order comes in for a car, the agent already knows where to look.”
- Source: The surprising return of the repo man – The Washington Post
A nice graphical illustration of what happened when NYC subway rules were changed in seemingly small ways. The time/distance buffers that used to exist between consecutive trains shrank, to the point that a small “blip” causes cascading effects in subsequent trains. TOM once more. (Thanks to Arpita Verghese.)
My friend at NYU, Prof. Melissa Schilling (thanks, Oscar), and I have a running debate about Tesla. She emphasizes how smart and genuinely innovative Musk is. I emphasize how he seems to treat Tesla like another R&D-driven company – but it is making a very different product. Melissa is quoted in this article:
Case in point: Tesla sent workers home, with no pay, for the production shutdown last week. My discussion is after the break.
During the pause, workers can choose to use vacation days or stay home without pay. This is the second such temporary shutdown in three months for a vehicle that’s already significantly behind schedule.
By now, Tesla’s manufacturing problems are completely predictable. See my explanation, after the break. At least Wall St. is starting to catch on.
Also in this article: Tesla’s gigafactory for batteries has very similar problems. That surprises me; I thought they had competent allies helping with batteries.
But one engineer who works there cautioned that the automated lines still can’t run at full capacity. “There’s no redundancy, so when one thing goes wrong, everything shuts down. And what’s really concerning are the quality issues.”
My friend Don Norman wrote an op-ed this weekend calling for an FDA-like testing program before autonomous cars are put on the roads in the US. Clearly, some level of government approval is important. But I see lots of problems with using drug testing (FDA = Food and Drug Administration) as a model.
Here is an excerpt from a recent article about testing problems with Uber cars, which were the ones in the recent fatal accident. After the break, my assessment of how to test such cars before they are allowed on American roads.
Waymo, formerly the self-driving car project of Google, said that in tests on roads in California last year, its cars went an average of nearly 5,600 miles before the driver had to take control from the computer to steer out of trouble. As of March, Uber was struggling to meet its target of 13 miles per “intervention” in Arizona, according to 100 pages of company documents obtained by The New York Times and two people familiar with the company’s operations in the Phoenix area but not permitted to speak publicly about it. Yet Uber’s test drivers were being asked to do more — going on solo runs when they had worked in pairs. And there also was pressure to live up to a goal to offer a driverless car service by the end of the year and to impress top executives.
So Uber car performance was more than 100 times worse than Waymo cars?!
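To put a rough number on that comparison, here is a back-of-the-envelope calculation using only the two figures quoted above. It is a sketch, not a rigorous benchmark: both inputs are approximate ("nearly 5,600" miles for Waymo, a 13-mile target that Uber was struggling to meet), they come from different states, and the two companies may not count interventions the same way. The variable and function names are my own.

```python
# Figures quoted in the excerpt above; both are approximate.
WAYMO_MILES_PER_INTERVENTION = 5600  # "nearly 5,600 miles" in California tests
UBER_MILES_PER_INTERVENTION = 13     # Uber's Arizona target, which it was struggling to meet


def intervention_ratio(better: float, worse: float) -> float:
    """Return how many times farther the better car goes per intervention."""
    return better / worse


ratio = intervention_ratio(WAYMO_MILES_PER_INTERVENTION, UBER_MILES_PER_INTERVENTION)
print(f"Waymo went roughly {ratio:.0f} times farther per intervention")
```

On these numbers the gap is closer to 400 times than 100 times, and since Uber was not yet reliably hitting 13 miles, the true gap may have been larger.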