(2017-10-27) The Coming Software Apocalypse

The Coming Software Apocalypse

There were six hours during the night of April 10, 2014, when the entire population of Washington State had no 911 service.

The 911 outage, at the time the largest ever reported, was traced to software running on a server in Englewood, Colorado.

Not long ago, emergency calls were handled locally. Outages were small and easily diagnosed and fixed. The rise of cellphones and the promise of new capabilities—what if you could text 911? or send videos to the dispatcher?—drove the development of a more complex system that relied on the internet. For the first time, there could be such a thing as a national 911 outage. There have now been four in as many years.

When we had electromechanical systems, we used to be able to test them exhaustively,” says Nancy Leveson,

The problem,” Leveson wrote in a book, “is that we are attempting to build systems that are beyond our ability to intellectually manage.”

Our standard framework for thinking about engineering failures—reflected, for instance, in regulations for medical devices—was developed shortly after World War II, before the advent of software, for electromechanical systems

But software doesn’t break.

The attempts now underway to change how we make software all seem to start with the same premise: Code is too hard to think about. Before trying to understand the attempts themselves, then, it’s worth understanding why this might be: what it is about code that makes it so foreign to the mind, and so unlike anything that came before it.

Software has enabled us to make the most intricate machines that have ever existed.

“The problem is that software engineers don’t understand the problem they’re trying to solve, and don’t care to,” says Leveson, the MIT software-safety expert. The reason is that they’re too wrapped up in getting their code to work.

In September 2007, Jean Bookout was driving on the highway with her best friend in a Toyota Camry when the accelerator seemed to get stuck

Toyota blamed the incidents on poorly designed floor mats, “sticky” pedals, and driver error, but outsiders suspected that faulty software might be responsible.

Using the same model as the Camry involved in the accident, Barr’s team demonstrated that there were more than 10 million ways for key tasks on the onboard computer to fail, potentially leading to unintended acceleration

Visual Studio is one of the single largest pieces of software in the world,” he said. “It’s over 55 million lines of code. And one of the things that I found out in this study is more than 98 percent of it is completely irrelevant. All this work had been put into this thing, but it missed the fundamental problems that people faced. And the biggest one that I took away from it was that basically people are playing computer inside their head.” Programmers were like chess players trying to play with a blindfold on—so much of their mental energy is spent just trying to picture where the pieces are that there’s hardly any left over to think about the game itself.

John Resig had been noticing the same thing among his students. Resig is a celebrated programmer of JavaScript—software he wrote powers over half of all websites—and a tech lead at the online-education site Khan Academy.

They had both just seen the same remarkable talk, given to a group of software-engineering students in a Montreal hotel by a computer researcher named Bret Victor. The talk, which went viral when it was posted online in February 2012, seemed to be making two bold claims. The first was that the way we make software is fundamentally broken. The second was that Victor knew how to fix it.

Though he runs a lab that studies the future of computing, he seems less interested in technology per se than in the minds of the people who use it.

work at a company that develops music synthesizers. It was a problem perfectly matched to his dual personality: He could spend as much time thinking about the way a performer makes music with a keyboard—the way it becomes an extension of their hands—as he could thinking about the mathematics of digital signal processing.

By the time he gave the talk that made his name, the one that Resig and Granger saw in early 2012, Victor had finally landed upon the principle that seemed to thread through all of his work. (He actually called the talk “Inventing on Principle.”) The principle was this: “Creators need an immediate connection to what they’re creating.” The problem with programming was that it violated the principle

There was precedent enough to suggest that this wasn’t a crazy idea. Photoshop, for instance, puts powerful image-processing algorithms in the hands of people who might not even know what an algorithm is. It’s a complicated piece of software, but complicated in the way a good synth is complicated, with knobs and buttons and sliders that the user learns to play like an instrument.

His demos were virtuosic. The one that captured everyone’s imagination was, ironically enough, the one that on its face was the most trivial. It showed a split screen with a game that looked like Mario on one side and the code that controlled it on the other

When John Resig saw the “Inventing on Principle” talk, he scrapped his plans for the Khan Academy programming curriculum. He wanted the site’s programming exercises to work just like Victor’s demos.

Khan Academy has become perhaps the largest computer-programming class in the world, with a million students, on average, actively using the program each month.

Chris Granger, who had worked at Microsoft on Visual Studio, was likewise inspired.

In April of 2012, he sought funding for Light Table on Kickstarter

But seeing the impact that his talk ended up having, Bret Victor was disillusioned. “A lot of those things seemed like misinterpretations of what I was saying...”

“Everyone thought I was interested in programming environments,” he said. Really he was interested in how people see and understand systems—as he puts it, in the “visual representation of dynamic behavior.” Although code had increasingly become the tool of choice for creating dynamic behavior, it remained one of the worst tools for understanding it

In a pair of later talks, “Stop Drawing Dead Fish” and “Drawing Dynamic Visualizations,” Victor went one further. He demoed two programs he’d built—the first for animators, the second for scientists trying to visualize their data—each of which took a process that used to involve writing lots of custom code and reduced it to playing around in a WYSIWYG interface

“I’m not sure that programming has to exist at all,” he told me. “Or at least software developers.” In his mind, a software developer’s proper role was to create tools that removed the need for software developers.

they should focus on scientists and engineers—as he put it to me, “these people that are doing work that actually matters, and critically matters, and using really, really bad tools.” Exciting work of this sort, in particular a class of tools for “model-based design,” was already underway, he wrote, and had been for years, but most programmers knew nothing about it.

Bantégnie’s company is one of the pioneers in the industrial use of model-based design, in which you no longer write code directly. Instead, you create a kind of flowchart that describes the rules your program should follow (the “model”), and the computer generates code for you based on those rules.

Of course, for this approach to succeed, much of the work has to be done well before the project even begins. Someone first has to build a tool for developing models that are natural for people—that feel just like the notes and drawings they’d make on their own—while still being unambiguous enough for a computer to understand

Esterel Technologies, which was acquired by ANSYS in 2012, grew out of research begun in the 1980s by the French nuclear and aerospace industries

Coordinating these systems to fly the plane as data poured in from sensors and as pilots entered commands required a symphony of perfectly timed reactions. “The handling of these hundreds of and even thousands of possible events in the right order, at the right time,” Ledinot says, “was diagnosed as the main cause of the bug inflation.”

He began collaborating with Gerard Berry, a computer scientist at INRIA, the French computing-research center, on a tool called Esterel—a portmanteau of the French for “real-time.”

Ledinot and Berry worked for nearly 10 years to get Esterel to the point where it could be used in production.

The FAA is fanatical about software safety. The agency mandates that every requirement for a piece of safety-critical software be traceable to the lines of code that implement it, and vice versa.

before they used model-based design, on a two-year-long project only two to three months was spent writing code—the rest was spent working on the documentation

In 2011, Chris Newcombe had been working at Amazon for almost seven years

in the appendix of a paper he’d been reading, he came across a strange mixture of math and code—or what looked like code—that described an algorithm in something called “TLA+.”

TLA+, which stands for “Temporal Logic of Actions,” is similar in spirit to model-based design: It’s a language for writing down the requirements—TLA+ calls them “specifications”—of computer programs

The language was invented by Leslie Lamport, a Turing Award–winning computer scientist. With a big white beard and scruffy white hair, and kind eyes behind large glasses, Lamport looks like he might be one of the friendlier professors at the American Hogwarts. Now at Microsoft Research, he is known as one of the pioneers of the theory of “distributed systems,”

Newcombe and his colleagues at Amazon would go on to use TLA+ to find subtle, critical bugs in major systems, including bugs in the core algorithms behind S3

Most programmers aren’t very fluent in the kind of math—logic and set theory, mostly—that you need to work with TLA+.

. “In the 15th century,” he said, “people used to build cathedrals without knowing calculus, and nowadays I don’t think you’d allow anyone to build a cathedral without knowing calculus

The real problem in getting people to use TLA+, he said, was convincing them it wouldn’t be a waste of their time. Programmers, as a species, are relentlessly pragmatic. Tools like TLA+ reek of the ivory tower.

Newcombe isn’t so sure that it’s the programmer who is to blame.

when he was introducing colleagues at Amazon to TLA+ he would avoid telling them what it stood for

he presented TLA+ as a new kind of “pseudocode,”

“Engineers think in terms of debugging rather than ‘verification,’” he wrote, so he titled his internal talk on the subject to fellow Amazon engineers “Debugging Designs.”

He has since left Amazon for Oracle.

In the summer of 2015, a pair of American security researchers, Charlie Miller and Chris Valasek, convinced that car manufacturers weren’t taking software flaws seriously enough, demonstrated that a 2014 Jeep Cherokee could be remotely controlled by hackers

“We need to think about software differently,” Valasek told me. Car companies have long assembled their final product from parts made by hundreds of different suppliers. But where those parts were once purely mechanical, they now, as often as not, come with millions of lines of code

“I think the autonomous car (Self-Driving Car) might push them,” Ledinot told me—“ISO 26262 and the autonomous car might slowly push them to adopt this kind of approach on critical parts.” (ISO 26262 is a safety standard for cars published in 2011.)


Edited:    |       |    Search Twitter for discussion