Software quality: It can be a matter of life and death

by Taylor Armerding on September 3, 2019

Safety-critical software powers everything from airplanes to power plants, defib machines, and seatbelts. And quality issues can lead to injury and death.

Was there ever any doubt that defective software could kill people—lots of people? If there was, two catastrophic Boeing 737 MAX 8 jetliner crashes erased it.

But so far, there is scant evidence that these tragedies have revolutionized software development. That’s a tragedy in itself, given the availability of methods and tools to eliminate defects in safety-critical software.

Glitch in safety-critical software

It has been 10 months since a Lion Air 737 MAX 8 crashed into the Java Sea off Indonesia, killing all 189 passengers and crew, due to what investigators described as a “glitch” in the plane’s flight-control software.

A deadly glitch. Following the crash, the Federal Aviation Administration (FAA) issued an emergency notice to operators of Boeing 737 MAX 8 and 9 planes, warning that faulty angle of attack sensor readings “could cause the flight crew to have difficulty controlling the airplane.” This loss of control, the FAA said, in a euphemistic understatement, could lead to “possible impact with terrain.” Or in this case, the ocean.

And then it happened again, a bit more than four months later, when an Ethiopian Airlines 737 MAX 8 went down under similar circumstances, killing all 157 people aboard.

That incident prompted the grounding of all 737 MAX 8 jetliners worldwide. They remain grounded today, in part because U.S. regulators found another software flaw in late June. According to the FAA, the planes may not fly again until December, or perhaps not until 2020.

All because of defects in safety-critical software.

Granted, as the efforts to fix the “glitch” have dragged on, inevitable media leaks have suggested that multiple factors are to blame. They include an alleged lack of pilot training, a failure to include instructions for the flight-control system known as MCAS (Maneuvering Characteristics Augmentation System) in the flight manual, and the outsourcing of software development to low-paid workers from India.

Still, Boeing essentially acknowledged that software was the primary problem when an unnamed official told CNN Business, “We believe this can be updated through a software fix.”

How to build better safety-critical software

The situation is a stark reminder that modern society is increasingly dependent on the quality and security of software. We look to software not only to provide convenience but also to protect our lives and ensure our safety.

Organizations can ensure the quality and security of safety-critical software in a variety of ways, but all of them come down to building integrity into the software during development. Many organizations rely on the alternative: bolting or patching it on at the end. But in the long run, “shifting left,” or moving testing and analysis earlier in the software development life cycle (SDLC), makes it cheaper and faster to build a superior product.

Shifting left requires the use of multiple testing and analysis tools throughout the SDLC. Among them:

Architecture risk analysis during design
Static analysis during development
Interactive application security testing (IAST) during testing/QA
Dynamic analysis pre-release

Another essential tool is software composition analysis, which helps developers uncover known bugs and vulnerabilities in open source components.
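
To make the idea concrete, here is a minimal sketch of what software composition analysis does at its core: compare the components a project depends on against a list of known issues. Everything in it is an assumption made for illustration. The package names, versions, and advisory IDs are hypothetical, and real tools draw on curated vulnerability databases and match version ranges rather than exact strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """A known issue affecting specific versions of one package (hypothetical data model)."""
    package: str
    vulnerable_versions: frozenset
    advisory_id: str


def parse_manifest(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    deps = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, version = line.partition("==")
        if not sep:
            continue  # unpinned entries are ignored in this sketch
        deps[name.strip().lower()] = version.strip()
    return deps


def find_known_issues(deps, advisories):
    """Return sorted (package, version, advisory_id) tuples for affected dependencies."""
    findings = []
    for adv in advisories:
        version = deps.get(adv.package)
        if version is not None and version in adv.vulnerable_versions:
            findings.append((adv.package, version, adv.advisory_id))
    return sorted(findings)


# Hypothetical manifest and advisory list, invented for this example.
MANIFEST = """\
# hypothetical dependency manifest
ExampleLib==1.2.0
otherlib==3.4.1
"""

ADVISORIES = [
    Advisory("examplelib", frozenset({"1.1.0", "1.2.0"}), "EX-2019-0001"),
    Advisory("unusedlib", frozenset({"0.9"}), "EX-2019-0002"),
]

if __name__ == "__main__":
    for package, version, advisory_id in find_known_issues(parse_manifest(MANIFEST), ADVISORIES):
        print(f"{package} {version} is affected by {advisory_id}")
```

Run early and often, a check like this flags a risky component while swapping it out is still cheap, which is the point of shifting left.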

A similar message came from Eric Elliott, an author and distributed systems expert, in a post on Medium. Securing a “life-critical system,” he said, has to include “many lines of defense in order to assure quality control.”

Among the lines of defense he listed: requirement specification and review, risk analysis, test-driven development, static analysis, and software inspection/code review.
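
As an illustration of one of those lines of defense, here is a minimal sketch of test-driven development. In that style, the tests are written first as an executable statement of the requirement, and the function is then written to satisfy them. The function name, the two-sensor cross-check, and the 5-degree threshold are all hypothetical, chosen only for illustration. This is not how any real flight-control system is implemented.

```python
import math
import unittest

# Hypothetical threshold, chosen only for illustration.
MAX_DISAGREEMENT_DEG = 5.0


def readings_are_trustworthy(left_deg, right_deg,
                             max_disagreement_deg=MAX_DISAGREEMENT_DEG):
    """Return True only if both readings are finite and agree within the threshold."""
    if not (math.isfinite(left_deg) and math.isfinite(right_deg)):
        return False
    return abs(left_deg - right_deg) <= max_disagreement_deg


class ReadingsAreTrustworthyTest(unittest.TestCase):
    """In test-driven development, these tests exist before the function does."""

    def test_agreeing_readings_are_accepted(self):
        self.assertTrue(readings_are_trustworthy(4.0, 4.5))

    def test_disagreeing_readings_are_rejected(self):
        self.assertFalse(readings_are_trustworthy(4.0, 25.0))

    def test_disagreement_exactly_at_threshold_is_accepted(self):
        self.assertTrue(readings_are_trustworthy(0.0, 5.0))

    def test_non_finite_readings_are_rejected(self):
        self.assertFalse(readings_are_trustworthy(float("nan"), 4.0))
        self.assertFalse(readings_are_trustworthy(4.0, float("inf")))
```

Saved to a file, the tests can be run with Python's built-in runner, for example `python -m unittest sensor_check.py` (the file name is an assumption). The value of writing the tests first is that cases such as a missing or wildly disagreeing reading have to be thought through before any code depends on the answer.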

Slow is fast in software development

None of this is easy. Sammy Migues, principal scientist at Synopsys, said shortly after the Lion Air crash that before automation, “we had a pretty short list of things that could go wrong—metallurgy, engine, construction, etc.”

“Now there are a million things that could go wrong, from software, software integration, software errors, software interfaces, unexpected conditions that software has to deal with, and so on. There are way more situations that can adversely impact passenger safety,” he said.

In any case, the result of building quality and security into safety-critical software is a better product. And a “better” product is less likely to be dangerously flawed or to threaten life and safety.

Elliott and numerous other experts warn that there is no such thing as software that is fast, cheap, and good. It’s a dangerous, potentially lethal, fantasy. The irony is that trying to go “fast and cheap” in the development of software yields the opposite results.

“The phrase ‘slow is fast’ is well known in engineering circles,” Elliott said. “It has origins in the military phrase ‘slow is smooth and smooth is fast,’ and it’s a form of uncommon sense.”

As in lifesaving common sense.