In Defense of HTML5

Many of the broad family of specifications commonly grouped under the “HTML5” umbrella are scheduled to be completed in 2013, and with the release of Internet Explorer 10, the users of every major web browser flavor can enjoy rich Web apps written on the open web platform, with no need for plugins. 

Lots of people are excited about HTML5, but one group I don’t see as particularly excited is security experts, or perhaps they’re only excited in a rather cynical fashion. Full employment! Browser botnets! A lifetime of conference talks! And the malediction against HTML5 isn’t just coming from folks with a product to sell or a slide deck to submit – HTML5 has become a common boogeyman representing out-of-control complexity and vast attack surface for some of the very best analysts and researchers in the field. So, although developers are racing to embrace it, CISOs, CIOs and enterprise security decision makers as a group seem wary.

Frankly this puzzles and distresses me, because from my perspective, HTML5 is a key part – perhaps the most important part – in one of the greatest security success stories in the history of computing.  The story of the web browser over the last decade is the story of something completely unprecedented – a tremendous increase in functionality and use that happened side-by-side with a tremendous decrease in  vulnerability and attack surface.   Don’t believe me?  Let’s go back a decade…

2002 was an important time for me personally, as it’s right about the time I decided to move from being a developer interested in security to a full-time security professional.  It was Thanksgiving, and my family all gathered at my Aunt’s house.  She asked if I could take a look at her computer – it was running the latest and greatest Windows XP and was only a few months old – but it took 20 minutes to boot up and was slow as a dog.  Well, over the course of that weekend, my brother, father and I, with fifty years of computing experience between us, spent over 30 hours painstakingly removing 5000 viruses and pieces of malware from her system.  My cousins had invited some of it by downloading pirated games, but much of it was simply from browsing the web – and I was soon getting similar support requests from friends and family with decidedly vanilla browsing habits.  As I spent hours cleaning people’s computers, it was clear to me that the industry was in real trouble on the security front, and this would be a good direction to take my career.  In the decade since, I’ve been on the front lines of the Web security battle, working on everything from web apps to browsers, to operating systems and web standards.

And let me tell you, if you think HTML5 is a security disaster, you’ve utterly forgotten where we were ten years ago. The first browser wars had mostly wound down, leaving Internet Explorer 6 with a commanding lead in market share, on both XP and the Mac. HTML4 was the lingua franca of the Web, but the Rich Web was already here. That’s one of the things that I think is most neglected in criticisms of HTML5 – most of the “new attack surface” isn’t actually new. Sockets, cross-origin communications, multimedia, background processing, local storage – all the key components of the Rich Web Application were already in wide use by 2002, but they were in Flash, Java and ActiveX. That’s the real benchmark against which we must compare HTML5.

Some will argue this isn’t a fair comparison (especially ActiveX), but it was the fact on the ground. One or more of those plugin technologies was installed on better than 97% of browsers, and they were in wide use. You didn’t really have a choice: by 2003, I would guess 10-15% of the Web was unusable without Flash, because so many sites used it for their most basic navigation features.

While Java and Flash were designed with security in mind, they were also designed to compete for developers.  As long as security met an only casually scrutinized minimum, what developers were really interested in was features.  How much would it let you do?  Around this time I coined a maxim after John Gilmore's more famous one: “Developers on the web interpret security restrictions as damage and route around them.” And once these technologies had signed on developers, the platforms had very little security pressure on them.  If you as a developer became unhappy with the security flaws of your platform, the cost to switch was incredibly high – you had to rewrite your entire application.  As a consumer, it was even worse – you couldn’t switch, you had to live with the platform choices of application authors, or do without.

As for ActiveX?  Yes, it was a by-design arbitrary code execution technology, unlike the sandboxed Java and Flash runtimes.  But it’s still important to have it there, stacked up against HTML5, because it was what people used to deliver Rich Web Applications.  There were lots of legitimate ActiveX controls, and lots of legitimate sites trained users to accept ActiveX prompts (when the browser prompted at all) a little too readily.  But even legitimate ActiveX controls were far more dangerous than Flash or Java.  Most were just wrappers around a big blob of unsafe legacy code, never designed to be deployed in a hostile environment.  Outside of those I looked at written by Microsoft itself, in my years of pentesting I never encountered an ActiveX control that didn’t fall over in the first five minutes of fuzzing, and I almost never encountered one that was site-locked.  This meant that even if you were careful about what controls you approved, the ones you’d accepted as safe to use in a non-malicious web page or that were from trusted authors could still be silently instantiated on any malicious page and trivially exploited with the most basic stack-based buffer overflows.

And the browsers themselves were little better. IE 6 was notorious Swiss cheese, and the reality behind it was far worse than almost anyone realizes. Remember, the browser wars had led to the same market pressures as those affecting plugin technologies. Build more features, lock developers in, expose as many APIs as possible, integrate as deeply as you can with the OS and do it fast, fast, fast. Brendan Eich has told the story many times of how, at Netscape, he only had 10 days to design and implement JavaScript and, despite its brilliance, he is the first to admit we’re still living uneasily with its mistakes today. That story is just one famous example of the state of the whole industry in the ’90s. There was little or no time for security review, no concern for attack surface, and once anything went into the customers’ hands, it was very hard to claw back.

It was only after drive-by installs basically destroying customers’ computers – like my Aunt’s – threatened Microsoft’s business to the core that they really had the courage to start to roll things back and break things as they built IE7 and Vista. I was there for some of that, so I can say, “wow”. It was heroic work, and the depths of the attack surface of IE6 were astounding. It’s one of the reasons why IE 7 on Vista was the first major browser to have a serious sandbox – it was the only way to cope with the complexity. Does anyone remember the “Explorer view” that let you see a web directory listing as if it were part of your local filesystem in Windows Explorer? That integration to the core OS shell meant there was hardly a single line of code in the OS you couldn’t reach with tainted data from a random web page. Or have you heard of “binary behaviors” – ActiveX controls that could be silently attached to CSS properties? Microsoft didn’t just remove that one, they scrubbed any mention of it from MSDN, to boot. And take a look through the stuff that only gets enabled in the “trusted zone” today to see some of the more dangerous things that were available everywhere in IE5 and 6. Netscape, for its part, was better only by virtue of being OS-agnostic – it was still turning out its own dangerous features, many of which are just now being removed as it implements HTML5.

But if IE7 was the start of a turnaround, it was only a start.  Plugins remained ubiquitous, and the rest of the Web caught mashup mania, which pulled us in another dangerous direction.   To give one example, before it was acquired, WebEx claimed its flagship Connect product was going to revolutionize the industry by providing a mashup environment that was basically a wrapper around IE that removed the Same-Origin Policy.  This sounds absurd now, but they were far from the only company putting serious effort into this kind of thing – remember, “developers on the Web interpret security restrictions as damage and route around them.”

The real force that changed things with Web and Browser security came from something widely recognized as revolutionary, but not in this particular domain:  the iPhone, when Apple declared that they would not allow Flash, Java or other plugins on the platform.  Though they claim it was for security and reliability reasons, let’s not kid ourselves – these were ways to get content and applications onto a closed platform without paying Apple – along with jailbreaking, which was the reason Apple finally started paying serious attention to platform security.  Despite these selfish motives, in the end, this turned out to be the start of the best thing that’s ever happened to Web security.  Because even with apps to fill some of the gap, everyone still wanted the Rich Web on their iPhone.

And the coolness of the iPhone meant that every developer wanted to target the platform – so they had to get serious about looking for ways to do it that didn’t involve plugins.  From this grew much of HTML5’s momentum to create a standards-based platform for the Rich Web.  And with Google’s Chrome entering the market shortly after, we had the start of a “new browser war”, but a war that looked very different than the first one – because it was driven by standards.

And what does it mean to have an open, standards-driven platform as the “ground rules” for a new browser war?  It means that browsers are competing on how fast and how well they implement those standards, how fast the browsers themselves are, and how secure they are.  In particular, they have to answer to users’ security concerns, much more than do plugin vendors, because when all web apps work in all browsers, the cost to the consumer of switching if they are impacted, or even if they hear that a browser is insecure, is very low.

And standards authors are not beholden to individual customers of their features – they are willing to break things – like the example of WebSockets, where public research revealed serious vulnerabilities that necessitated a fix that broke every existing application and implementation.  We simply never saw that kind of thing happen in the first browser wars.  And it is also noteworthy that, while the standards process is often derided for its slowness, a deliberate pace combined with public review means that HTML5 specs have gotten better security scrutiny than any browser features have ever before.  The HTML5 family was not only better designed, learning from the lessons of Rich Web 1.0, but even the new ground it broke was subject to incredible advance security review by the best experts in the industry.  That’s something that just can’t happen when you give one guy in one company 10 days to design and implement something.
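The post does not name the WebSockets fix. Assuming it refers to the client-to-server frame masking that ended up in RFC 6455, the core of that change can be sketched in a few lines. The function name maskPayload is hypothetical, and this is an illustration of the XOR masking step only, not a WebSocket implementation.

```typescript
// Sketch of RFC 6455-style payload masking (assumed to be the fix the
// paragraph above alludes to). Each payload byte is XORed with one byte
// of a 4-byte key, cycling through the key.
function maskPayload(payload: Uint8Array, key: Uint8Array): Uint8Array {
  if (key.length !== 4) {
    throw new Error("masking key must be exactly 4 bytes");
  }
  const out = new Uint8Array(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ key[i % 4];
  }
  return out;
}

// Masking is its own inverse: applying the same key twice restores the input.
const key = new Uint8Array([0x37, 0xfa, 0x21, 0x3d]);
const hello = new Uint8Array([0x48, 0x65, 0x6c, 0x6c, 0x6f]); // "Hello"
const masked = maskPayload(hello, key);
const unmasked = maskPayload(masked, key);
```

Because a fresh, unpredictable key is meant to be chosen per frame, a page's script cannot control the exact bytes that appear on the wire, which is what makes it hard to smuggle something that looks like an HTTP request past a confused intermediary.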

So, what’s our scorecard finally look like? Ten years ago, IE 6 had something like 80% market share, was full of trivially exploitable memory corruption flaws, un-sandboxed, and deeply wormed into the OS, exposing tens of millions of lines of unhardened code to the Web attack surface. Nearly all the features people worry about with HTML5 were already implemented, in multiple plugin systems that had >95% penetration, each with a different security model, all also un-sandboxed, generally with worse code quality than the browser itself. Beyond that, users frequently used ad-hoc blobs of unsafe code that ran with full privileges and were almost always highly vulnerable to trivial exploit. And all the competitive pressures at the time were making things worse, not better. The result for users was clear – constant vulnerability and systems infected on a daily basis by adware, malware and botnets.

Today, largely thanks to HTML5, for the first time since Netscape 2 we have a large number of users browsing the web in environments that don’t support binary plugins at all.  Rich Web application authors write and deliver their code in a memory-safe language, JavaScript, that lives inside a Same-Origin Policy sandbox, that further lives in a browser sandbox.  The old plugin systems are still there on the desktop, and still the source of many of the worst vulnerabilities – but they are fading fast.  Little new content is being developed for them and much existing content is being converted to reach the new mobile audiences. 
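For readers who want the Same-Origin Policy boundary made concrete: an origin is commonly described as the triple of scheme, host and port, and two pages can touch each other's data only when all three match. Here is a minimal sketch of that comparison. The names parseOrigin and sameOrigin are hypothetical, and the parser handles only plain http and https URLs without credentials; real browsers cover many more cases.

```typescript
// A simplified model of an origin: (scheme, host, port).
interface Origin {
  scheme: string;
  host: string;
  port: number;
}

// Default ports for the two schemes this sketch supports.
const DEFAULT_PORTS: Record<string, number> = { http: 80, https: 443 };

// Extract the origin from an http(s) URL. Assumption: no userinfo and no
// IPv6 literals; anything else is rejected rather than guessed at.
function parseOrigin(url: string): Origin {
  const match = /^(https?):\/\/([^\/:?#@\[\]]+)(?::(\d+))?(?:[\/?#]|$)/i.exec(url);
  if (match === null) {
    throw new Error("unsupported URL: " + url);
  }
  const scheme = match[1].toLowerCase();
  const host = match[2].toLowerCase();
  const port = match[3] !== undefined ? Number(match[3]) : DEFAULT_PORTS[scheme];
  return { scheme, host, port };
}

// Two URLs are same-origin only if scheme, host and port all agree.
function sameOrigin(a: string, b: string): boolean {
  const oa = parseOrigin(a);
  const ob = parseOrigin(b);
  return oa.scheme === ob.scheme && oa.host === ob.host && oa.port === ob.port;
}
```

Under this model, https://example.com/app and https://example.com:443/api share an origin, while http://example.com, https://example.com:8443 and https://mail.example.com are each a different origin from https://example.com.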

App developers now have one programming and one security model to learn to write secure apps. It’s more complex than HTML4, but less complex than HTML4 + Java + Flash + ActiveX. And it’s designed to be secure – compare WebSockets to their Java equivalent, or CORS to its Flash and Silverlight counterparts and the difference is clear. (Scott Stender and I gave a talk that goes into much more detail on writing secure apps with HTML5 at the first W3Conf.)

All major browsers now have rapid self-update systems that keep them patched, and users can switch browsers for security improvements at little or no cost. Instead of every penny-ante spammer putting drive-by malware on your system, Google finds it has to pay $60,000 and offer a lot of public prestige to get bugs turned in through their Pwnium events. On the black market, such bugs fetch six figures.

But wait, you say, I hear about these bugs in the news every couple of weeks!  Exactly!  This kind of bug is man-bites-dog now.  Nobody would’ve bothered to write a breathless news story about a memory corruption crasher in Netscape 6 – after all, who would listen or care when browsers crashed hourly from non-malicious input, anyway?

So, security people, stop the fear and hate for HTML5, because what we witnessed in the browser world in the last decade was, honestly, a miracle.  We saw a class of applications grow exponentially in their use and complexity, while at the same time not just fixing bugs and becoming practically more secure – but drastically and systematically reducing their attack surface.  I don’t think that’s ever happened before – functionality and attack surface moving in opposite directions, so dramatically, with so much momentum, and for so long.  And it didn’t take any regulatory or agency incentives – it began with self-motivated business decisions by companies like Microsoft, Apple and Google, and the open standards process turned it into a virtuous cycle for users and the ecosystem as a whole.

So, if you’ve been worried about the security implications of HTML5, don’t worry – it’s here to help, and it already has. We still have a long way to go in securing the web, and we must remain vigilant and revise and improve standards as we learn, but we are in a much better position than we were ten years ago.


-Brad Hill

*** This is a Security Bloggers Network syndicated blog from The Security Practice authored by Brad Hill.