


There's a major change sweeping the web. The familiar HTTP prefix is rapidly being replaced by HTTPS. The extra "S" in an HTTPS URL means your connection is secure and it's much harder for anyone else to see what you're doing. And on today's web everyone wants to see what you're doing. 

HTTPS has been around nearly as long as the web, but it's primarily used by sites that handle money or sensitive personal information -- your bank's website, shopping carts, social networks and webmail services like Gmail.

Now Google, Mozilla, the EFF and others want every website to adopt HTTPS. The push for HTTPS everywhere is about to get a big boost from Mozilla and Google when both companies' web browsers begin to actively call out sites that still use HTTP.

The plan is for browsers to start labeling HTTP connections as insecure. In other words, instead of the green lock icon that indicates a connection is secure today, there will be a red icon to indicate when a connection is insecure. Eventually secure connections would not be labeled at all; they would be the assumed default.

Google has also been pushing HTTPS connections by "[using HTTPS as a ranking signal](https://webmasters.googleblog.com/2014/08/https-as-ranking-signal.html)". Google takes the security of a connection (or lack thereof) into consideration when ranking sites in search results. For the time being Google says that HTTPS is "a very lightweight signal... carrying less weight than other signals such as high-quality content". However, the company says that it "may decide to strengthen it" as a means to encourage more sites to adopt HTTPS.

Through efforts like these HTTPS has already moved beyond the obvious realms like banking and webmail. Popular sites like Facebook, Twitter and YouTube, as well as major online retailers like Target, Home Depot and Adobe, are all served over HTTPS.

Target, Home Depot and Adobe were not examples chosen at random though; all three have had major data breaches that exposed identifying information about users. 

HTTPS does not mean your *data* is secure, it just means your *connection* is secure. This is not semantics. It's critical for users to understand and unfortunately HTTPS advocates sometimes present HTTPS as synonymous with "security". The phrase "secure web" gets used a lot in discussions, but as those three retailers illustrate, using HTTPS does not mean a website is secure. In fact, HTTPS says nothing about the website, the server it resides on or what happens to whatever data you might give it. 

This may ultimately be the biggest challenge HTTPS faces -- helping people to understand what it means.

## What's So Great About Encryption, TLS and Authenticity?

If HTTPS is no guarantee of security, what does it do for you? HTTPS offers three things: secrecy, integrity and authenticity. 

The simplest of these is secrecy. HTTPS uses encryption to make sure that no one can see the data that's transmitted over the wire. When your browser connects to a website over HTTPS the connection from your browser to the page you want to view is encrypted. That means any data exchanged is not visible to anyone else snooping the network. 

The EFF's Jacob Hoffman-Andrews, lead developer on Let's Encrypt, a new tool that offers free HTTPS certificates, tells Ars that encryption is a "necessary minimum bar" for today's web. "If we were designing the internet from scratch today," he says, "we would say encryption is cheap and easy, there's no export [restrictions anymore](https://en.wikipedia.org/wiki/Clipper_chip), so it will be default and you won't have to worry about it."

Without encryption it's easy for anyone to see everything you ask for and everything the site sends back. That allows anyone who cares to do so to perform what's known as a Man in the Middle attack.

With an unencrypted connection both your browser's request and the server's response are just plain text bits of data. All a Man in the Middle attack does is step into that stream of data and start reading and manipulating it. If your ISP wanted to add an advertisement to this page that requires you to click on it before reading the story, it could do that by just injecting a few packets of its own. You would have no way of knowing whether that ad came from Ars or some other source. Anyone could in fact do just about anything to the data traveling between the Ars server and your browser, including serving up an entirely different page or not showing the page at all. 

This is not a theoretical problem. Man in the Middle code injection is an active, widely used attack. In the case of Verizon Wireless's so-called "Perma-Cookie", it's even a business model.

Using a Man in the Middle attack, Verizon Wireless modifies traffic on its network to inject a tracker (it added an HTTP header called X-UIDH) that is then sent to all unencrypted sites that Verizon customers visit. This allows Verizon to, in the [words of the EFF](https://www.eff.org/deeplinks/2014/11/verizon-x-uidh), "assemble a deep, permanent profile of visitors' web browsing habits without their consent". 
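To see how little effort this kind of tampering takes on an unencrypted connection, here is a minimal sketch in Python. It is not Verizon's actual code; the function name, the example request and the header value are all hypothetical, and only the `X-UIDH` header name comes from the reporting above. It simply shows that a plain-text HTTP request is a string of bytes that anything sitting on the network path can edit before passing it along.

```python
def inject_header(raw_request: bytes, name: str, value: str) -> bytes:
    """Add one extra header to a plain-text HTTP/1.1 request.

    Hypothetical helper: it stands in for whatever a middlebox on the
    network path might do to traffic it can read.
    """
    # The header section ends at the first blank line (CRLF CRLF).
    head, separator, body = raw_request.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("not a complete HTTP request head")
    extra = f"{name}: {value}".encode("ascii")
    # Re-assemble the request with the new header appended to the others.
    return head + b"\r\n" + extra + separator + body


# What the browser actually sent (example.com is a placeholder).
request = (
    b"GET /story HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"\r\n"
)

# What the web server receives after the network has had its say.
tampered = inject_header(request, "X-UIDH", "hypothetical-subscriber-id")
print(tampered.decode("ascii"))
```

Neither the browser nor the server can tell that the extra line was not part of the original request. Over HTTPS the same bytes travel inside an encrypted, integrity-checked channel, so an intermediary cannot read the headers, let alone append to them.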

Verizon is not alone. It's a safe bet that your ISP is doing something similar. Comcast's wifi service [already does](http://arstechnica.com/tech-policy/2014/09/why-comcasts-javascript-ad-injections-threaten-security-net-neutrality/), as does [AT&T's](http://arstechnica.com/information-technology/2015/03/atts-plan-to-watch-your-web-browsing-and-what-you-can-do-about-it/3/) (you can opt out, for a fee). What your ISP does with this data is less well known, but it's a big part of why Google wants the web to move to HTTPS.

When you communicate in plain text over the network you have to assume that someone is, at the very least, watching and very probably injecting some tracking code to record your requests.

With the encrypted connection you get when a site uses HTTPS, the transmitted data is very difficult to read. There is no practical way to read or manipulate ciphertext without the encryption keys. Score one for HTTPS, which can guarantee that you are getting the content your browser requested.
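The integrity half of that guarantee can be illustrated with a keyed message authentication code. The sketch below uses Python's standard `hmac` module and is only an analogy: TLS has its own record-layer authentication, and the key here is generated locally rather than negotiated in a handshake. The page content and the injected script URL are made up for the example.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an authentication tag over the message with a shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Check a message against its tag using a constant-time comparison."""
    return hmac.compare_digest(make_tag(key, message), received_tag)


# Assumption: both ends already share this key. In TLS the equivalent
# secret comes out of the handshake and is never sent in the clear.
key = secrets.token_bytes(32)

page = b"<p>The article you asked for.</p>"
page_tag = make_tag(key, page)

# Someone in the middle appends a script but cannot forge a matching tag.
tampered = page + b"<script src='//ads.example/inject.js'></script>"

print(verify(key, page, page_tag))      # True
print(verify(key, tampered, page_tag))  # False
```

Without the key, an attacker who alters even one byte cannot produce a tag that verifies, so the receiving end knows to discard the data rather than display it.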

HTTPS also prevents the kind of censorship that happens at the state or ISP level. Examples of this abound as well. Russia, for instance, wanted to ban a Wikipedia article (about [charas hashish](https://en.wikipedia.org/wiki/Charas)), but because Wikipedia is served over HTTPS there's no way to see which page visitors are requesting. Russia was faced with a choice: ban all of Wikipedia or none. It [opted for none](https://www.eff.org/deeplinks/2015/08/russias-wikipedia-ban-buckles-under-https-encryption).
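The all-or-nothing choice follows from what a network observer can and cannot see. The following Python sketch is a deliberately simplified model, and `visible_to_observer` is a made-up name: it assumes the hostname is still exposed over HTTPS (through DNS lookups and the TLS handshake) while the path and query string travel inside the encrypted channel. It ignores side channels such as response sizes and timing.

```python
from urllib.parse import urlsplit


def visible_to_observer(url: str) -> dict:
    """Return the parts of a URL an on-path observer can read (simplified)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # Only the site is exposed; the specific page is encrypted.
        return {"host": parts.hostname}
    # Plain HTTP exposes the whole request line and headers.
    return {"host": parts.hostname, "path": parts.path, "query": parts.query}


print(visible_to_observer("http://en.wikipedia.org/wiki/Charas"))
print(visible_to_observer("https://en.wikipedia.org/wiki/Charas"))
```

A censor working from the second result can block `en.wikipedia.org` as a whole, but has nothing that distinguishes one article from another.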

Score another one for HTTPS, because as it turns out unencrypted networks do not, as early web enthusiasts liked to say, "see censorship as damage and route around it". In fact, unencrypted networks make censorship very easy: just reach in and block what you want, change what you want.

Put all this together and you discover that the web, the network on which your data travels, is not just insecure, but actively hostile. As developer and HTTPS proponent Eric Mill writes, "I see companies and government asserting themselves over their network. I see a network that is not just overseen, but actively hostile. I see an internet being steadily drained of its promise to 'interpret censorship as damage'...In short, I see power moving away from the leafs and devolving back into the center, where power has been used to living for thousands of years."

It's getting worse too. A considerably more alarming network attack has come to light in the last year, one that exploits the lack of HTTPS on the web to create DDoS attacks using unsuspecting users who never know they're part of an attack. [Great Cannon](http://arstechnica.com/security/2015/04/meet-great-cannon-the-man-in-the-middle-weapon-china-used-on-github/), as it has been dubbed, is very sophisticated. For full details see Citizen Lab's [write-up](https://citizenlab.org/2015/04/chinas-great-cannon/), but the short story is that someone hijacked a bit of JavaScript served up by Chinese search giant Baidu and added a payload to it that made frequent requests to a target website. Everyone visiting Baidu who loaded that script became part of the attack.

This is what Mill means when he says the network is actively hostile. With Great Cannon it becomes so hostile it turns you, unknowingly, into a DDoS attacker. The only way to stop attacks like Great Cannon, or network tampering like what Verizon and others are doing, is to encrypt your traffic. 

The last thing HTTPS provides is authentication. The site you're visiting is verified by the browser as actually being that site and not some imposter. To authenticate your connection web browsers maintain a list of known, trusted certificate authorities. When your browser requests a page it gets the page's security certificate, which contains a chain that leads back to a certificate authority. If that authority matches an authority known to your browser then your browser will trust that the site you're connecting to is who it claims to be. 
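The same check can be seen outside a browser. Here is a minimal sketch using Python's standard `ssl` module, which by default behaves much like the browser described above: it loads the system's list of trusted certificate authorities, requires the server to present a certificate that chains back to one of them, and confirms the certificate matches the hostname. The function names are illustrative, not part of any library.

```python
import socket
import ssl


def verifying_context() -> ssl.SSLContext:
    """Build a client context that trusts the system's certificate authorities."""
    # create_default_context() turns on certificate and hostname checks.
    return ssl.create_default_context()


def fetch_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to host over TLS and return its validated certificate.

    Raises ssl.SSLCertVerificationError if the chain does not lead back to
    a trusted authority or the certificate does not match the hostname.
    """
    context = verifying_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The handshake, including chain validation, happens in wrap_socket.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


context = verifying_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)                    # True
```

Calling `fetch_certificate("example.com")` needs a network connection, so it is left out of the runnable part. On success it returns a dictionary describing the certificate, including who issued it; on failure the connection is refused before any page data is exchanged.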

Now that you know what HTTPS offers -- encryption, integrity and authentication -- it should be easy to see why your bank uses it, and why Gmail, Facebook, Twitter and any other sites you log in to use it, or should. What's less immediately obvious is why *every* site on the web can benefit from HTTPS. Does HTTPS help some long archived, no longer maintained bit of ephemera from the early web?

Software developer and blogger Dave Winer argues in a post entitled [HTTPS is expensive security theater](http://scripting.com/liveblog/users/davewiner/2015/12/18/0667.html), that not only does it not help old, archived sites, it's a waste of the site owner's time. "I have a couple dozen sites that are just archives of projects that were completed a long time ago," writes Winer. "I'm one person. I don't need make-work projects, I like to create new stuff, I don't need to make Google or Mozilla or the EFF or Nieman Lab happy."

Winer is not alone. In fact he's in very good company: no less than Tim Berners-Lee has [questioned the move to HTTPS](https://www.w3.org/DesignIssues/Security-NotTheS.html), going so far as to call it "arguably a greater threat to the integrity for the web than anything else in its history". Berners-Lee does think the web should be encrypted; he just doesn't like the way it's currently being done. Berners-Lee would like to see HTTP upgraded rather than shifting to the HTTPS protocol.

Winer and Berners-Lee highlight the two big potential problems of moving the web to HTTPS. It adds significant complications to the process of setting up a website and creating something on the web, and it might break links -- billions of links.

It's easy for savvy developers to dismiss the first problem, that HTTPS adds considerable complexity. But what makes the web great is that you don't have to be a savvy developer to be a part of it. Anyone with a few dollars a month to spare can rent their own server space somewhere, throw some HTML files in a folder and publish their thoughts on the web. A few dollars more gets you a nice URL, but that's not strictly necessary.

Requiring sites to include a security certificate adds a significant barrier to entry to the web. 

Anyone who has put in the effort to get HTTPS working on even one site knows that it can be a tremendous hassle. Indeed this is probably the biggest obstacle to widespread HTTPS adoption among small site operators (that is, the bulk of the web).

Until very recently there was no way to obtain a truly free SSL certificate (a few certificate authorities did not charge to issue you a certificate, but if you needed to revoke it there was a fee). This was the first challenge that HTTPS proponents set out to solve. The EFF and Mozilla partnered to create Let's Encrypt, which now offers free certificates -- really free, no catches, and you don't have to provide any identifying information to get one. There's also a set of command line tools that make installing and configuring them pretty simple provided you have some basic sysadmin knowledge (and SSH access to your server).

That's not the end of the headache though. Once you have a certificate you have to install it and get your web server to serve it up properly. Again, assuming you have a basic sysadmin's knowledge this isn't too hard, though tweaking it until you get an A+ grade on [SSL Labs' security test](https://www.ssllabs.com/ssltest/) can take many hours of debugging (and even top sites like [Facebook only score a B](https://www.ssllabs.com/ssltest/analyze.html?d=facebook.com)). I've been running my own website, building my own CMSes and running servers on the web for fifteen years and I can say without hesitation that getting HTTPS working on my site was the hardest thing I've done on the web. It was hard enough that, like Winer, I haven't bothered with old archived sites.

Over the long run Let's Encrypt is hoping to partner with popular web hosts so that users looking to set up their own blog with a popular CMS like WordPress can get an HTTPS site up and running as easily as clicking a button. Things will, however, likely never be that simple for anyone who wants to take a more DIY approach and write their own software.

Simplifying the process of setting up HTTPS means more tools in your toolchain. It makes the individual more dependent on tools built by others. Developer Ben Klemens has an [essay](https://medium.com/@b_k/https-the-end-of-an-era-c106acded474#.orxikg4xp) about exactly this dependency, writing that if "solving the problem consists of just starting a tool up, my sense of wonder has gone from 'Look what I did' to 'Look what these other people did', which is time-efficient but not especially fun."

It may seem trivial to developers employed by large companies solving complicated problems that taking the fun out of the web is a problem, but it is. If the web stops being fun for individuals it becomes solely the province of those companies. We are no longer creators of the web, but simple users. 

## Think of the Links

Berners-Lee's concerns about HTTPS are easier to fix -- what happens to all those links to HTTP sites when all those sites become HTTPS? The answer is they break. There are quite a few proposals that would mitigate some of this at the browser level. When I asked Mozilla's Richard Barnes about Berners-Lee's concerns he told me, "Tim has been a really useful contrarian voice. His views have driven the browser and web community to address concerns he has raised".

As proof that Barnes actually does care about URLs, he's the co-editor of a W3C specification that aims to preserve all those old links and upgrade them to HTTPS. The spec is known as [HSTS priming](https://mikewest.github.io/hsts-priming/) and it works with another proposed standard known as [Upgrade Insecure Requests](https://www.w3.org/TR/upgrade-insecure-requests/) to offer the web a kind of upgrade path around the link rot that Berners-Lee fears.

With Upgrade Insecure Requests site authors could tell a browser that they intend all resources to be loaded over HTTPS, even if the link is HTTP. This solves the legacy content problem, particularly in cases where the content can't be updated, for example, The New York Times' [archived sites](http://open.blogs.nytimes.com/2014/11/13/embracing-https/).
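In practice a site opts in by sending the `Content-Security-Policy: upgrade-insecure-requests` response header defined in that draft. The Python sketch below approximates what a browser would then do to each URL on the page. It is a rough model rather than the spec's full algorithm: the `upgrade` function is a made-up name, the archive URL is a placeholder, and explicit port numbers are left untouched.

```python
from urllib.parse import urlsplit

# The opt-in a site sends with its pages, as a (name, value) pair.
CSP_HEADER = ("Content-Security-Policy", "upgrade-insecure-requests")


def upgrade(url: str) -> str:
    """Rewrite an http:// URL to https://, leaving everything else alone."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url
    # Swap only the scheme; host, path, query and fragment are preserved.
    return parts._replace(scheme="https").geturl()


# An old page whose markup can't be edited still gets secure requests.
legacy_links = [
    "http://archive.example/1998/images/logo.gif",
    "https://archive.example/styles/site.css",
]
for link in legacy_links:
    print(upgrade(link))
```

The old markup stays exactly as it was written; only the request that leaves the browser changes. If the HTTPS version of a resource does not exist, that request simply fails, which is why the approach works only for content the site owner has actually made available over HTTPS.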

Both of these proposals are still very early drafts, but they would, if implemented, provide a way around one of the biggest problems with HTTPS -- breaking links.

At least some of the time. Totally abandoned content will never be upgraded to HTTPS, and neither will content whose authors, like Winer, elect not to upgrade it. This isn't a huge problem though because browsers will still happily load the insecure content. For now anyway.

## More Honest Web Browsers

The web needs encryption because the web's users need it. The web needs encryption because the network needs it to remain neutral. The web needs encryption because without it just browsing can turn you into an unwitting helper in a DDoS attack.

There are a lot of companies pushing HTTPS. Most put their own interests first, but for now at least those interests align with web users' interests. None of the others has the kind of power and influence that Google and, to a lesser degree, Mozilla have as browser makers.

And it's up to browser makers to fix the confusion that currently surrounds HTTPS.

The current way browsers highlight HTTPS connections is misleading and needs to change.

The green lock icon that browsers use to denote a secure connection is too easily construed as a signal that the site is "secure". Labeling HTTPS sites "secure" and non-HTTPS sites "insecure" is deeply dishonest (just because a site uses HTTPS doesn't mean it's not storing your password and credit card number in plain text somewhere, and doesn't mean that it hasn't been hacked to serve malicious JavaScript and so on). As it stands browsers do not make clear that the lock icon is a statement about the connection *to* the site, and not the site itself.

As Hoffman-Andrews puts it, "calling HTTPS sites secure is generally not accurate, but it's definitely accurate to call HTTP sites insecure." In fact, browsers have no way of knowing if the site is truly "secure" in the broader sense. Neither do you and I. No one is ever going to fix that. But browsers can fix what they show users.

The Chromium project has already announced plans to change the way it displays the lock and start [marking HTTP connections as insecure](https://www.chromium.org/Home/chromium-security/marking-http-as-non-secure). Mozilla will do [roughly the same](https://blog.mozilla.org/security/2015/04/30/deprecating-non-secure-http/) with Firefox. 

It's tempting to see this as hostile to publishers -- the message has become fall in line with HTTPS or, as Winer writes, the browsers will "make sure everyone knows you're not to be trusted." 

However, what the broken lock is really saying is that your browser can't guarantee that the content you're reading hasn't been tampered with. It also can't guarantee that you aren't currently part of a DDoS attack against a site you've never even heard of. It also can't guarantee that you're connected to the site you think you're connected to. All it can guarantee is that there is nothing secure about your connection and anyone could be doing anything to it.

All of these things have always been true when you connect to an HTTP site; the only thing that's changing is that your browser is telling you about it.

The far more important change comes after that, when there will be no icon at all for HTTPS connections. All you'll ever see to indicate "security" is a large red X in the URL bar when you visit a site over HTTP.

Winer's fear is that Google especially, because it has a financial interest in HTTPS (HTTPS prevents Google's competitors from scraping search results), will stop loading and ranking HTTP sites altogether. It would be an egregious abuse of their place in the web ecosystem for any browser to stop loading HTTP content entirely, but so far that's not happening. If it does, if Google's self-interests are no longer aligned with the web's, then the web should resist it. Warnings help users make informed decisions; prohibitions help no one.

The web has always been a messy, complicated thing. The last thing it needs now is an artificial binary construct of "good" and "bad" as determined by browser vendors. At the same time, the current lack of encrypted connections has created a web that's no longer in the user's control. The web has become a broad surveillance tool for everyone from the NSA to Google to Verizon. Without encryption the network becomes a tool for whoever owns the largest nodes. We the people, the small creators of this thing we call the web, are not just at the mercy of the network owners, we're the victims of their whims.

Giving users greater secrecy, ensuring data integrity in transit, and providing a means of establishing authenticity empower the user and help make the network decidedly less hostile than it is right now. Abuse will still happen. Surveillance will still be possible but, as Mill notes, attacks will "change from bulk to targeted" and the network can return to being just a dumb pipe.