Google, Mozilla, the EFF and others have been pushing websites to adopt HTTPS for some time. That push is about to get a boost when Google's and Mozilla's web browsers begin to actively call out insecure websites.

HTTPS has been around nearly as long as the web, but it's primarily used by sites that handle money -- your bank's website, shopping carts, social networks and webmail services like Gmail. The extra "S" in an HTTPS URL means your connection is secure and it's much harder for anyone else to see what you're doing. 

On today's web everyone wants to see what you're doing. And as long as you're using HTTP, they can.

Changing the web over to HTTPS will not get rid of tracking cookies, nor will it stop nation states with the resources to launch hardware-based attacks. 

HTTPS will, however, stop some of the mass surveillance that currently happens on the web. It will stop your ISP from injecting code to track you, it will stop unknown parties from using your browser to launch DDoS attacks as you browse and it will stop ISPs and nation states from censoring specific pages they don't like.

Moving the bulk of the web from HTTP, which is an unencrypted connection that anyone can intercept, record and even manipulate, to HTTPS, which is encrypted and (reasonably) secure, is a big win for the web, which is to say it's a win for the users of the web.

This is important to bear in mind because it's also a win for some big companies that like to tout that it's a win for the web without mentioning that it also protects their bottom line. More on that in a minute.

Changing the web to HTTPS is not, however, entirely without costs and challenges for both web users and website owners. 

The question is, do the benefits justify the costs? To answer that we have to first look at what HTTPS gets us and what it costs.

## What HTTPS Does For You

Any secure protocol offers three things to the user: secrecy, integrity and authenticity. In the case of HTTPS the first two come from encryption. The major benefit of HTTPS is encryption. It provides authenticity as well, but currently authenticity is its weak point.

As the EFF's Jacob Hoffman-Andrews, lead developer on Let's Encrypt, tells Ars, encryption is a "necessary minimum bar" for today's web. "If we were designing the internet from scratch today," he says, "we would say encryption is cheap and easy, there's no export [restrictions anymore](https://en.wikipedia.org/wiki/Clipper_chip), so it will be default and you won't have to worry about it."

When your browser connects to a website over HTTPS the connection from your browser to the page you want to view is encrypted. That means any data exchanged is not visible to anyone else snooping the network. Without the encryption it's easy to perform what's known as a man-in-the-middle attack.

A simplified way to think about this is to think about the connection you made to get this page. When your browser requests http://arstechnica.com it sends that request out to the Ars server which then sends the requested page back as a stream of packets that your browser assembles into the page you requested. 

Both the request and the response are just plain-text bits of data. All a man-in-the-middle attack does is step into that stream of data and start reading and manipulating it. If your ISP wanted to add an advertisement to this page that requires you to click on it before reading the story, it could do that by just injecting a few packets of its own. You would have no way of knowing whether that ad came from Ars or some other source. Anyone could in fact do just about anything to the data traveling between the Ars server and your browser, including serving up an entirely different page or not showing the page at all.

This is not a theoretical problem. Man-in-the-middle code injection is an active, widely used attack. In some cases it's even a business model.

The list of examples here is too long to cover in such a short space, but there are a few that deserve mention. The first is Verizon Wireless's so-called Perma-Cookie. Verizon Wireless modifies traffic on its network to inject a tracker (an HTTP header called X-UIDH) that is then sent to all unencrypted sites that Verizon customers visit. This allows Verizon to, in the [words of the EFF](https://www.eff.org/deeplinks/2014/11/verizon-x-uidh), "assemble a deep, permanent profile of visitors' web browsing habits without their consent".
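To make the mechanics concrete, here is a minimal sketch of how little work it takes for any machine sitting between you and a site to add a header to an unencrypted request. It is not Verizon's actual code; the `inject_header` helper, the `example.com` host and the header value are all made up for illustration. Only the header name, X-UIDH, comes from the reporting above.

```python
def inject_header(raw_request: bytes, name: str, value: str) -> bytes:
    """Return a copy of a plain-text HTTP request with one extra header.

    Hypothetical helper: it shows that an intermediary only has to
    splice bytes in before the blank line that ends the header block.
    """
    head, sep, body = raw_request.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("not a complete HTTP request head")
    extra = f"{name}: {value}".encode("ascii")
    return head + b"\r\n" + extra + sep + body


# What a browser sends over plain HTTP: readable, editable bytes.
request = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"\r\n"
)

# The value here is invented; a real tracker would use a per-subscriber ID.
modified = inject_header(request, "X-UIDH", "hypothetical-subscriber-id")
print(modified.decode("ascii"))
```

Neither the browser nor the server can tell that the extra line was not part of the original request. Over HTTPS the intermediary sees only encrypted bytes, so there is nothing to splice into.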

Verizon is not alone. It's a safe bet that your ISP is doing something similar. Comcast's wifi service [already does](http://arstechnica.com/tech-policy/2014/09/why-comcasts-javascript-ad-injections-threaten-security-net-neutrality/), as does [AT&T's](http://arstechnica.com/information-technology/2015/03/atts-plan-to-watch-your-web-browsing-and-what-you-can-do-about-it/3/) (you can opt out, for a fee). What your ISP does with this data is less well known, but it's a big part of why Google wants the web to move to HTTPS.

When you communicate in plain text over the network you have to assume that someone is, at the very least, watching, and very probably injecting some tracking code to record your requests.

An encrypted connection, on the other hand, is not plain text that anyone can read. It's ciphertext, and there is no practical way to read or manipulate ciphertext without the encryption keys. Score one for HTTPS, which can guarantee that you are getting the content your browser requested.

HTTPS also prevents the kind of censorship that happens at the state or ISP level. Examples of this abound as well. Russia, for instance, wanted to ban a Wikipedia article (about [charas hashish](https://en.wikipedia.org/wiki/Charas)), but because Wikipedia is served over HTTPS there's no way to see which page visitors are requesting. Russia was faced with a choice: ban all of Wikipedia or none. It [opted for none](https://www.eff.org/deeplinks/2015/08/russias-wikipedia-ban-buckles-under-https-encryption).

Score another one for HTTPS, because as it turns out unencrypted networks do not, as early web enthusiasts liked to say, "see censorship as damage and route around it". In fact unencrypted networks make censorship very easy, just reach in and block what you want, change what you want. But with HTTPS the network doesn't actually see anything and that's a good thing.

Having an HTTPS connection offers one other thing that benefits users -- authentication. 

Knowing that no one else on the network can read or tamper with your traffic is good; that gets you secrecy and integrity. But you also want to verify that the site you're visiting is actually the site you want to visit.

To authenticate your connection to the site you're trying to visit your browser maintains a list of known, trusted certificate authorities. When your browser requests a secure page it gets the page's security certificate, which contains a chain that leads back to a certificate authority. If that authority matches an authority known to your browser then your browser will trust that the site you're connecting to is who it claims to be. If that sounds a bit weak to you, you're not alone. This is currently the biggest problem with HTTPS.
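The same check can be sketched outside the browser with Python's standard `ssl` module, which follows the model described above: a default context loads the system's trusted certificate authorities, requires a valid chain and checks that the certificate matches the hostname. The `peer_certificate` helper is a hypothetical name used for illustration, and it needs network access, so it is defined here but not called.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    # create_default_context() loads the system's trusted CA list and
    # turns on both chain verification and hostname checking.
    return ssl.create_default_context()


def peer_certificate(host: str, port: int = 443) -> dict:
    """Connect to host and return its validated certificate as a dict.

    Hypothetical helper. The handshake raises
    ssl.SSLCertVerificationError if the chain does not lead back to a
    trusted authority or the certificate does not match the hostname.
    """
    ctx = make_verifying_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


ctx = make_verifying_context()
# Both checks are on by default for client connections.
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

Note what the check does and does not establish: it confirms that some authority on the trusted list vouched for the certificate, which is exactly why the size and quality of that list matters so much.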

Behind the scenes what handles all the encryption and authentication is a bit of technology known as TLS, which is short for Transport Layer Security. In fact the full name of HTTPS is really HTTP over TLS. TLS is the successor to the now vulnerable Secure Sockets Layer (SSL), though to further complicate things you will often hear both referred to as "SSL". In the context of this article, HTTPS will refer to TLS connections.

TLS is made up of two layers, the TLS Record Protocol and the TLS Handshake Protocol. Together these two tools allow your web browser to securely connect to a validated site and encrypt all your communications thereafter.

Now that you know what HTTPS offers -- encryption and authentication -- it should hopefully be easy to see why your bank uses it, and why Gmail, Facebook, Twitter and any other site you log in to uses it, or should (if you log in to a site without HTTPS, stop visiting that site).

What's less immediately obvious is why *every* site on the web can benefit from HTTPS. How does HTTPS help some long archived, no longer maintained bit of ephemera from the early web? 

The answer, in many cases, is that it doesn't help. It does not benefit the site or its creator directly in many tangible ways.

It does, however, benefit the user connecting to the site, since they now know that what they see is actually data they requested from the site they wanted to visit (integrity and authenticity).

There's another beneficiary as well -- the network as a whole, and by extension, all of us using it.

Still, while there are clear benefits to HTTPS, it is not entirely without costs.

## What HTTPS Costs

There has been some pushback against the effort to move the web to all HTTPS, all the time. Most of the critics are worried about all the content out there that will never be ported to HTTPS -- what happens to it? Will HTTPS cost us the entirety of the early internet?

Read through Mozilla's bug report on the subject and you'll find quite a few people talking about this content as if it were somehow tainted. "It's time we start treating insecure connections as a Bug," writes one Mozilla developer on a bug report entitled "Switch generic icon to negative feedback for non-https sites." Mozilla is a big company, with many different voices, but even Mozilla's Richard Barnes, who is one of the main proponents of HTTPS (and editor of several specs at the W3C related to it), told me "to be completely frank, I don't care about URLs I care about secure connections."

The URLs Barnes is referring to are part of the debate surrounding HTTP vs HTTPS -- is HTTPS the answer or is there a way to upgrade HTTP? In the end, though, Barnes just wants to make sure that the web is secure, and he's not alone. The Chromium project has similar bug threads and outspoken HTTPS proponents.

Fortunately for us the web is not Mozilla's, not Google's, not even the W3C's. The web belongs to everyone who uses it and creates things for it.

Software developer and blogger Dave Winer [calls HTTPS](http://scripting.com/liveblog/users/davewiner/2015/12/18/0667.html) "expensive security theater". Winer writes, "I have a couple dozen sites that are just archives of projects that were completed a long time ago. I'm one person. I don't need make-work projects, I like to create new stuff, I don't need to make Google or Mozilla or the EFF or Nieman Lab happy."

Winer is not alone. In fact he's in very good company: no less than Tim Berners-Lee has [questioned the move to HTTPS](https://www.w3.org/DesignIssues/Security-NotTheS.html), going so far as to call it "arguably a greater threat to the integrity for the web than anything else in its history". Berners-Lee does think the web should be encrypted, he just doesn't like the way it's currently being done. Berners-Lee would like to see HTTP upgraded rather than shifting to the HTTPS protocol.

There are two massive costs to HTTPS that have to be borne by users. 

The first is the one Winer is concerned about -- HTTPS adds significant complications to the process of setting up a website. The second is the one Berners-Lee is concerned about -- we risk breaking links, billions of links.

It's easy for savvy developers to dismiss the first problem, that HTTPS adds considerable complexity. But what makes the web great is that you don't have to be a savvy developer to be a part of it. Anyone with a few dollars a month to spare can rent their own server space somewhere, throw some HTML files in a folder and publish their thoughts on the web. A few dollars more gets you a nice URL, but that's not strictly necessary.

Requiring sites to include a security certificate adds a significant barrier to entry to the web. 

Anyone who has put in the effort to get HTTPS working on even one site knows that it can be a tremendous hassle. Indeed this is probably the biggest obstacle to widespread HTTPS adoption among small site operators (that is, the bulk of the web).

Perhaps the most difficult part is actually obtaining a certificate, which, until very recently, was neither easy nor free.

Until the last six months there were only a handful of certificate authorities which did not charge to issue certificates -- the best known being StartSSL -- but if you ever needed to revoke your certificate for some reason there was a fee. In other words, the certificates are only free if nothing ever goes wrong. If something does go wrong and you need to revoke a dozen or more sites, these "free" certificates quickly get expensive.

This is one of the first problems that HTTPS proponents set out to solve. The EFF and Mozilla have partnered with some other big names to create Let's Encrypt, which offers free certificates -- yes, really free, no catches and you don't have to provide any identifying information to get one.  There's also a set of command line tools that make installing and configuring them pretty simple provided you have some basic sysadmin knowledge (and SSH access to your server).

That's not the end of the headache though. Once you have a certificate you have to install it and get your web server to serve it up properly. Again, assuming you have a basic sysadmin's knowledge this isn't too hard, though tweaking it until you get an A+ grade on [SSL Labs' test](https://www.ssllabs.com/ssltest/) can take many hours (and even top sites like [Facebook only score a B](https://www.ssllabs.com/ssltest/analyze.html?d=facebook.com)). I've been running my own website, building my own CMSes and running servers on the web for fifteen years and I can say without hesitation that getting HTTPS working on my site was the hardest thing I've done on the web. It was hard enough that, like Winer, I haven't bothered with old archived sites.

Over the long run Let's Encrypt is hoping to partner with popular web hosts in such a way that users looking to set up their own blog using a popular CMS like WordPress can get an HTTPS site up and running as easily as clicking a button. Things will, however, likely never be that simple for anyone who wants to take a more DIY approach, writing their own software.

Simplifying the process of setting up HTTPS makes the individual more dependent on tools built by others. Developer Ben Klemens has an [essay](https://medium.com/@b_k/https-the-end-of-an-era-c106acded474#.orxikg4xp) about exactly this dependency, writing that if "solving the problem consists of just starting a tool up, my sense of wonder has gone from 'Look what I did' to 'Look what these other people did', which is time-efficient but not especially fun."

It may seem trivial to developers employed by large companies solving complicated problems that taking the fun out of the web is a problem, but it is. If the web stops being fun for individuals it becomes solely the province of those companies. We are no longer creators of the web, but simple users.

Berners-Lee's caution is more immediately practical -- what happens to all those links to HTTP sites when all those sites become HTTPS? The answer is they break. There are quite a few proposals that would mitigate some of this at the browser level. When I asked Mozilla's Barnes about Berners-Lee's concerns he told me, "Tim has been a really useful contrarian voice. His views have driven the browser and web community to address concerns he has raised". 

As proof that Barnes actually does care about URLs, he's the co-editor of a W3C specification that aims to preserve all those old links and upgrade them to HTTPS. The spec is known as [HSTS priming](https://mikewest.github.io/hsts-priming/) and it works with another proposed standard known as [Upgrade Insecure Requests](https://www.w3.org/TR/upgrade-insecure-requests/) to offer the web a kind of upgrade path around the link rot that Berners-Lee fears.

With Upgrade Insecure Requests site authors could tell a browser that they intend all resources to be loaded over HTTPS, even if the link is HTTP. This solves the legacy content problem, particularly in cases where the content can't be updated, for example, The New York Times [archived sites](http://open.blogs.nytimes.com/2014/11/13/embracing-https/).
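In practice a site opts in by sending a `Content-Security-Policy: upgrade-insecure-requests` response header, after which the browser rewrites the page's HTTP resource URLs before fetching them. What follows is a simplified sketch of that rewrite, not the spec's full algorithm: `upgrade_url` and the example URLs are made up for illustration, and details such as explicit port numbers are left out.

```python
from urllib.parse import urlsplit, urlunsplit

# Response header a site can send to opt in to the behavior.
OPT_IN_HEADER = {"Content-Security-Policy": "upgrade-insecure-requests"}


def upgrade_url(url: str) -> str:
    """Rewrite an http:// URL to https://, leaving everything else alone.

    Simplified illustration only: explicit ports and the other special
    cases a browser has to handle are ignored here.
    """
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url  # already secure, relative, or a different scheme
    return urlunsplit(parts._replace(scheme="https"))


# Hypothetical links hard-coded into an old, unmaintained page.
legacy_links = [
    "http://example.com/img/logo.png?v=2",
    "https://example.com/style.css",
    "/relative/script.js",
]
upgraded = [upgrade_url(link) for link in legacy_links]
print(upgraded)
```

The appeal for archives is that the old markup never has to change. The page can keep its hard-coded HTTP links and the browser does the upgrading at load time.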

Both of these proposals are still very early drafts, but they would, if implemented, provide a way around one of the biggest problems with HTTPS -- breaking links.

At least some of the time. Totally abandoned content will never be upgraded to HTTPS, neither will content where the authors, like Winer, elect not to. This isn't a huge problem though because browsers will still happily load the insecure content. 

What Winer and others fear is that at some point browsers may stop loading HTTP content entirely. For now that's still a ways off, but Mozilla's plans make it clear that it is part of the future of Firefox. Mozilla's [FAQ](https://blog.mozilla.org/security/files/2015/05/HTTPS-FAQ.pdf) on the subject reads: "Q: Does this mean my unencrypted site will stop working? Not for a long time."

Browsers ceasing to load HTTP sites at all would be wrong. As Winer puts it, "the browser is broken. It has totally the wrong idea of its role."

At the same time, as developer and HTTPS proponent Eric Mill [writes](https://konklone.com/post/were-deprecating-http-and-its-going-to-be-okay), "we're deprecating HTTP and it's going to be okay."

## Why We Should Encrypt All The Things

The web needs encryption because the web's visitors need it. The web needs encryption because the network needs it to remain neutral. The web needs encryption because without it just browsing can turn you into an unwitting helper in a DDoS attack.

Several years ago I wrote a piece on the then nascent effort to get HTTPS more widely adopted. At the time I [wrote](http://arstechnica.com/business/2011/03/https-is-more-secure-so-why-isnt-the-web-using-it/) "For sites that don't have any reason to encrypt anything... HTTPS just doesn't make sense."

That was then. Now I think it does make sense to encrypt everything. 

In 2011, when I wrote that, the network of the web looked fairly benign (as Snowden's leaks revealed, it was not, but most of us had no way to know back then). Since that time the network has become hostile, incredibly hostile.

As Mill recently wrote, "I see companies and government asserting themselves over their network. I see a network that is not just overseen, but actively hostile. I see an internet being steadily drained of its promise to 'interpret censorship as damage'... In short, I see power moving away from the leafs and devolving back into the center, where power has been used to living for thousands of years."

Lack of encryption has created a web that's no longer in the user's control. The web has become a broad surveillance tool for everyone from the NSA to Google to Verizon. 

As Mill writes, without encryption the network becomes a tool for whoever owns the largest nodes. We the people, the small creators of this thing we call the web, are not just at the mercy of the network owners, we're the victims of their whims.

My personal website does not ask you to log in, it loads no third-party scripts, ad networks or any other code. Yet without encryption I have no way to ensure that some other party isn't inserting code of their own. As Hoffman-Andrews says, anyone could "insert their own ads, their own tracking cookies, they can insert malware and do their own tracking". In other words, I would like to make sure no one is tracking you when you visit my site, and that you see no ads, but I can't. Unless I use HTTPS.

Think no one is doing that to your site? Think again. ISPs are and will likely be doing more of this in the future, particularly mobile service providers. Their primary responsibility is to their shareholders and it would be negligent of them not to increase profits by increasing tracking.

It's worth noting here that this kind of manipulation is very likely at the heart of Google's love of HTTPS. Google did not respond to my inquiries for this article, but it's a kind of open secret that ISPs harvest search queries. Without HTTPS it's pretty easy for ISPs to track not just search queries but which results users clicked on, which is vital information for building a better search engine. In other words, info Google would prefer its potential competitors don't get.

Winer calls out Google specifically and he's not the only one to do so. Yes, Google is acting in its own best interests and Winer is right to question the motives of a company so massive it has the power to [potentially control elections](https://aeon.co/essays/how-the-internet-flips-elections-and-alters-our-thoughts). However, in this case, Google's interests are aligned with the web at large (for now). Google doesn't want that data captured and sold, but remember that data is actually about you. It's your data first and foremost and regardless of what you think about Google gathering it, you certainly don't want it bought and sold by others.

The flip side to this is that if your site does serve up ads and you want to make sure that no one is stripping out those ads -- which, with companies like [Shine](https://www.getshine.com/), is starting to happen at the network level -- HTTPS is also your friend.

The second and considerably more alarming network attack that's possible without HTTPS is what's become known as [Great Cannon](http://arstechnica.com/security/2015/04/meet-great-cannon-the-man-in-the-middle-weapon-china-used-on-github/). Great Cannon is a very sophisticated attack (for full details see Citizen Lab's [write-up](https://citizenlab.org/2015/04/chinas-great-cannon/)), but the short story is that someone hijacked a bit of JavaScript served up by Chinese search giant Baidu and added a payload to it that made frequent requests to a target website. Great Cannon essentially turned unsuspecting browsers into part of a DDoS attack.

This is what Mill means when he says the network is actively hostile. With Great Cannon it becomes so hostile it turns you, unknowingly, into a DDoS attacker.

The only way to stop attacks like Great Cannon, or network tampering like what Verizon and others are doing, is to encrypt your traffic. This is why the web needs HTTPS.

Which brings us back to today. HTTPS is becoming more and more common, easier and easier for anyone to get up and running. Where does it go from here?

## What Happens Next

What happens next is that browser vendors are going to start pushing the web to HTTPS by limiting what HTTP sites can do and changing the URL icons from positive feedback to negative feedback. The carrot is being replaced by the stick.

The Chromium project has already announced plans to [mark HTTP connections as insecure](https://www.chromium.org/Home/chromium-security/marking-http-as-non-secure). Mozilla will do [roughly the same](https://blog.mozilla.org/security/2015/04/30/deprecating-non-secure-http/) with Firefox. Both also plan to limit many HTML APIs to HTTPS only, starting with the geo-location APIs, hardware access APIs and anything else that would be a security risk over unsecured connections.

The icon change will eventually mean that browsers show nothing at all for secure sites and display a large red X in the URL bar when you visit an HTTP site.

It's not difficult to imagine a day and age when browsers treat HTTP sites the way they treat suspected malware sites now and simply do not load them. To be clear, that's not happening right now. But it would be foolish to assume that it never will.

It's tempting to see this as hostile to publishers -- the message has become: fall in line with HTTPS or, as Winer writes, the browsers will "make sure everyone knows you're not to be trusted."

However, what the broken lock is really saying is that your browser can't guarantee that the content you're reading hasn't been tampered with. It also can't guarantee that you aren't currently part of a DDoS attack against a site you've never even heard of. It also can't guarantee that you're connected to the site you think you're connected to.

All of these things have always been true when you connect to an HTTP site, the only thing that's changing is that your browser is telling you about it. So long as browsers stop there the current plan seems well-suited to bringing more security to the web. 

Giving users greater secrecy, ensuring data integrity in transit, and providing a means (flawed though it may be) of establishing authenticity empower the user and help make the network decidedly less hostile than it is right now. Abuse will still happen. Surveillance will still be possible but, as Mill notes, attacks will "change from bulk to targeted" and the network can return to being just a dumb pipe.

It would be an egregious abuse of their place in the web ecosystem for browsers to stop loading HTTP content entirely, but so far that's not happening. If it does, the web should resist it. Warnings help users make informed decisions; prohibitions help no one.

The web has always been a messy, complicated thing. The last thing it needs now is an artificial binary construct of "good" and "bad" as determined by browser vendors.