When Ubuntu 18.04 launched last month it included a new little welcome application that runs the first time you boot into your new install. The Welcome app does several things: it provides a quick overview of the new GNOME interface, offers to set up Livepatch (for kernel patching without a reboot) and offers to let you opt out of Canonical's new data collection tool. In <a href="https://www.theregister.co.uk/2018/04/27/ubuntu_1804/">my review</a> I called the opt-out a ham-fisted decision, but did note that if Canonical wanted to actually gather data, opt-out was probably the best choice.

I did not anticipate the firestorm of controversy this little data-gathering tool was going to create within the Linux community. Unfortunately the controversy is built on misinformation and outright lying. I've seen several articles, blog posts and especially YouTube videos with so little grasp of the truth that I think, before we go further, there's one thing that must be said very clearly: Canonical is not spying on you, full stop.

Canonical wants to know a little bit about the hardware its users run Ubuntu on. There's no spying. Canonical is asking, and if you don't want to tell it, you can easily opt out.

If you are willing to share a little bit of data about your hardware you'll be helping make Ubuntu better in the future. In order to best focus development efforts, which are limited, Canonical likes to know basic data about its users -- what size screen they use, which chip is in their laptop, how much RAM they have, which flavor of Ubuntu they're using, whether they enable Livepatch and a few other things.

In the 18.04 welcome app there's a link to view the actual JSON file with the data you'd be sending to Canonical. It doesn't get any more transparent than that. In my initial review I noted that the whole opt-out UI was very well done, making it simple to see exactly what data Canonical was gathering. I don't know of any other data collection by a large company that offers that level of control. Firefox, which has been shipping with an opt-out data collection tool for many years, does not, so far as I know, give you access to the actual data file that's sent.

Unfortunately, I also fell victim to something that I think is rampant these days, especially among more tech-savvy users: paranoia about things that report, well, anything, to anyone. I also said that I had opted out of Canonical's data collection. I did that because I, like you, am paranoid about sending my data to anyone for any reason. I even had a wisecrack about anonymization not working that my editor wisely removed, but that's where I am coming from, and I think that kind of deep-seated suspicion is common in our community.

Ever since Edward Snowden confirmed so many once-outlandish conspiracy theories, the technically savvy audience of Linux users, those who care about their privacy, has been (understandably) ultra-paranoid about any data collection. So when people hear that Ubuntu 18.04 is collecting data, it sounds bad.

But the problem here is not Canonical, nor is it data collection itself. In this case the data is very simple, totally anonymous (the server doesn't even record the IP it's sent from), and most importantly, clearly disclosed so that you can decide for yourself if you're comfortable sharing it (if you've already installed and can't find your way to the JSON file, <a href="https://paste.ubuntu.com/p/xWxbbDGBfn/">here's a sample</a>).

No, the problem here is actually much deeper and more difficult to solve. The problem, the thing that generated the controversy, is the need for click-bait headlines on YouTube and elsewhere in this day and age of advertising-driven small publishers. Combine the deep-rooted, well-founded suspicion of the average Linux user with the current money-making models of sites like YouTube, a platform where the faster you release a video the more hits, and therefore more money, you'll make, and you've pretty much got a recipe for needless controversy. Linux is hardly unique in falling victim to that.

YouTubers and unscrupulous journalists have become experts at manipulating our fears, playing off them to generate clicks that they turn into a few pennies while throwing Ubuntu, Canonical and the larger Linux community -- that's you and me -- under the bus.

Canonical makes an easy target for this sort of thing because it's the closest thing Linux has to a household name.

The fallout is that now half a dozen videos and articles litter the web, spreading, at best, misinformed half-truths and, in most cases, outright lies (which, depending on where you live, could leave you open to slander lawsuits, my dear enraged YouTuber). They also spread incorrect technical solutions that may well screw up your installation should you be so foolish as to blindly type in terminal commands you find on the internet. Canonical's Martin Wimpress (who, among other things, is responsible for Ubuntu MATE) pointed out in a recent episode of Jupiter Broadcasting's <a href="http://linuxunplugged.com/249">Linux Unplugged show</a> that several of these videos claim the solution is to remove a package which, wait for it... has nothing to do with data collection.

The "solution" these uninformed, technically incompetent YouTubers and bloggers are pushing would be ridiculous even if they ever do figure out how to use dpkg-query to see which packages own which files. It would be ridiculous because if you uninstall rather than opt out, Canonical never knows you opted out, and you've lost your chance to let Canonical know you didn't like the data collection.
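For anyone who would rather check than guess, here is a minimal sketch of the dpkg-query lookup mentioned above, wrapped in Python. The function names are my own, purely for illustration; the only real tool involved is `dpkg-query -S`, which searches installed packages for a file path. The parser assumes the plain "package: path" output form and does not handle diversion lines.

```python
import subprocess


def parse_owner_line(line):
    """Split one line of `dpkg-query -S` output into (packages, path).

    Assumes the plain form "pkg-a, pkg-b: /some/path"; diversion lines
    are not handled in this sketch.
    """
    owners, sep, path = line.partition(": ")
    if not sep:
        raise ValueError(f"unexpected dpkg-query output: {line!r}")
    return [name.strip() for name in owners.split(",")], path.strip()


def owning_packages(path):
    """Ask dpkg which installed package(s) own `path` (Debian/Ubuntu only)."""
    result = subprocess.run(
        ["dpkg-query", "-S", path],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if no package owns the path
    )
    return [parse_owner_line(line) for line in result.stdout.splitlines() if line]
```

On an Ubuntu machine, calling `owning_packages("/bin/ls")` should return the package that ships that file. Point it at whatever file a video tells you to delete and you can see for yourself which package you'd actually be removing.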

Predictably, these bloggers and YouTubers will be the first to complain when the next release of Ubuntu doesn't offer them anything new or helpful, and will never consider that perhaps their decision to not tell Canonical anything useful might be part of the reason. At best perhaps they'll have pulled in enough YouTube pennies to finally move out of their parents' basements.

That's not the real problem here, though. The real problem is that in letting these unscrupulous writers create this tempest in a teapot, we're scaring other projects away from doing the same sort of data collection. We're forcing developers to work in the dark and then complaining when we don't like the results.

Take GNOME, for instance. It's rather famous for removing features (it just removed the ability to launch apps from the file browser). Perhaps if GNOME started gathering some basic data on a larger scale about how people use GNOME, the project would make different decisions. Small developers have an even harder time with this sort of thing, and if they think they're going to have their projects labeled as "spyware" and angry YouTube videos posted, they're never going to even try getting data. They're going to continue developing in the dark and all of us will suffer for it.

I'm not suggesting you should automatically opt in to every bit of data collection every piece of software wants to do. There is a middle ground: some companies are doing it right, some are doing it wrong. I don't even know if you should opt in to Canonical's. That's something you need to decide for yourself.

But you decide by reading the dialogs, looking at the actual data being sent, and considering the companies that want the data. Don't let ranting videos and articles playing off your fears make your decisions for you. Stop clicking on them at all.

In my case I went back and opted in to Ubuntu's data collection because I use Ubuntu (both Kubuntu and Ubuntu Server) and I want to help it get better.