Facebook Doesn't Really Know Me at All
The Cambridge Analytica scandal, which has cost Facebook some $58 billion in market capitalization compared with the start of the year, is like a nesting doll. The British firm's ploy to obtain Facebook information is the smallest figurine. It's important for the 21st-century citizen to understand the other layers too.
The CA figurine is enclosed in a bigger one: the data collected about us by Facebook. The next layer is all the data Big Tech has about people. Whether or not we're paranoid enough to worry about the existence of another layer -- the surveillance machine that's being built by the corporate world, politicians and governments using these data along with those contained in government databases -- we have the right to know what's being collected, how and why.
Superficially, big tech companies appear to be open about their data collection practices. One can easily download an archive from Facebook, Google and Twitter. Amazon doesn't offer such an opportunity, but, if you're patient enough, it'll send you the data, and it tells you in general what types of information it collects and shares. There are settings to turn off various data-harvesting services and ways to wipe some of the data. But if you get the archives, as I have, you may be surprised to meet the person who's supposed to be your digital doppelganger and intrigued by what the tech companies might be doing with him or her.
Meeting the doppelganger will take some perseverance. Among the more than 3 gigabytes of data Google sent me (yes, that little: I kept Gmail and Google Photos out of my request and I'd recently wiped the search and browsing histories from my Google account), there was a record of my movements -- a location history. Entries in it look like this:
The timestamp is in UNIX time, which can easily be converted into a normal date. Google knows where I've been since Sunday, Feb. 24, 2013. One needs to divide the coordinates by 10 million to get decimal degrees, the accepted format, which can be entered into a map. The ones in my Google file point to the village of Malaya Vishera in Russia's Novgorod region. I have never been there.
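For readers who want to try the conversion themselves, here is a minimal Python sketch of the two steps just described. It rests on assumptions: the field names `timestampMs`, `latitudeE7` and `longitudeE7`, the millisecond resolution of the timestamp, and the sample values are all hypothetical stand-ins chosen for illustration, not an actual entry from the file.

```python
from datetime import datetime, timezone

# Hypothetical entry: the field names and values are assumptions made for
# illustration, not a record copied from a real location history.
sample_entry = {
    "timestampMs": "1361664000000",  # assumed: UNIX time in milliseconds
    "latitudeE7": 588500000,         # assumed: degrees multiplied by 10 million
    "longitudeE7": 322200000,
}


def decode_entry(entry):
    """Return (UTC datetime, latitude, longitude) in human-readable units."""
    # UNIX time counts from Jan. 1, 1970 (UTC); convert milliseconds to seconds.
    seconds = int(entry["timestampMs"]) / 1000
    when = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # Dividing by 10 million turns the stored integers into decimal degrees.
    latitude = entry["latitudeE7"] / 10_000_000
    longitude = entry["longitudeE7"] / 10_000_000
    return when, latitude, longitude


when, latitude, longitude = decode_entry(sample_entry)
print(when.isoformat(), latitude, longitude)
```

The resulting pair of decimal coordinates can be pasted into most online maps as "latitude, longitude".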
Unlike Google, Facebook provides a convenient interface to view the data files. With enough patience, you can discover all the Messenger conversation histories that you haven't specifically erased -- some with people you have blocked or forgotten; there'll also be every ad you've recently clicked on and all the likes and comments you're ashamed of. But, perhaps most intriguingly -- because it's the data closest to the heart of Facebook's business model, based on micro-targeting commercial messages to you -- you'll see ad topics with which the system identifies you. Mine are as follows:
- Nana (manga)
- The New York Times
- Nassim Nicholas Taleb
- Amnesty International
- Angela Merkel
- Funny Images
I'm bothered by the idea that Big Tech thinks of me as a Buddhist who sits in Malaya Vishera and looks at mangas (who on earth is Nana, anyway? I fear that googling her may reinforce Big Tech's impression that I'm a fan).
Then again, it may go with my psychological profile as compiled by the likes of Cambridge Analytica. The profile I get from the University of Cambridge Psychometrics Center, the origin of the research that informed CA's methods, says, based on my Facebook and Twitter feeds, that I'm 27 years old. I'm actually 46, so I won't argue childishly with the psychological traits ascribed to me by the profile. Perhaps some of them fit, and perhaps they fit the manga-loving Taleb fan from the Novgorod region -- surely they fit someone.
I do give up a lot of data to the digital giants. Some of it is fake, some real, some is distorted by the devices used to harvest the data (I suspect the Malaya Vishera bit falls into this category) and some is ruined by joint use. My younger daughter, for example, buys e-books on Amazon more frequently than I do, because her books are shorter, and that skews the ad targeting considerably: Amazon seems to think I'm eight.
What I really want to know, however, is how exactly the tech firms use what I give them.
Google says it uses the location data, among other things, to give me traffic information about my regular commute -- and it does so faithfully every day as if I drove to work and back. I don't; I use public transportation. But I also know that if my wife were the jealous kind, she might someday demand to see specific Google location data for a certain day -- and I'd struggle to explain what she'd find.
It's not enough for the tech firms to give users access to the collected raw data, which most users will struggle to interpret anyway. (These data dumps are so inconveniently structured that one gets the feeling they are designed to obfuscate.) What I'd like is, for every type of data, a full disclosure of the purposes for which it is processed and the conclusions drawn from it by algorithms.
Essentially, I'd like to be able to push a button next to a "targeted" ad -- or to any algorithmic suggestion of an interaction -- that would immediately reveal exactly what information I provided to make the algorithm decide I must see it. Then, I'd like to be able to erase that information immediately. I'd also like, with my data dump, a detailed explanation of what the company does with each data type that it harvests and stores. I just can't think of a reason why Google would need five years of my location data, for example.
Controlling my information ultimately means just that: I want to be the editor of my data as provided to business entities, and I want much better tools for that job than the ones provided today. Google, Amazon, Facebook and Twitter know how to design a friendly user interface; they just haven't done it here.
This may have been wishful thinking a year ago. But thanks to the CA scandal, the tide is turning and regulators are getting fidgety. Facebook announced today that it is giving users more control over their privacy through more accessible settings, which will no longer be scattered through 20 or so different pages. That's a baby step. If Facebook and other data harvesters don't go much further, there's now a real chance they will be forced to.
To contact the editor responsible for this story:
Therese Raphael at firstname.lastname@example.org