On this blog, I have repeatedly discussed the subject of online privacy, either by commenting on current affairs in peer-to-peer networks such as TorrentSpy or by talking about the future of privacy in light of the data openness we have been experiencing lately.
Yesterday, I visited one of my favorite annual exhibitions in Berlin, the transmediale, whose topic this time was “conspire”. Among other things, I attended a conference called “Web 3.0: Conspiring to keep the Net Public”, hoping for a discussion of the evolution of the upcoming semantic web. To my surprise, the talk concentrated mostly on the privacy aspect of the web. To be honest, the conference as a whole didn’t blow my mind (it was hard to follow), but the presentation by Seda Gürses was a pleasant exception.
She shared some very interesting insights on privacy in cyberspace, which I would like to discuss here.

So what is privacy?

In her presentation, Seda showed a formula for privacy, which says that:

privacy = the right to be left alone / concealment of data x k-anonymity.

This means that privacy consists of our fundamental need and right to be left alone, which can be achieved by concealment of data and k-anonymity. Let’s get a bit more specific with the terms.

Concealment of Data

Whenever you sign up on a site, there is always a form with asterisks next to the fields you must fill in (your mail, your age, your zip code, etc.), and there is always this little box you must tick called “I have read the Terms of Service and agree with the policy”. Now, if the service is a commercial one, it may pass this information on to the so-called ‘data miners’.
They are marketing people who collect vast amounts of information and then plan their marketing accordingly.

They say, for example: 50% of the Facebook users who have installed the vampire application are buying Dungeons and Dragons books on Amazon. And so they put an ad for D&D next to the vampire application.
Data mining vs. privacy is an important issue that covers not only the online world but also political subjects.

But it’s not that there is no solution. Bruce Schneier noted:

there are many ways to analyze data without knowing details of the data, [...] it’s just that there is little incentive to use them.

Concealment of data means that information such as name, age, location, etc. remains private. But how can this be achieved?

K-Anonymity

That’s where k-anonymity comes in handy. It keeps both data miners and privacy advocates satisfied. K-anonymity simply says that:

A release provides k-anonymity protection if the information for each person contained in the release cannot be distinguished from at least k-1 individuals whose information also appears in the release.
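To make the definition a bit more tangible, here is a minimal Python sketch that measures the k of a release. The function name `k_of_release`, the sample rows and the choice of which fields count as identifying (often called quasi-identifiers in the literature) are my own assumptions for illustration, not something from the talk.

```python
from collections import Counter

def k_of_release(rows, quasi_identifiers):
    """Return the largest k for which the release is k-anonymous.

    Rows that share the same values in all quasi-identifier fields cannot
    be told apart, so k is the size of the smallest such group.
    """
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0

# Hypothetical release with already generalized ages and zip codes.
release = [
    {"age": "20-30", "zip": "10xxx", "interest": "vampires"},
    {"age": "20-30", "zip": "10xxx", "interest": "D&D"},
    {"age": "30-40", "zip": "12xxx", "interest": "chess"},
    {"age": "30-40", "zip": "12xxx", "interest": "D&D"},
]

print(k_of_release(release, ["age", "zip"]))
```

With age and zip code as the identifying fields, every person here shares their combination with one other person, so the release is 2-anonymous. If the interest column were treated as identifying too, every row would be unique and k would drop to 1.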

K-anonymity can be achieved by two methods:

  1. Generalization.
    Instead of saying that a subject is 26 years old, you say the subject is 20-30 years old. Instead of saying he lives in 10247 Berlin, you say 10xxx Berlin. And so on.
  2. Perturbation.
    Another interesting way is perturbing the data. This means that:

The actual value can be replaced with a random value out of the standard distribution of values for that field. In this way, the overall distribution of values for that field will remain the same, but the individual data values will be wrong.

In other words, you can change the individual data in such a way that the collective data still remains the same.
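Both methods can be sketched in a few lines of Python. The helper names (`generalize_age`, `generalize_zip`, `perturb`) and the bucket sizes are hypothetical. For the perturbation I simply shuffle the column among the records (sometimes called data swapping), which is only one simple way to keep the overall distribution intact and not necessarily the exact method the quote has in mind.

```python
import random

def generalize_age(age, width=10):
    """Replace an exact age with a range, e.g. 26 -> '20-30'."""
    low = (age // width) * width
    return f"{low}-{low + width}"

def generalize_zip(zip_code, keep=2):
    """Keep only the first digits of a zip code, e.g. '10247' -> '10xxx'."""
    return zip_code[:keep] + "x" * (len(zip_code) - keep)

def perturb(values, rng=None):
    """Return the same values in random order.

    Each record ends up with a value taken from the field's own
    distribution, so the collective data is unchanged while the
    individual entries are (mostly) wrong.
    """
    if rng is None:
        rng = random.Random()
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled

print(generalize_age(26))        # 20-30
print(generalize_zip("10247"))   # 10xxx

ages = [26, 31, 45, 19, 52]
print(sorted(perturb(ages)) == sorted(ages))  # True: same distribution
```

Note that the shuffle only hides which value belongs to whom within one field; whether that is enough depends on what else is in the release.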

The right to be left alone

I left this one for the end on purpose. We all take this right for granted, and in a way it is a given. But if you think about it, its boundaries are very flexible. The issue of privacy is not only about concealing data, but also about negotiating what is private and what is not. Years ago, it was still debated whether domestic violence was a private matter or not.

The best question ever

Seda Gürses presented an interesting theory (with a cute video), which concluded with the best question ever. It is a theory by a Swedish scientist, whose name she sadly didn’t remember.

If we really want to stay private and anonymous, concealing our personal information is surely not enough. There are many parameters that distinguish us from one another.

True and absolute anonymity could only be achieved if:

  • Everyone wore an identical box, as wide and as tall as the widest and tallest person on earth, so that nobody could be told apart by their external characteristics.
  • Everyone walked at the same pace, so that nobody could be told apart by the way they walk.
  • Everyone left the house at the same time, so that no one could identify another by their schedule.
  • Everyone took a different route each time they went out, so that no categorization by habit would be possible. And so on.

Also, to avoid loneliness and isolation, people would be allowed to have a pet.

So in a world of true anonymity, the only thing distinguishing one person from another would be their pet.
The question is: do we really want to live in a world of true anonymity?
