With a report yesterday highlighting how many government IT systems are fundamentally insecure, it's perhaps a good time to reconsider just how safe the "anonymous" data that organisations keep on you really is. An interesting article in the Guardian shows why we shouldn't trust organisations that state they will only keep "anonymised" data (i.e., all identifying information is removed) or "pseudonymised" data (i.e., all identifiers are replaced with pseudonyms). It turns out that if you can match features across two data sets, one of which has been anonymised and one of which hasn't, it's typically very easy to reveal the true identities of the people in the anonymised data.
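To make the matching idea concrete, here is a rough Python sketch of that kind of linkage. Everything in it is made up for illustration: the records, the names, and the choice of age, sex and postcode area as the shared features are all my assumptions, not anything taken from the Guardian article.

```python
from collections import defaultdict

# Hypothetical features that appear in BOTH data sets (an assumption for this sketch).
QUASI_IDENTIFIERS = ("age", "sex", "postcode_area")

# Invented "anonymised" records: no names, but the shared features remain.
anonymised = [
    {"age": 34, "sex": "F", "postcode_area": "LS6", "diagnosis": "asthma"},
    {"age": 61, "sex": "M", "postcode_area": "YO1", "diagnosis": "diabetes"},
    {"age": 34, "sex": "M", "postcode_area": "LS6", "diagnosis": "migraine"},
]

# Invented public records that carry names alongside the same features.
public = [
    {"name": "Alice Example", "age": 34, "sex": "F", "postcode_area": "LS6"},
    {"name": "Bob Example", "age": 61, "sex": "M", "postcode_area": "YO1"},
    {"name": "Carol Example", "age": 47, "sex": "F", "postcode_area": "HG2"},
    {"name": "Dan Example", "age": 34, "sex": "M", "postcode_area": "LS6"},
    {"name": "Ed Example", "age": 34, "sex": "M", "postcode_area": "LS6"},
]


def key(record):
    """Return the tuple of shared features used to match records."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(anonymised_records, public_records):
    """Pair each anonymised record with a name when exactly one person matches."""
    names_by_key = defaultdict(list)
    for person in public_records:
        names_by_key[key(person)].append(person["name"])

    reidentified = []
    for record in anonymised_records:
        candidates = names_by_key.get(key(record), [])
        if len(candidates) == 1:  # a unique match reveals the identity
            reidentified.append((candidates[0], record))
    return reidentified


matches = link(anonymised, public)
for name, record in matches:
    print(name, "->", record["diagnosis"])
```

In this toy data the first two records each match exactly one named person, so they are re-identified, while the third matches two people and stays ambiguous. The more features the two data sets share, the fewer of those ambiguous cases you would expect to be left.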
Consider this example from a few months ago: the UK police revealed that an 83-year-old male Australian entertainer living in the UK was "helping them with their enquiries into sexual offenses." The police had followed standard practice in protecting the man's identity, and their press release was anonymous. Except, of course, that Rolf Harris was the only 83-year-old male Australian entertainer living in the UK, so it took the lazy-web about 2 seconds to figure that out. Obviously this is an extreme example, but you see the principle. Next time an organisation tells you "it only keeps anonymous data", think again.