AOL released the search data of over 600,000 anonymous users: more than 19 million searches from March to May, a total of 2GB uncompressed. It's all over the internet now.
For some reason they didn't realize that the search data contains personally identifiable information: people searched for their own names, SSNs, credit card numbers and other things on the internet.
This "screw up" (that's what AOL itself called it) is widely seen as a gigantic privacy violation. A dataset this big is a gold mine for data mining, for both legal and less legal purposes.
One example: the click-through rate (CTR) for each search result position.
Quote:
1 - 22.73%
2 - 6.40%
3 - 4.53%
4 - 3.24%
5 - 2.61%
6 - 2.14%
7 - 1.81%
8 - 1.60%
9 - 1.51%
10 - 1.59%
11 - 0.35%
12 - 0.30%
13 - 0.28%
14 - 0.26%
15 - 0.25%
16 - 0.21%
17 - 0.19%
18 - 0.18%
19 - 0.17%
20 - 0.16%
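For anyone wondering how numbers like these can be pulled out of the log, here is a rough Python sketch. It is not the method used for the quoted figures, which I don't know. It assumes the files are tab-separated with a header row of AnonID, Query, QueryTime, ItemRank and ClickURL, and it assumes "CTR" means clicks at a given position divided by all logged rows (including searches with no click). The function name ctr_by_rank is made up for this example.

```python
import csv
from collections import Counter


def ctr_by_rank(lines, max_rank=20):
    """Return {rank: percent of all logged rows with a click at that rank}.

    `lines` is any iterable of text lines (an open file works), assumed to be
    tab-separated with a header containing an "ItemRank" column.
    """
    # QUOTE_NONE so that stray quote characters inside queries are kept as-is.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    clicks = Counter()
    total_rows = 0
    for row in reader:
        total_rows += 1
        # Rows without a click have an empty or missing ItemRank field.
        rank = (row.get("ItemRank") or "").strip()
        if rank.isdigit():
            clicks[int(rank)] += 1
    if total_rows == 0:
        return {}
    return {r: 100.0 * clicks[r] / total_rows for r in range(1, max_rank + 1)}


# Tiny made-up sample: four searches, three of them with a click.
sample = [
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL\n",
    "1\tfoo\t2006-03-01 00:00:01\t1\thttp://example.com\n",
    "1\tbar\t2006-03-01 00:05:00\t\t\n",
    "2\tbaz\t2006-03-02 10:00:00\t2\thttp://example.org\n",
    "2\tqux\t2006-03-02 10:01:00\t1\thttp://example.net\n",
]
print(ctr_by_rank(sample, max_rank=3))  # {1: 50.0, 2: 25.0, 3: 0.0}
```

Dividing by all rows rather than by clicks only is a guess on my part; it would explain why the quoted percentages add up to roughly half instead of 100%.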
http://technology.guardian.co.uk/news/story/0,,1839859,00.html
http://www.itwire.com.au/content/view/5236/106/
http://www.informationweek.com/security/showArticle.jhtml?articleID=191900569&subSection=Viruses+and+Patches
I might be a bit biased against AOL (they block our emails), but I mean ... how low can you go?
