Search engines should not hold on to personal data for longer than six months because of privacy concerns, the European Commission's data protection watchdog has recommended in a report.
In a draft document issued following an extensive inquiry into data retention, the commission's advisory body on data protection said: "Search engine providers must delete or irreversibly anonymise personal data once they no longer serve the specified and legitimate purpose they were collected for."
"The consent of the user must be sought for all planned cross-relation of user data, user profile enrichment exercises," the document further reads, noting that collectors have "insufficiently explained" to their users what they are retaining data for.
Furthermore, the report calls for the consent of the user to be sought "for all planned cross-relation of user data and user profile enrichment exercises"—in other words, joining up bits of user data to deliver other services or to develop a profile of the web-surfer.
The advisory body was established to advise the European Commission and make recommendations on data protection. It is unusual for the commission not to heed its advice, although any attempt to act on this recommendation is set to be sharply opposed by the major search engine providers, such as Google and Yahoo, which retain such data for up to 18 months.
Search engines collect information from every search made using their service, as well as the address of the computer—the 'IP address'—that has made a particular search. This combined data, or search history, is a rich mine of user information, which can be tracked using a parcel of text called a 'cookie' and is sometimes combined with other data from third parties.
Cookies deployed by search engines generally contain information about the user's operating system and browser, and a unique identification number for each user account, permitting a more accurate identification of the user than the IP address alone.
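To illustrate why a cookie identifier can single out a user more reliably than an IP address, here is a minimal sketch in Python. The log entries, addresses and cookie names are invented for illustration and do not reflect any search engine's actual log format. It groups the same three searches first by IP address and then by cookie ID.

```python
from collections import defaultdict

# Hypothetical log entries: (ip_address, cookie_id, query). All values are invented.
LOG = [
    ("192.0.2.10", "cookie-A", "cheap flights to lisbon"),
    ("192.0.2.10", "cookie-B", "symptoms of flu"),      # second person, same shared IP
    ("198.51.100.7", "cookie-A", "hotels in lisbon"),   # first person, different network
]

def group_queries(log, key_index):
    """Group queries by the field at key_index (0 = IP address, 1 = cookie ID)."""
    groups = defaultdict(list)
    for entry in log:
        groups[entry[key_index]].append(entry[2])
    return dict(groups)

by_ip = group_queries(LOG, 0)
by_cookie = group_queries(LOG, 1)
```

In this made-up sample, grouping by IP address mixes two people's searches on the shared address and misses the link between the two Lisbon queries, while grouping by cookie ID keeps each browser's searches together even when the network changes.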
The quality of service
Google, the Internet's dominant search engine, quickly responded to the recommendation on its public policy blog, arguing that data retention allows it to improve search results and prevent fraud.
"We believe that data retention requirements have to take into account the need to provide quality products and services for users, such as accurate search results, as well as system security and integrity concerns," reads a blog post, filed by Peter Fleischer, Google's global privacy counsel.
"We have recently discussed some of the many ways that using this data helps improve users' experience, from making our products safe, to preventing fraud, to building language models to improve search results.
The US-based Electronic Privacy Information Center, which urged the European Parliament in January to protect the privacy of search histories, welcomed the report, noting in particular that the "opinion further holds that European privacy laws generally apply to search engines 'even when their headquarters are outside [Europe]'."
Daniel Brandt, a 54-year-old webmaster from San Antonio in the United States and one of Google's biggest privacy critics, discovered in 2002 that the company's cookies had an expiration date of 2038. In 2007, the company announced that they would now expire after two years, although the cookie is renewed every time a user uses a Google service.
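The effect of a cookie that is renewed on every visit can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Google's implementation: the two-year lifetime is approximated as 730 days, and the function name and visit dates are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: "two years" is approximated as 730 days.
COOKIE_LIFETIME = timedelta(days=730)

def track_cookie(visits):
    """Return (expiry, survived) for a cookie renewed on every visit.

    expiry   -- expiry time set by the most recent visit
    survived -- True if no gap between visits exceeded the lifetime,
                i.e. the same cookie lasted through all visits
    """
    expiry = None
    survived = True
    for visit in sorted(visits):
        if expiry is not None and visit > expiry:
            survived = False  # the old cookie lapsed; a fresh one would be set
        expiry = visit + COOKIE_LIFETIME
    return expiry, survived

# Hypothetical user who visits once a year.
yearly_visits = [datetime(2007, 7, 1), datetime(2008, 7, 1), datetime(2009, 7, 1)]
yearly_expiry, yearly_survived = track_cookie(yearly_visits)
```

Under these assumptions, a user who returns even once a year never lets the cookie lapse, so the two-year limit only takes effect for people who stay away for longer than the full lifetime.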
Mr Brandt was also positive about the announcement: "The EU is way ahead of the US in terms of data retention and privacy," he told the EUobserver.
"This is a move in the right direction, but even six months is too long to retain user data. Thirty days should be sufficient for any service, even targeted advertising."
Vast amounts of information
Search engines hold a vast amount of information about us. In 2006, AOL, an internet service provider, accidentally published a sample of queries and results of some 650,000 users over a three-month period.
Although AOL had replaced the names of the users with numbers, journalists found that these results could often be traced to individual users, not only because of so-called vanity searches (people searching for information about themselves) but also by combining several queries from a single user.
Mr Brandt worries that such valuable data is simply too important or too profitable for companies to resist selling it or handing it over to state agencies such as the police or security services.
"Search engines should be treated like any other public utility such as the telephone company or electric company—highly regulated."
Ultimately, Mr Brandt believes that, like libraries, search engines should be publicly run.
"Public libraries—and Google is the modern equivalent—don't keep personal data based on all the books you borrow and then sell it to advertisers."