Human Web is software integrated into the browsers and browser extensions of Cliqz and Ghostery. Its purpose is to compile statistics that fuel our products. We use these collectively compiled statistics to find out which websites best match search queries and to identify bogus websites and trackers. As a community, Human Web users thus make the Internet a better place. Participation in the Human Web is voluntary. The privacy-by-design architecture of the Human Web technology ensures that the statistics contain no data about individual users. User anonymity is fully guaranteed at all times, and tracking is ruled out. This can be verified because the Human Web’s software code is public (open source).

Conventional search engines take a purely technical approach: they mainly evaluate data about the content, structure and interlinking of websites.

Cliqz, on the other hand, taps into the wisdom of the crowd with Human Web. Our search engine works with strictly anonymous, collectively gathered statistical data on search queries and page views to assess the relevance of websites. We also measure, at a statistical level, the total number of clicks that website recommendations receive in the Cliqz quick search. This allows us to generate hit rankings.

The Human Web statistics contain no data that could be used to identify individual users or devices. The data is not only strictly anonymous but is also recorded in a way that prevents de-anonymization. This guarantees that the Human Web never reveals anything about the web searches and website visits of individuals. Tracking is thus strictly ruled out.

The user’s anonymity always remains assured

To ensure the complete anonymity of the individual user at all times, the Human Web is designed so that no conclusions about individuals can be drawn by linking different data points. Two core components are responsible for this: a structured framework for collecting data and a proxy network.

The former ensures that every data point a user contributes is evaluated only as a single, self-contained event. This makes it impossible to link data from several search queries or multiple page visits. Nor can this information be combined with any of the user’s personal data, such as an e-mail address. To protect your privacy, website visit statistics are always kept strictly separate from statistical data on search queries. We do not save session IDs or timestamps accurate to the second or minute. We also do not record any information from visited pages that require any form of login. Our Human Web technology automatically filters out of URLs all confidential or personal data that would allow individuals to be identified (e.g. twitter.com/username). To do this, it uses various heuristics as well as procedures based on machine learning. This prevents such information from reaching our servers in the first place.
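The filtering described above can be illustrated with a minimal sketch. It is not the actual Human Web code: the rules, the names `sanitize_url`, `build_message` and `PROFILE_HOSTS`, and the hour-level timestamp are assumptions chosen only to show the idea of stripping risky URL parts on the device and sending each event without a session ID or a fine-grained timestamp.

```python
import re
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical heuristics; the real Human Web rules are not given in the text.
EMAIL_LIKE = re.compile(r"[^/@\s]+@[^/@\s]+\.[^/@\s]+")
LONG_NUMBER = re.compile(r"\d{6,}")  # long digit runs often encode IDs
# Assumed list of hosts whose path typically starts with a user name.
PROFILE_HOSTS = {"twitter.com"}


def sanitize_url(url: str) -> Optional[str]:
    """Return a reduced URL that is safe to report, or None to drop it."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme not in ("http", "https") or not host:
        return None
    if parts.username or parts.password:
        return None  # credentials embedded in the URL
    bare_host = host[4:] if host.startswith("www.") else host
    if bare_host in PROFILE_HOSTS:
        return f"{parts.scheme}://{host}/"  # keep the site, drop the user name
    safe_segments = []
    for segment in parts.path.split("/"):
        if not segment:
            continue
        if EMAIL_LIKE.search(segment) or LONG_NUMBER.search(segment):
            break  # drop this segment and everything after it
        safe_segments.append(segment)
    # The query string and fragment are never included.
    return f"{parts.scheme}://{host}/" + "/".join(safe_segments)


def build_message(url: str, now: datetime) -> Optional[dict]:
    """Build one self-contained event: no session ID, no user ID."""
    clean = sanitize_url(url)
    if clean is None:
        return None
    # Assumption: hour-level resolution, coarser than seconds or minutes.
    hour = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H")
    return {"url": clean, "hour": hour}
```

For example, `sanitize_url("https://twitter.com/jane_doe?lang=en")` yields `https://twitter.com/`, so the user name never leaves the device. Because each message carries no identifier shared with other messages, two messages from the same user cannot be joined on the server side.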

Anonymize Human Web
Only URLs that return the same content whether or not the user is logged in are regarded as “public”, and only data from these URLs is sent to the Human Web. URL portions that could contain personal information are removed prior to transmission.
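One way to picture the “public” check is to load the same URL twice, once with the user’s session and once without, and compare the results. The sketch below assumes exactly that; the `Fetcher` type and the `is_public` function are hypothetical names, and the fetcher is passed in as a parameter so that the example makes no claim about how pages are actually retrieved.

```python
from typing import Callable

# Hypothetical fetcher: takes a URL and a flag saying whether to send the
# user's session (cookies), and returns the page body.
Fetcher = Callable[[str, bool], bytes]


def is_public(url: str, fetch: Fetcher) -> bool:
    """Treat a URL as public only if it looks the same without a login."""
    with_session = fetch(url, True)
    without_session = fetch(url, False)
    # Assumption: an exact byte comparison. Real pages often differ in
    # small ways, so a real check would likely need to be more tolerant.
    return with_session == without_session
```

With this check, a news article that renders identically for everyone would pass, while an account page that shows a login form to anonymous visitors would be rejected and never reported.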

Human Web data is transmitted in fully encrypted form via a proxy network. This ensures that we know nothing about the user when the data reaches us, since the proxy network removes the user’s individual IP address. We receive only the IP address of the proxy network and cannot identify users from it. The proxies themselves cannot read the encrypted information or find out anything about it. Sender and content are therefore kept completely separate. This makes it impossible both for us and for third parties to connect users and user data in any way. Subsequent de-anonymization is ruled out by the way we record and save data.
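The separation of sender and content can be modeled in a few lines. This is a toy model only: `toy_cipher` is a deliberately simple placeholder for real encryption (it is not secure, and a real system would use public-key cryptography so that clients hold no secret), and the names `Packet`, `client_send`, `proxy_forward` and `collector_receive` are hypothetical. The point is what each party can see.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """XOR with a SHA-256-derived keystream. NOT secure; a placeholder only."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    # Applying the same function again with the same key restores the input.
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass(frozen=True)
class Packet:
    source_ip: str  # address the receiver sees as the sender
    payload: bytes  # always encrypted while in transit


def client_send(message: str, client_ip: str, collector_key: bytes) -> Packet:
    # The client encrypts for the collector before anything leaves the device.
    return Packet(client_ip, toy_cipher(message.encode("utf-8"), collector_key))


def proxy_forward(packet: Packet, proxy_ip: str) -> Packet:
    # The proxy sees the client's IP but only an opaque payload; it has no key.
    return Packet(proxy_ip, packet.payload)


def collector_receive(packet: Packet, collector_key: bytes) -> Tuple[str, str]:
    # The collector can read the payload but sees only the proxy's IP.
    plaintext = toy_cipher(packet.payload, collector_key).decode("utf-8")
    return packet.source_ip, plaintext
```

In this model the proxy knows who sent something but not what, and the collector knows what was sent but not by whom. Neither party alone holds both pieces.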

Cliqz Proxy Network Flow
The IP address is concealed in the proxy network and Cliqz only sees data like search queries without any personal reference.

Optimum transparency

Like all functions integrated in the Cliqz browser, the Human Web is open source. This means that anyone with the relevant technical expertise can view the client-side software code on GitHub and check it themselves. Our software, infrastructure and data collection methods are also regularly checked both internally and externally. External assessments have included participation by TÜV Saarland, Mozilla, researchers at Princeton University and RedTeam Pentesting, among others. In addition, we have built a transparency cockpit into our browser that provides a real-time overview of the data transmitted from your device to Cliqz.

When you participate in the Human Web, you not only help us generate better website recommendations in the Cliqz search but also contribute to a safer Internet in general. Participation is and remains voluntary, however. Should you decide against transmitting anonymous statistics about your searches and website visits, you can opt out of the Human Web at any time. To do this, open the Control Center in the browser (Q menu to the right of the URL bar) and set the option for “Human Web” under “Search Options” to “Disabled”. On mobile devices, you will find the option under “Human Web” in the browser settings.

Some web services, such as search engines, rely on very large volumes of data to operate reliably. However, big data and data protection are not necessarily irreconcilable. What matters is how personal data is handled. In contrast to most other Internet companies, Cliqz restricts itself exclusively to anonymous statistical data with absolutely no personal reference. The Human Web proves that complex systems such as a web search can be built and operated without compromising the privacy of users.

Together towards a safer Internet

Human Web technology was initially developed for indexing the Internet for the Cliqz quick search engine. Today, Human Web is also used to obtain statistics on usage that allow us to identify trackers and phishing websites. Users who contribute collectively and completely anonymously to the statistics are therefore helping to protect everyone’s privacy and ultimately making the Internet safer for all of us.