Why the world needs Cliqz
Why is it important to change the way data is collected from users, and how can it be done? The tech community found answers to these and other questions at a Meetup in Munich.
Last Thursday, we invited the tech community to our Munich offices for a Meetup with the motto “Why the world needs Cliqz”. In a relaxed atmosphere, visitors could learn about the products, projects and philosophy of Cliqz. The focus was on the technologies we use in the Cliqz products, why we use them and how they work from a technical perspective. Over delicious food and cool drinks, there was also ample opportunity to exchange ideas.
Did you miss the event or would you like to see all the talks again? No problem! You can find a recording of the livestream on our Facebook page.
In his introductory talk, Jean-Paul Schmetz, founder and CEO of Cliqz, first looked back to the days when the World Wide Web was still “open, anonymous and – in a good way – ugly”. Over the years there have been various dominant browsers, ranging from Mosaic to Netscape, Internet Explorer, Firefox and Chrome. So why should it not be possible to overcome Chrome’s current dominant position?
Google owes this position primarily to the fact that they were the first to realize the enormous business potential behind search and that a browser is required because it is the main distribution channel for search. To protect their search from outside influences, Google built Chrome. Basically, all of their services and products serve only one purpose: to defend their search and thus their main revenue stream.
When we at Cliqz started developing a search engine, we also quickly discovered that search needs a browser in order to avoid being dependent on partners. This is why Cliqz was soon available as a successful Firefox extension, a fork of Firefox and a mobile browser for Android and iOS. We also knew that we needed web traffic data as seed capital to build a search index. “However, we were surprised and shocked that everything about everyone is on sale, when you decide to spend a meaningful amount of money to buy data,” says Schmetz.
This horrifying discovery led to the decision to equip the Cliqz Browser with an AI-based anti-tracking technology and to block all extensions to protect our users’ data and privacy. Because we put the user first in everything we do, Cliqz is completely different from the companies that dominate the Internet today and design the web according to their interests and those of the advertising industry.
Cliqz product manager Oleksandra Karpovych illustrated how powerful corporations like Google, Facebook, Apple, Amazon and Microsoft have become at the beginning of her presentation “Designing a different kind of ecosystem”. At $3.76 trillion, the market capitalization of the “Big Five” now exceeds Germany’s gross domestic product of $3.68 trillion. Only the United States, China and Japan have a higher GDP. The search market has a total value of $92.4 billion, of which Google alone accounts for around 90 percent. Google achieved such a dominant position within the market by tying users to its ecosystem with supposedly free services. In the end, however, users pay with their private data, which is worth a lot of money to corporations because it is used to create detailed profiles for advertising purposes.
At Cliqz we of course also want to earn money, but in an ethical way. We have developed a business model that respects the privacy of our users. Unlike Google and others, Cliqz does not store user data centrally in server-side profiles. Instead, we have moved data aggregation entirely to the client side. This means that all personal data always remains locally on the device and under the control of the user. We only store anonymous statistics on our servers, which cannot be used to identify individuals or create profiles.
This approach is the foundation of all Cliqz products and features (e.g. our anonymous quick search engine) as well as our GDPR-compliant business model MyOffrz. The latter displays relevant offers to the user directly in the browser. It is the first service that brings together interest-based targeting with consistent data privacy and protection.
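To make the idea of client-side targeting more concrete, here is a minimal sketch in Python. It is not the MyOffrz implementation: the class `LocalInterestProfile`, the function `select_offers`, the category labels and the visit threshold are all hypothetical names and values chosen for illustration. The sketch assumes that every client receives the same list of offers and decides locally which ones are relevant, so the interest profile never has to leave the device.

```python
from collections import Counter


class LocalInterestProfile:
    """Hypothetical on-device profile; nothing in here is sent to a server."""

    def __init__(self):
        self._category_visits = Counter()

    def record_visit(self, category):
        # Count a page visit for a coarse interest category, locally only.
        self._category_visits[category] += 1

    def is_interested_in(self, category, min_visits=3):
        # Assumed rule: a few visits to a category signal interest.
        return self._category_visits[category] >= min_visits


def select_offers(profile, offers):
    """Filter a generic offer list on the client using the local profile."""
    return [offer for offer in offers if profile.is_interested_in(offer["category"])]


profile = LocalInterestProfile()
for category in ["travel", "travel", "shoes", "travel"]:
    profile.record_visit(category)

offers = [
    {"id": "offer-1", "category": "travel"},
    {"id": "offer-2", "category": "shoes"},
]
selected = select_offers(profile, offers)
```

In this sketch the server only needs to know which offers exist, not who is interested in them.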
Protecting our users’ privacy is also at the heart of Cliqz’s and Ghostery’s anti-tracking technology, which prevents third-party trackers from following users across the web to monitor their browsing habits. In his presentation “Data-Driven Anti-tracking”, Sam Macbeth, software engineer on the Cliqz Privacy Team, gave deeper technical insight into tracking methods and various anti-tracking measures.
Simple countermeasures such as blocking third-party cookies often fall short and also cause site breakage. Cliqz Anti-Tracking therefore follows a heuristic approach: it filters out data values that could identify the user, overwrites them with a generic placeholder, and sends the sanitized data on to the trackers. This method has the advantage of reducing site breakage and working independently of the tracking method used. It also protects against fingerprinting and new tracking methods. You can find detailed information about Cliqz’s anti-tracking technology in a Techblog article.
With the website WhoTracks.me, Cliqz and Ghostery offer a transparency tool for online tracking. It provides structured information on tracking technologies, market structure and data-sharing on the web. There you will find answers to questions such as which websites spy on you the most, which trackers are the most common, and which companies are behind them. The WhoTracks.me data is freely available under a Creative Commons license and includes statistics on over 1000 trackers, 1800 websites and 1700 tracker domains.
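The placeholder idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual Cliqz Anti-Tracking code: the placeholder string, the length threshold and the `safe_values` set (values assumed to be harmless because many users send them) are hypothetical, and the real heuristics are more involved than a length check on query parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

PLACEHOLDER = "anonymized"  # hypothetical generic placeholder
MIN_TOKEN_LENGTH = 8        # assumption: short values rarely identify a user


def looks_identifying(value, safe_values):
    """Toy heuristic: long values not known to be shared by many users."""
    return len(value) >= MIN_TOKEN_LENGTH and value not in safe_values


def sanitize_tracker_url(url, safe_values):
    """Overwrite suspicious query values in a third-party request URL."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    cleaned = [
        (key, PLACEHOLDER if looks_identifying(value, safe_values) else value)
        for key, value in pairs
    ]
    return urlunsplit(parts._replace(query=urlencode(cleaned)))


sanitized = sanitize_tracker_url(
    "https://tracker.example/pixel?uid=a1b2c3d4e5f6&lang=en", safe_values=set()
)
```

Because the request still reaches the tracker with all of its parameters present, the page is less likely to break than if the request were blocked outright.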
Our software engineers Alexander Komarnitskiy and Naira Sahakyan talked about the peculiarities of the Cliqz Browsers, which are available for free download for Windows, Mac, Android and iOS. All versions are based on open source browser technologies: like the desktop versions, the iOS app is based on Firefox. The same will apply to the Android version, which currently uses Lightning as a technical basis. The source code of all Cliqz applications is also open source and publicly accessible on GitHub.
For added value, the Cliqz Browsers offer various privacy and security functions (anti-tracking, anti-phishing, HTTPS Everywhere) as well as some useful additional features. The latter include an ad blocker, a video downloader, an automatic forget mode (incognito mode), the P2P-based synchronization feature Connect as well as a new tab page with direct links to most visited or favorite websites and curated news. Of course, Cliqz’s anonymous quick search engine is also built in by default. Once you start typing a search query or website address, quick search displays instant search results right below the entered query. On mobile devices, results are displayed in real time on intelligently designed cards, saving you time and data volume: simply type a query, select a suggested website or swipe left for more results.
Last but not least, Alexandra Konrad, software engineer on the Cliqz Search Core Team, gave an overview of the functionality and the technical basis of the Cliqz search engine. The guiding principle in the development of the independent search engine was and is to respect users and their privacy, and thus to do without tracking. This approach differs fundamentally from that of today’s dominant search engines. Google, above all, collects vast amounts of user data via its search and other services, which it merges and processes into detailed profiles. Based on these user profiles stored on servers, it displays personalized results and search advertising.
Cliqz, on the other hand, does not use such profiles stored on servers, as they are simply not necessary to present the most relevant, context-sensitive results to users. This is because the order of results can be optimized by applying basic statistics and a large number of additional ranking factors. To put it very simply: the URL with the most hits is the best result for a certain search query. Even a selection of website suggestions tailored to individual users is possible. And it can all be done on the client side. This approach allows personalized results while respecting privacy.
Rather than relying only on manually tuned algorithms, Cliqz search increasingly uses machine learning techniques to search the compiled index, which is approximately 6 TB in size, for the most suitable results for a query. In 90 percent of cases, the results are available within 100 milliseconds. The search backend currently runs on three servers in parallel and can answer up to 200 queries per second. The system is highly scalable: up to 1500 requests per second can be processed on seven servers.
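The “most hits wins” rule of thumb can be sketched as follows. This is a deliberately simplified illustration in Python, not the Cliqz ranking code: the function names, the input format (anonymous pairs of query and visited URL) and the example data are assumptions, and the many additional ranking factors mentioned in the talk are left out.

```python
from collections import Counter, defaultdict


def build_hit_counts(observations):
    """Aggregate anonymous (query, url) pairs into per-query hit counts."""
    counts = defaultdict(Counter)
    for query, url in observations:
        counts[query.strip().lower()][url] += 1
    return counts


def top_results(counts, query, k=3):
    """Return the k URLs with the most hits for a query, best first."""
    ranked = counts.get(query.strip().lower(), Counter()).most_common(k)
    return [url for url, _ in ranked]


# Hypothetical observations; no user identifier is attached to any pair.
observations = [
    ("weather munich", "https://weather.example/munich"),
    ("weather munich", "https://news.example/weather"),
    ("weather munich", "https://weather.example/munich"),
    ("Weather Munich ", "https://blog.example/rain"),
    ("weather munich", "https://news.example/weather"),
    ("weather munich", "https://weather.example/munich"),
]
counts = build_hit_counts(observations)
best = top_results(counts, "weather munich")
```

The point of the sketch is that the counts are aggregated per query, not per user, so ranking works without any profile.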
The Cliqz quick search always displays the three most relevant results. As this is not sufficient for some search queries, we are currently working on a conventional search engine results page (SERP) with more results. Also in development is a unified index that supports multiple languages to provide the most appropriate results for each country.
Where data is collected in vast amounts and centralized on servers, there are always (unwanted) side effects with a negative impact on privacy. In the age of the Internet of Things this is a huge problem. Sensors and processors are already built into all kinds of everyday objects, e.g. cars, refrigerators or clothing. It is unacceptable for all these minicomputers to simply phone home, as trackers do on almost every web page.
“If we cannot fix this problem in the browser, there is little chance that we will fix it in the real world. And this would have horrifying consequences,” says Cliqz founder Jean-Paul Schmetz. “It is necessary to try to fix the problems in the browser (call it OS if you are in the mobile world) because the solution will most likely be the same in the IOT world.” But don’t expect these solutions to come from the companies who benefit most from the problem (like Google and Facebook), nor from the people who depend on them for revenue. Cliqz’s approach is client-side data aggregation, where all personal data remains on the device, owned and controlled by the user.