WordPress Telemetry Proposal Addresses Long-Standing Privacy Concerns as GDPR Compliance Deadline Looms

At the end of October 2016, Morten Rand-Hendriksen created a proposal on WordPress trac for adding telemetry to core, an opt-in feature that would collect anonymized data on how people are using the software. He proposed that the new feature be displayed on first install or update, disabled by default in the admin with a control available under Settings->General. One option he suggests is shipping it as a plugin that auto-installs on opt-in and auto-uninstalls on opt-out. He also identified a few examples of core data that could be tracked, including number of themes and plugins installed, frequency of use of specific views (Settings, Customizer, etc), current version, update status, locale, and language.
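As a rough illustration of the opt-in behavior described above, the following Python sketch builds a telemetry payload only when the site owner has enabled collection, and strips the one identifying field before anything would be sent. It is not taken from the trac ticket: the class names, field names, and the `build_payload` helper are all hypothetical, chosen to mirror the example data points in the proposal.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class TelemetrySettings:
    # Opt-in: collection stays off until the site owner enables it.
    enabled: bool = False


@dataclass
class SiteSnapshot:
    # Hypothetical field names mirroring the examples listed in the proposal.
    core_version: str
    update_available: bool
    locale: str
    language: str
    theme_count: int
    plugin_count: int
    view_counts: dict = field(default_factory=dict)  # e.g. {"settings": 12}
    site_url: str = ""  # identifying, so it must never leave the site


def build_payload(settings: TelemetrySettings, snapshot: SiteSnapshot) -> Optional[dict]:
    """Return the anonymized payload, or None when the owner has not opted in."""
    if not settings.enabled:
        return None
    payload = asdict(snapshot)
    payload.pop("site_url")  # drop the identifying field before sending
    return payload
```

Returning `None` for the default settings keeps the "disabled by default" rule in one place, so no caller can send data by accident.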

“WordPress prides itself on being an application built by the user for the user,” Rand-Hendriksen said. “The problem is with the popularity and reach of WordPress today, the distance between the WordPress 1% (or even .1%) and the average user is becoming so vast we (the people who contribute to WordPress core) know almost nothing about the actual people who use WordPress or how they use the application.”

During the WordPress 4.7 development cycle, Rand-Hendriksen said he was involved in several conversations where participants assumed the use of features without any data to back up their opinions. He contends that WordPress contributors do not have the necessary data to know how users are interacting with the application and its features.

“The general argument was that based on the 80/20 rule, certain features should be added while others should be removed,” Rand-Hendriksen said. “I kept bringing up the well-known fact we don’t have a clue what features 80%, or even 20%, of WordPress users actually use so any claim of validity in the 80/20 rule is guesswork at best.”

His proposal states that all the data collected should be public for transparency and also made available to end-users in the admin and on WordPress.org.

The idea has had a few months to marinate and has generated some discussion about what a prototype would entail. Core committer Ella Van Dorpe created an experimental wp-data standalone plugin for tracking a few simple interactions with the editor. Participants in the discussion recommended creating an Elasticsearch/Logstash setup for storing the data, technologies that the WordPress.org systems team have deployed before.

“I think a good summary is that there are a lot of hurdles in the way and currently no one has time to work on it,” Greg Brown, a Data Wrangler at Automattic, said in a followup discussion on the ticket three weeks ago. “Ultimately, I think the biggest blocker is getting someone with the time, inclination, and persistence to work on this. Getting it deployed onto .org is the right thing to do eventually, but I suspect it will take quite a while.”

WordPress lead developer Dion Hulse confirmed that WordPress is already tracking many of these stats and that creating a prototype on WordPress.org infrastructure would be the best option forward.

“It would also be valuable to see how our existing stats system can complement or be replaced by the proposal here though,” Hulse said. “I mention this as most of the stats from the original description are already tracked, just not exposed in any form. The only new thing mentioned here is the Frequency of use of specific views (Settings, Customizer, etc) and transparency part (which would still probably only be anonymized summaries, not exact data).”

WordPress Telemetry Project Provides a Solution to Long-Standing Privacy Concerns

Moving WordPress’ current data tracking into a more transparent opt-in feature would also provide a solution to some long-standing privacy concerns raised by contributors in a six-year-old trac ticket. WordPress tracks the number of blogs and users in a given installation, along with the installation URL in the headers, in order to facilitate update requests that may become problematic, particularly in the case of large multisite installations.

“Even if a user knows that some data needs to be passed for a version check of core, plugins, or themes, the amount of data passed to remote is obviously more than needed to do the version check,” one contributor commented on the ticket. “But users should be made aware upfront so they can freely decide on their own if they want to instead of being forced to support the project with their usage-data. They could be offered an opt-in to do so.”

“The number of registered users I have on my site tied to the URL that is sent with tracking request gives out vital information on how well my business could be doing – information that is mine and mine only,” WordPress plugin developer Danny van Kooten said. “At the very least we could make it very clear that WordPress is tracking this information and what exactly it is doing with it. I really do not think there is any excuse for that.”

Developers can filter the data to satisfy their privacy concerns but it is somewhat inextricable from the update process for larger multisite installations. It’s also too big of a technical hurdle for most regular users who would be better served by a simple UI allowing them to opt out of data collection.
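To make the idea of filtering concrete, here is a minimal Python sketch that removes usage counts and the site URL from an outgoing update-check request before it is sent. In WordPress itself this would be written in PHP against core's filter hooks. The request layout and key names below (`blogs`, `users`, `site_url`) are assumptions for illustration only, not the actual fields core sends.

```python
import copy

# Hypothetical keys: the real request layout is not specified in this article.
SENSITIVE_BODY_KEYS = {"blogs", "users"}
SENSITIVE_HEADER_KEYS = {"site_url"}


def strip_usage_data(request: dict) -> dict:
    """Return a copy of an update-check request without counts or the site URL."""
    cleaned = copy.deepcopy(request)  # never mutate the caller's request
    for key in SENSITIVE_BODY_KEYS:
        cleaned.get("body", {}).pop(key, None)
    for key in SENSITIVE_HEADER_KEYS:
        cleaned.get("headers", {}).pop(key, None)
    return cleaned
```

The version string is left in place because the update check presumably still needs it. Only the data that goes beyond a version check is dropped.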

Rand-Hendriksen’s WordPress telemetry proposal gives the project an opportunity to formalize what data is being collected, state the purpose behind it, and allow users to choose if they want to be included.

Europe’s General Data Protection Regulation (GDPR) May Push WordPress Towards More Transparent Data Collection

Progress on both the Telemetry project and the ticket regarding privacy concerns has been slow. Neither seems to be a priority among contributors, but Europe’s General Data Protection Regulation (GDPR) may provide the impetus needed to push WordPress towards more transparent and responsible data collection.

The GDPR is an overhaul of data protection law in Europe with far more stringent requirements than the previous laws. It requires full disclosure for any data collection and standardized privacy notices to help users understand where and how the data is being used. Consent to have data collected must be confirmed and users have the right to access their own data. It also includes the right of erasure or “the right to be forgotten,” which allows users to remove their data from the web. The GDPR goes into effect in May 2018.

Heather Burns, a digital law specialist who consults and speaks extensively on internet laws and policies, encouraged WordPress contributors to frame the discussion regarding privacy concerns in terms of working towards compliance with a specific framework.

“For the purposes of this discussion, core should work to the GDPR standard for two reasons,” Burns said. “The first reason lies in cultural differences. The US does not have a single overarching data protection and privacy regulation, unlike Europe, where we have this data protection regime which applies to all personal data regardless of use, format, or sector. So GDPR gives developers – even those outside the EU – a robust, healthy, and very tough set of standards to follow. Given what we have seen coming out of the White House in the past week, GDPR also provides as good a starting point as any for defensive user protection.

“The second is that GDPR is extraterritorial. It applies to the personal data of anyone in Europe regardless of where the online service is located. If your business is in the US or Australia or Israel but you have European users, you have to protect their data to European GDPR standards.”

PricewaterhouseCoopers recently surveyed 200 US-based multinational companies with more than 500 employees and found that 77% plan to spend $1 million or more on GDPR compliance. More than half of those surveyed cited GDPR readiness as the highest priority on their data-privacy and security agendas.

The hefty penalties of noncompliance are one of the driving factors behind American companies spending millions of dollars on satisfying the requirements of this new European regulation.

“GDPR is a complete overhaul of its dialup-era (1995) predecessor and one of the areas that has been beefed up is its teeth,” Burns said. “Businesses which are found to be in noncompliance by a European member state’s data protection regulator, whether that is your small app studio all the way up to Automattic, could face penalties of up to 4% of the business’s global annual turnover. Now there’s some solid context for the philosophical discussion.”

However, not everyone is convinced that the GDPR will be beneficial to consumers. Kitty Kolding, CEO and president of Infocore Inc, an international company that specializes in sourcing market data, told ExchangeWire that she believes the GDPR will undermine “the sanctity of consumers’ data privacy and security” and hobble marketing and advertising worldwide.

She contends that provisions like the “right to be forgotten,” which require customer data to be retained beyond the time that it’s in active use, will make that data more susceptible to hacking. Additionally, the enforcement body for the new legislation claims authority over companies, with the right to search and seize records, without any oversight or appeals.

“Every company everywhere that handles data on EU citizens is also automatically subject to this group’s absolute power – though it’s anybody’s guess how the EU believes they can enforce such a broad mandate outside its own borders,” Kolding said.

Currently, only two trac tickets mention the GDPR so it’s not yet clear how WordPress core will respond to the requirements of the new legislation. Burns recommends that WordPress core contributors go through the process of conducting a privacy impact assessment to determine the right way forward.

Regardless of WordPress’ response, companies and organizations that depend on the software will need to assume the responsibility of their own compliance, as these requirements extend far beyond core. The GDPR applies to anything added into a website or app that collects users’ data. For example, many contact form plugins store submissions inside the WordPress database and site owners will want to re-examine how users are notified of this.

“One of the main changes with GDPR is called the accountability principle,” Burns said. “Businesses collecting personal data must be completely transparent and accountable over what data they are collecting, how they are storing it and where, who it is being passed to (such as third parties), who has access to it, and how long it is retained. Users also have the right to request that any data collected about them must be deleted.”
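The record-keeping Burns describes can be sketched for the contact form case mentioned earlier. The Python below is a hypothetical data model, not any plugin's real schema: each stored submission carries its purpose, recipients, and retention period, and two helpers handle a deletion request and the purging of expired records.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class StoredSubmission:
    # Hypothetical record for one contact form entry.
    email: str
    message: str
    collected_on: date
    purpose: str          # why the data was collected
    shared_with: tuple    # third parties it is passed to, if any
    retention_days: int   # how long it may be kept


def is_expired(record: StoredSubmission, today: date) -> bool:
    """True once the record has outlived its retention period."""
    return today > record.collected_on + timedelta(days=record.retention_days)


def erase_user(records: list, email: str) -> list:
    """Honor a deletion request: drop every record tied to this email."""
    return [r for r in records if r.email.lower() != email.lower()]


def purge_expired(records: list, today: date) -> list:
    """Keep only the records still inside their retention period."""
    return [r for r in records if not is_expired(r, today)]
```

Storing the purpose and retention period alongside the data is one way to answer the "what, where, who, and how long" questions without reconstructing them later.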

There’s no WordPress plugin that will instantly make a site GDPR-compliant. Drupal has a GDPR module that aims to make sure the site follows the guidelines and legislation set by the EU, but it doesn’t cover all requirements. Automating an assessment of privacy impact for a site using a CMS and potentially dozens of third-party extensions is a complex endeavor. This is one regulation that will require business owners to educate themselves and implement privacy practices that put users’ interests first.

With the deadline for compliance closing in, WordPress has an opportunity to re-evaluate how the project handles user privacy and take steps towards greater transparency. If contributors are looking into collecting more data to assist decision-making on features, as outlined in Rand-Hendriksen’s telemetry proposal, this project provides an avenue for working towards GDPR compliance. These privacy concerns are especially important to address when considering WordPress for government, healthcare, educational institutions, and other data-sensitive websites.

Burns views the GDPR’s compliance deadline as a fresh opportunity for WordPress to build better privacy structures and legal certainty using the regulation as a healthy baseline for all users.

“Everyone needs to be working in implementations for their own businesses and sites in any case ahead of deadline day, in addition to any changes that need to be made in the WP code,” Burns said. “It’s important to remember that GDPR compliance is not a tick box you can squeeze in next April. This is about your processes, your workflows, and your systems of accountability. Start now.”

16 Comments


  1. If it’s done correctly, I think it would improve WordPress. Now it’s hard for WordPress developers to get a clear idea of their users and how they use it. Like WordPress is somewhat confused whether it’s only a blogging platform or website builder. I’m really curious to find out how many people selected “your latest posts” from Settings – Reading as opposed to selecting a static page as front page.



  2. The reasoning given for progress being slow on this is a good example of the challenge facing open source projects with no leadership structure or formalized long term planning. Without structure, important but “boring” projects are often left by the wayside in favor of shiny new things that are more fun to work on. More than my original reasoning, the arguments Heather brings to the table make it pretty clear something must be done about this issue, and that may require prioritizing data anonymity over shiny new Customizer features. With great power, and 27% market share, comes great responsibility.



  3. I agree with Morten on everything except that creating this feature is “boring”. From my personal view it’s extremely exciting to create a data collection module including its server infrastructure. Its complexity, and the need to have access to the wordpress.org backend or at least to know the WP infrastructure, makes it hard for the average WP contributor to work on this, so it must be coordinated or done by a dev who is more deeply involved.

    It can be started as a single plugin, but on its own it cannot offer others any beneficial feature. So no one will start with it until someone from the core team fires the gun and says: “Do it and we do everything to add it into core soon.” So yes, more leadership is needed.

    I am pretty sure if the direction is clear someone would like to work on this.

    The discussion is starting again. So let’s hope that someone will jump in.



  4. These are actually just the things that are easier to discuss; the main pain points as far as privacy is concerned are the “out of the box” integration with Akismet and Gravatar. Akismet is not needed in core, disabling pingbacks and adding some captcha style fields to the form handle all the spam, and Gravatar…, well if an integration with external service is a must (why must it be in core?), then obviously integration with Facebook or Google makes more sense.

    but obviously, anybody that understands the issue will not even bother opening tickets as there is no chance there can be a decision that is in the power of the core team to decide, and only @matt can make such decisions.



  5. To clarify Ms Kolding’s questions:

    Part of the accountability principle, for organisations engaging in large-scale data processing, is to appoint a position known as the data protection officer. This is an individual with responsibilities for monitoring compliance with GDPR. One of their duties must be to act as the point of contact for the supervisory authority, meaning the data protection regulator, in the member state where the company has noted its European presence. Identifying which European country a business is accountable in, and registering accordingly, is also a requirement under GDPR for businesses large enough to merit a DPO. If a European citizen is not happy about the way that, say, Infocore is processing their personal data, they will be able to register a concern with the data protection regulator in the member state where Infocore must note its registration.

    That data protection authority is not some jack-booted platoon of “absolute power”. (Really, “search and seize”? This isn’t CSI Brussels.) DPAs are rather geeky professionals in an office who will ask the company to demonstrate their accountability principles and documented compliance. If the regulator is actually rather nice, like the UK’s ICO, they will work constructively with the company to lead them to healthy compliance without imposing penalties or fines. On that matter, penalties and fines are only imposed after the company has exhausted all of the options presented to them by the regulator, has refused to work with them, or has committed a preventable data breach of absolutely monumental stupidity (BPAS comes to mind.)

    GDPR is not the EU “enforcing a broad mandate outside its borders.” GDPR means companies doing business in Europe, collecting data on European citizens *in its borders*, become accountable to those people via the data protection regulators in the countries where those people live. Companies which are not happy with that accountability do not have to do business in Europe.

    To frame GDPR as some sort of evil government monolith interfering with the sovereignty of private businesses in other countries is to defend the abuses of privacy by the marketing and advertising industries which have largely inspired the data protection overhaul in the first place. As of 25 May 2018 those excuses will no longer wash.



    1. Could not agree more with Heather:

      “To frame GDPR as some sort of evil government monolith interfering with the sovereignty of private businesses in other countries is to defend the abuses of privacy by the marketing and advertising industries which have largely inspired the data protection overhaul in the first place. As of 25 May 2018 those excuses will no longer wash.”

      and Morten also nails it:

      “that may require prioritizing data anonymity over shiny new Customizer features. With great power, and 27% market share, comes great responsibility.”



    1. Like I explain in the ticket and in my original blog post on the topic, collecting user data like this is standard practice among software vendors because understanding how the end-user interacts with the application is vital for its longevity and to ensure human-centered design. Making an explicit choice not to collect such data is controversial and in my view problematic, especially for an application that claims to build solutions targeted at 80% of its users.



  6. We help plugin and theme developers to get those type of insights and usage tracking using Freemius Insights. Before Freemius, there was a designated tool for theme devs called PressTrends.

    I gave a talk at WordCamp Toronto about Next-Generation Data-Driven Plugin Development that discussed exactly that topic. Unfortunately, all the Toronto videos are still not on WordPress.tv.

    Our community is building products blindly, while other ecosystems like mobile apps track every user engagement to produce better products with greater UX. It’s about time we move forward.



  7. Totally agree that supporting GDPR is a good thing for WordPress, especially because so many organizations who have European constituents use WordPress.

    However, I was disheartened to read the following disingenuous assertion (at least related to US organizations) which the best I can determine is pro-GDPR spin and not the actual facts (emphasis mine):

    “If your business is in the US or Australia or Israel but you have European users, you have to protect their data to European GDPR standards.”

    I had a long Twitter discussion with Heather starting here and I did my best to get clarification but she did not seem to want to acknowledge that GDPR is currently only opt-in for US companies and that US companies are not otherwise legally required to follow GDPR.

    In general I am not against the GDPR and think that — all total — GDPR is a good idea. But I find it offensive when advocates of a policy (appear to do their best to) obfuscate facts and try to make people believe that their position is a legal requirement when in fact it is just an option.

    I know many people prefer “Alternate Facts” to be reality these days, but I still believe real facts matter.

    P.S. If you think following the GDPR should be a legal requirement for US organizations, then contact your Senators and Congresspeople and ask that they make it so. Until then, let’s stick to the facts.



    1. Sorry, Mike, you’re a great developer, but you’re not so hot as a lawyer!

      First, you confuse Privacy Shield (which is indeed opt-in) with the GDPR (which is not). In fact, Privacy Shield is supposed to be a vehicle for ensuring compliance with the GDPR. (Whether it actually does enough to achieve that is another question; its predecessor, Safe Harbor, did not). But the GDPR remains mandatory.

      Second, your points on Twitter didn’t get the definitive answer you wanted because you weren’t asking the right question. You kept asking about enforcement. But that’s not the same thing as whether the GDPR applies to US businesses with European users.

      Just because a motorist repeatedly exceeds the speed limit without getting caught doesn’t mean that there is no speed limit.

      In fact, there is no doubt that the GDPR does apply to US businesses with European users. That’s not a so-called “alternative fact,” but a real fact. (I do hope that every time someone doesn’t get an answer s/he wants, they don’t now resort to yelling “alternative fact.” That will get old really fast. And it’s quite misleading here.)

      There are many ways that the GDPR might be enforced against a US business, just as there are many ways that European governments might seek to collect VAT from US businesses that sell to customers in the EU. How they choose to do so is up to their imagination (not Heather’s).

      I know it’s nice to think that Americans are governed only by rules made somewhere within US boundaries. But, when they do something that affects people in other jurisdictions, the laws in those other jurisdictions often apply too.



      1. “Sorry, Mike, you’re a great developer, but you’re not so hot as a lawyer!”

        That is absolutely correct, I am not a great lawyer because I am not a lawyer! As an aside, I do appreciate the compliment on my development ability but I know that was not your reason to reply.

        That said, I am MORE than happy to learn that I was wrong on this. That is why I asked Heather so many questions. My sole interest here is to clarify the facts, in large part for my own needs but in small part to have a definitive place to point others on the subject since I know advocates for GDPR will debate the subject here until we arrive at the real facts of the matter.

        “… But the GDPR remains mandatory.

        “Second, your points on Twitter didn’t get the definitive answer you wanted because you weren’t asking the right question. You kept asking about enforcement. But that’s not the same thing as whether the GDPR applies to US businesses with European users.

        “Just because a motorist repeatedly exceeds the speed limit without getting caught doesn’t mean that there is no speed limit. In fact, there is no doubt that the GDPR does apply to US businesses with European users.”

        I probably worded my comment poorly.

        But based on what I currently understand I believe the ramifications are still the same. You and I are possibly debating the difference between what the law states and where the EU has jurisdiction; the latter of which is really what effectively matters. Let me explain by analogy.

        Law is based on precedent, at least in most of the US (not Louisiana!) and most of Europe (I assume). In the USA (at least) there is a concept of "Nexus" related to the need to collect sales tax on behalf of a US state.

        So if someone runs a business in the State of New York and sells consumer items then they need to collect sales taxes when they sell those items to New York state residents, even if they sell online. If they sell items retail via a store located in New York state then they must also collect sales tax for the State of New York anytime someone visits their shop and buys items, even if the buyers are not residents of New York (this differs from European VAT, from what I understand.)

        Now the State of California may decide to pass a law that anyone who is a California resident must pay California sales tax on anything they buy. But sales taxes are collected and enforced on the merchant, not the user, so this law effectively tells a New York-based company that they must collect sales tax on all of their online sales to residents of California. And that applies legally if the New York-based company has a store in California.

        But the problem for California is when this New York company does not have a Nexus in California where Nexus is defined as having a "physical presence." So if the New York company only has a physical presence in New York state then the State of California is left wanting.

        YES, this hypothetical California law is clear; according to this hypothetical the New York company must collect sales tax on California residents and then pay that tax to the State of California. But without Nexus, the State of California has no jurisdiction over the New York-based company thus the law is effectively moot. If the US Congress ever passed a federal law that required organizations to collect sales tax for all states then yes, a New York-based company would need to collect for California. But that has not (yet?) happened.

        It was with exactly this knowledge that I was suspicious of the assertion that GDPR effectively applies to all organizations, even those outside of EU's borders. I ran a catalog mail order retailer in the states for over 10 years and while IANAL, we sure spent a lot of money for legal advice over those 10 years, so I do know a fair bit about inter-jurisdictional law as long as precedent still holds.

        (emphasis mine):

        “There are many ways that the GDPR might be enforced against a US business, just as there are many ways that European governments might seek to collect VAT from US businesses that sell to customers in the EU. How they choose to do so is up to their imagination”

        Can you name just one way?

        In answering your reply I discovered that it was not googling "nexus" that revealed the answers so much as googling "jurisdiction." So it comes to this: does the EU have jurisdiction to hold organizations outside of the EU accountable to the GDPR law?

        YES the EU can claim its law applies, but can the EU actually enforce the law outside the EU? As a nonsensical analogy, I can assert you owe me money for the time it took me to reply to your comment here, but does that actually mean you owe me money just because I asserted it? No, I have no authority over you. I can send you an invoice, but who would enforce my invoice on you? No one. At least no one using the law.

        Anyway, according to the American Bar Association on GDPR (emphasis mine):

        Data protection authorities will also be able to enforce penalties against the local representative of a non-EU data processor or controller, effectively giving those authorities indirect jurisdiction over non-EU data processors. (However) The GDPR has no means of enforcing penalties against non-EU processors who fail to appoint a local representative, which may lead some U.S. data processors to consider whether appointing a local representative simply invites more risk.

        And more on those "local representatives" (since, my questions really revolve around small and micro-businesses where compliance would comprise too large a part of their budget to allow the company to be viable, emphasis mine):

        Companies that only engage in “processing which is occasional, does not include, on a large scale, processing of special categories of data as referred to in Article 9(1) or processing of data relating to criminal convictions and offences referred to in Article 9a, and is unlikely to result in a risk for the rights and freedoms of individuals, taking into account the nature, context, scope and purposes of the processing” are not required to appoint local representatives.

        So even per the way the GDPR was written it appears that GDPR in fact does not apply to all non-EU businesses, which even exceeds the original point I was trying to make.

        Again, I don't have an issue with the GDPR per se. I have an issue when governments or organizations or people try to assert authority over others when they have no legal basis to assert that authority.



      2. Mike,

        The problem with your comment is that you are taking a concept from American law (especially that of nexus) and treating it as if it were a universal legal doctrine to be applied everywhere in the same way.

        Precedent, for example, is not a cornerstone of the law in most of continental Europe. It’s a device of common law systems, but is typically unknown in the civil law world. (There are exceptions to that general statement, but not many.)

        Nexus is typically the way that jurisdiction is established in the US. But it’s arguable even within the US in some areas of the law (e.g. bankruptcy). It’s certainly not a universal test.

        In fact, the US is just like the EU in that, in many instances, it asserts jurisdiction (even over foreign entities) when the interests of its citizens are at stake. That’s precisely what the EU is doing here.

        There doesn’t need to be a “local representative” for enforcement action. For example, assets held in the EU that are traceable to the relevant US person or entity (e.g. a vacation home) can still be seized.

        Obviously, it is unlikely that the EU or a Member State would seek to take action against a small entity for a minor breach when that entity is outside the EU. Everyone has priorities, and this would not be high on the EU’s list.

        Rather than denying jurisdiction on grounds irrelevant to EU law, what I would suggest is that people in the US should do what they should do with regard to any business risk. Carry out a proper risk assessment and then decide what your best course of action should be.



      3. Coming back to this rather late but –

        As Tim has said, Privacy Shield is indeed opt-in. GDPR is not.

        If a business’s basis for noncompliance is based on an ideological objection to what they perceive as an overreach of sovereignty rather than the ethics of privacy and data protection, which I believe is called being “disruptive” these days, their option is simple: don’t do business with non-US customers.

        ICO is running a webinar on Thursday if you would like to get answers from the data protection regulator’s mouth. https://twitter.com/ICOnews/status/828574967927148544

        In a wider sense, on behalf of approximately ten million people, read up on FATCA and then tell me how you feel about non-US nations trying to “assert authority over others when they have no legal basis to assert that authority” when the US has made millions of non-US people’s lives hell, and forced non-US businesses to spend billions of pounds in compliance costs, doing exactly that. Pot kettle.



      4. @Heather

        I find it only slightly amusing that someone who has a lot of skin in the game is getting behind this crock. The EU has exactly ZERO jurisdiction or authority to enforce this. Period.

        I guess it may help to increase your speaking engagements and consulting work but it is 100% toothless.

        Jumping up and down and being condescending to others here does NOT make it so. Sounds more like someone trying to get their 15 minutes of fame in a very obscure area of the law.

        Lucky me, I don’t own any vacation properties in the EU.

        Perhaps the EU should be worried about more relevant things – like becoming irrelevant. Brexit FTW!



    2. LOL how did Israel get to be mentioned in this context?

      But cross-border internet law is a tricky thing, and I am not sure about European law but I do know how Israeli law is being applied. In essence if you are targeting an Israeli audience (let’s say by having a web site in Hebrew) you might get sued in Israel for whatever your site does which is against Israeli law, and it does not make much difference where the site is hosted or who the company behind it is. The cases I can immediately remember are about gambling sites, and slander.

      Obviously Israel is a small country and if you are not a citizen there is not much Israeli law can do to you, but the EU is somewhat bigger than the US and breaking its laws might have ramifications for US companies even if they just provide a German/French/Italian localization of their sites.

      As Microsoft, Google and Facebook have discovered by now, if you want to operate in the EU you need to pay attention to what the regulators tell you.

      So for sure if you have a site for some local shop in the middle of nowhere in Minnesota, you probably don’t need to care about the GDPR, but once you start selling stuff to Europeans, you are stepping into a gray area.

