User Tracking to be Removed from Gutenberg in Upcoming 0.8.0 Release

photo credit: Saving Chicago CPR

The opt-in user tracking that was added to Gutenberg 0.7.0 will be pulled from the plugin in the upcoming 0.8.0 release. The data collection included in last week’s release reignited the discussion regarding adding telemetry to WordPress.

James Nylen and the Automattic engineers involved in Gutenberg added the feature with the goal of improving the editor based on usage patterns. Nylen said the approach they used was very similar to Calypso’s event tracking code and that it would provide “a very useful technique to collect user experience data.” They had planned to use the data to inform various decisions, such as the default order for blocks and whether some blocks are less suitable for core. Gutenberg contributors were looking into making the tracking its own module so it could be useful for other WP feature plugins and core.

Shortly after the feature was added to Gutenberg, contributors began to revisit the Telemetry discussion on WordPress Trac. The topic of telemetry for core had been tabled earlier this year, as it did not fall within the three core focus areas for WordPress development in 2017. Participants requested the ticket be reopened for discussion looking toward 2018 in light of Gutenberg adding opt-in tracking.

“I think it’s a terrible idea for Gutenberg, too,” Matt Mullenweg commented on the ticket. “I doubt that anything actionable or useful will come of it that couldn’t be obtained by non-data-collecting means.”

Twelve hours later, James Nylen commented on his original announcement to notify the community that tracking will be removed from Gutenberg in the 0.8.0 release:

There’s been quite a lot of discussion on this topic across the community, much of which stems from earlier discussions like #38418, which I wasn’t aware of.

Usage tracking in Core and feature projects is a much bigger topic than fits into the scope of Gutenberg right now, so I’ve removed it from the GitHub repo, and it will be removed in the 0.8 Gutenberg release.

The data that it was tracking, while interesting, probably wouldn’t have been a significant factor in the long-term growth and development of Gutenberg. The discussion surrounding the data collection, however, would take up a disproportionate amount of the team’s time.

Nylen said the data collected by the plugin thus far will be deleted after 0.8 rolls out and that since it’s so early in Gutenberg’s development there was “not enough data collected to provide any sort of picture of usage.”

WordPress Telemetry Advocates Continue Lobbying for Opt-In Data Collection

The discussion about whether or not WordPress needs telemetry has continued in the form of tweetstorms, as data collection advocates make the case for data-driven decision making.

“The decision not to capture metrics (telemetry) from WordPress is one that continues to have a large impact on what we (don’t) know,” Liquid Web VP of Product Chris Lema said. “As we’re trying to make decisions about Gutenberg and metaboxes, we might ask, how big a problem is this, by number of plugins or sites. But we don’t know because we decided that we can always iterate WordPress, like we’ve always done. It’s true that we’ve done that before, but that doesn’t mean it’s either the wisest approach, nor the least risky. With so many options today, will people necessarily return? The more logical approach, in my mind, is to capture as much data as possible and to make it as public as possible, so we can all review.”

WordPress Telemetry proposal author Morten Rand-Hendriksen joined in the discussion with another tweetstorm:

WordPress needs a core method for collecting quantitative user data through telemetry (metrics). One of the biggest challenges WordPress faces is the lack of reliable data about global day-to-day use. Like most Open Source projects, WordPress has relied on community feedback as its primary data source, which is fine for a small project. The problem is WordPress is a Very Big Project with global reach and the majority of its users never interface with the community.

I like to say we, the people who talk about, provide feedback for, and design/develop WordPress are the 1%. It might be more like 0.1%. Making decisions based on the traditional community feedback model is making decisions without knowing anything about the majority of users. Some will argue this is fine, that WordPress is developed by those who show up. That’s not a workable or responsible model for a project. We, the people who build WordPress, have a duty of care to the people we build it for. And those people are not us. ‘We can just do user testing,’ you say? Sure. Let’s do proper qualitative user testing. That requires staffing, funding, and infrastructure. User testing for a project like WordPress is non-trivial. It requires professional analysis.

Rand-Hendriksen’s tweetstorm continued with a summary of his telemetry proposal, which would be opt-in and based on a plugin prompted from core. The plugin would anonymize all collected data and allow for targeted data collection based on research needs. He proposes that the data be stored on servers owned by the community, separate from corporate interests, so the data can be shared openly to ensure transparency. The ticket for this feature request is currently closed.

“There’s a ton going on, and it’s far more important than built-in big brother centralized tracking,” Mullenweg said in response to Rand-Hendriksen’s tweetstorm. “Do it as a plugin or with a host and show it informs a decision that we wouldn’t have taken otherwise. And remember that past usage is not a good predictor of future success, or what the world needs. We need to build iPhones not Blackberries.”

During the 2016 State of the Word address, Mullenweg proposed a new structure for core releases in 2017 where he would be putting on the ‘product lead’ hat and have design and user testing lead the way. As feature requests have popped up outside of the three core focus areas, Mullenweg has had to systematically shut them down or put them on hold for later in order to keep Gutenberg on track.

However, it’s not surprising that the engineers leading the Gutenberg project, most of whom are employed by Automattic, wouldn’t think twice about adding user tracking. The company has a blog entirely devoted to data where its data scientists write about the data pipelines they have built to help the company create a sustainable business. Historically, Automattic has strongly embraced using data in making decisions, which is why Calypso has event tracking built into it. Mullenweg is taking a different product leadership approach with the open source WordPress project.

“For people unhappy with our direction, no amount of data will change their minds,” Mullenweg said in response to critics on Twitter. “The results will tell. I’m happy to stand by them the past 14 years, and believe the next 14 will validate our approach.”

21 Comments


  1. The results will tell. I’m happy to stand by them the past 14 years, and believe the next 14 will validate our approach.

    It’s possible that a project succeeds in spite of a decision rather than because of it. WordPress has succeeded in its first 14 years without telemetry and will quite possibly continue to succeed in the next 14 years without it.

    But success is a spectrum, and I worry that treating it as a binary state prevents us from being even more successful through ideas like telemetry. If we can build a successful product without data, imagine what we could do with it.



  2. I respect Matt a lot, but why does he seem so anti-data?

    I doubt that anything actionable or useful will come of it that couldn’t be obtained by non-data-collecting means.

    “Doubt?” Why not just collect it and see if it’s helpful? It sounds kinda like a scared/scarcity mentality. Like if we collect it, it’s going to be this huge waste. At the very least, it would tell us if it’s NOT useful. Or it could be interpreted that he’s concerned it’ll overrule him, or at least add a lot of fuel for those arguing differently than his vision for Gutenberg (and maybe WP in general).

    I can’t speak for everyone, but I think most smart people get the idea that (for example) Steve Jobs and his team created the iPhone not based on data or past experience, but on what they themselves wanted and what their gut instinct told them. Just because data might support one thing or another, doesn’t mean there’s not room to overrule it, or give people what “they don’t know they want, yet” as the saying goes. It just means there’s more info to make that decision from. And for smaller decisions, it takes away the guesswork and allows the amount of decisions that are made to be cut way down. They can just “follow the data” on that stuff. It makes the decision-makers much more effective at making those fewer, much more important decisions, that they’re left with.

    I personally would like to see a lot more data across all aspects of WordPress.org. As the previous commenter said, just because WP is succeeding without it, doesn’t mean it couldn’t be a lot better with it.



    1. I love data! We have a ton of it. Some advocates want us to add more extensive click/feature tracking to core which will significantly shift the privacy profile of core WordPress in a way I’m profoundly uncomfortable with, and that will require significant development (I would estimate more than 6-9 months) to instrument, store, process, and display in any useful way. That estimate is based on having developed and iterated systems to do exactly that for Automattic products several times in the past. Also without knowing a fair amount about the sites and people submitting the data, it will be challenging to profile or understand what types of users are submitting which data — do they manage one site or many? What’s their primary language? What’s their technical level, and what’s their goal with WP? Is there something they want to accomplish that we can’t track with clicks? Are they multi-device? Are there barriers to how they interact with their device?

      We get a ton of data from user tests of our own and competitive products (something data collection in WP would never provide), working with people of various experience levels with WP, helping developers use the new APIs and getting their feedback, the dozens of meetup and WordCamp presentations thus far, and of course if we’re blocked and something more widespread and anonymous is needed there are web hosts that can share data without us needing to add more call-home, privacy-damaging features in core.

      That’s helpful today, vs anonymous aggregated data collection in core WP we couldn’t get any useful info from before next year. User-based methods of data collection and testing show there is a ton of work that needs to be done, and reveal issues that would never be shown by knowing how many times a block in Gutenberg is used. We have many months of work ahead based on what we know today. We’ll also keep an eye on usage data that hosts collect to see if there’s anything we’re missing. People seem to be arguing with the straw man that we’re developing core blind, or that I don’t know how to use data, or that I don’t care about data, none of which is remotely true. What I see a lot more is that they disagree with a decision and want a number they can point at and argue about.

      There are other types of usage information, particularly for the plugin and theme directory, that take much less time to develop and provide actionable, useful information to developers they couldn’t get otherwise. Those features we’ll invest time and server resources in.



      1. People seem to be arguing with the straw man that we’re developing core blind, or that I don’t know how to use data, or that I don’t care about data, none of which is remotely true. What I see a lot more is that they disagree with a decision and want a number they can point at and argue about.

        First of all, thank you Matt for providing a detailed response to the discussion. Secondly, this assertion you make is not an accurate representation of the conversation. If this is truly how you’ve interpreted the discussion thus far I wish you would have reached out for clarification. The argument was never “we’re developing core blind” or that anyone in the project “doesn’t know how to use data” or “doesn’t care about data”. My assertion, and that of many others was always that lack of data in a project this big gives us a significant blind spot, and the efforts that are currently in place to remedy this problem are too focussed on the active contributors to the project, not the people out there in the world using the application. Not “we’re developing core blind”, rather “we may be inadvertently working in an echo chamber, and we have no way of knowing if that’s the case so we should make every effort to find out.” This is an issue that comes up repeatedly, especially around the 80/20 conversation which you yourself have argued is becoming problematic due to a lack of data.

        You have legitimate arguments against telemetry, and those arguments belong in an open discussion about the feature. Others have legitimate arguments for telemetry that even if you disagree with them deserve that discussion.

        This is a debate we need to have, together, on even ground. Otherwise it gets stuck in endless circular arguments and assumptions about the intentions of the other side.

        We all have the same goal: To make WordPress better for the people who use WordPress. We need to come together and discuss how we make decisions that meet those goals in the best possible way. That starts with open conversation.



  3. “Do it as a plugin or with a host and show it informs a decision that we wouldn’t have taken otherwise.”

    Has the data that Calypso collects helped Automattic make informed decisions that they wouldn’t have taken otherwise?

    If it has, why wouldn’t the same apply to WordPress and the wider WP community?

    If it hasn’t, then why does Calypso collect it?



    1. Also if telemetry was attempted with a plugin then only those aware of the plugin would install it and opt in. This would open the door to the same criticism that has plagued surveys and other past attempts to gather data – which is that the responses are from the 1% and written off as developer feedback. The entire point of placing it in core is to gather data from everyday users and prevent developer-centric decision-making.



  4. I welcome that telemetry will not be in WP core. On the other hand, how all these decisions are made is definitely not the “open source way”. One day somebody decides it will be in Gutenberg, the next day somebody else decides it will not be there.



  5. Any telemetry should be opt-in only. The open source community is passionately anti-spyware. I’d have to reconsider recommending WordPress to our clients and start looking at alternative platforms if WordPress core starts in with telemetry.

    Without any telemetry, I can predict about half the talented developers who work on WordPress in their free time (i.e. those who are not on Automattic’s payroll) would find something better to do with their time than help build spyware.

    I disagree with Matt Mullenweg about a lot but in this case (as with his support of freedom of speech), he’s right on the money.



    1. The proposal in the Trac ticket clearly states that telemetry would be opt-in only. Data must be 100% anonymous and all data collected would be available to the public. Every one of your concerns is accounted for. Please read the ticket.



  6. Putting telemetry in core to the side for a minute: the removal of telemetry from the Gutenberg beta plugin is odd. The purpose of that tracking as I understand it was to look at what blocks people use and what blocks are ignored. This is information of primary importance as Gutenberg is refined, and removing the ability to track that data means the team has to guess at how to prioritize and select blocks.

    Most arguments against telemetry in core do not apply to the Gutenberg plugin: the plugin is beta, so any user would deliberately install it specifically to test it (and hopefully provide feedback), and telemetry in the plugin was opt-in.

    I have not been able to find any explanation as to why the feature was removed from Gutenberg other than that it “probably” wouldn’t have produced usable data – an assumption that begs numerous questions.

    Considering one of the major critiques of Gutenberg is the overabundance of blocks and how hard it is to find the most useful blocks, telemetry seems not just appropriate but essential. Its removal is unfortunate.



  7. With all the spying going on, by all sorts of institutions and companies and in every way, right into being built into the OS itself with that piece of … that’s W10, I really think that putting “telemetry” into anything now just screams that you’re making a specific point of alienating privacy-aware users or in fact any that could be described as at least somewhat advanced.
    Also, if you add it as opt-in, which is definitely the only acceptable way if you add it at all, then you’ll be missing just the information about the average user, who just leaves the defaults and expects everything to just work. While if you add it as opt-out, you’ll get that baseline info, but you’ll be missing anything from most of the more advanced crowd, who’ll quickly turn it off, plus that it’d be a nasty move in itself. Either way, it won’t provide a valid picture to base decisions on.



    1. The ticket already states that it would be 100% opt-in. The telemetry that *was* in the Gutenberg plugin at 0.7 was double opt-in because you not only had to enable it explicitly to collect data, but also had to actively seek out and install the plugin to begin with.



    2. How do you propose, without telemetry, to see a valid picture of how people are using WordPress? Telemetry wouldn’t tell you everything, but I imagine it might tell you some things you otherwise might not have known.

      Why should WordPress move in one direction versus another? Because of gut instinct, simply because, competitors, it’s trendy?

      I’d like to see some data and research that explains why WordPress needs to move in a certain direction with certain features.



      1. User surveys, built into the admin itself, accessible at all times as feedback and (with an opt-out for those who don’t want to be bothered) popping up at certain times (when community feedback is desired and maybe also every X months in general), and possibly also when certain features are accessed (for the first time, newly implemented, considered for change or deprecation or removal etc.). Yes, it will tell you what people think they do, which may be different from what they actually do, but it will also tell you what they’d want in terms of future development, which telemetry won’t. And sure, it’ll just be the opinion of those who care to give it, but if they wouldn’t even care to check some boxes and maybe fill some suggestions now and then, I’d say they don’t have an opinion to give anyway.



      2. Based on what I’ve learned from plugin authors who are using tools like Freemius to generate a feedback mechanism to learn why users are disabling their plugins, this idea works. Many plugin authors have improved their plugins thanks to this kind of data, data that is not available via WordPress.org.



      3. WP is not a plugin. It should not react to how and where users click. WordPress is not a business. Also, looking back, I don’t see WP as something that sets new trends or brings new solutions. WP just follows the state of the web. Beginners don’t know how to create a website, what the best way is, or how to achieve their goals. They have to be guided. I’m not sure advanced users would like telemetry inside their business websites. If telemetry is opt-in (which is relatively acceptable in WP), it can be a plugin; if they want it for whatever reason, they can install it. It’s just 2 or 3 more clicks, with no need for it to be included in core. But this data will be almost useless.

        I know that somehow it’s a trend to measure everything – even different color of buttons. Majority of users don’t use still these keys are not removed from keyboards ;)



      4. @Cavalary – Those surveys will only reflect those who are highly motivated to fill them out. It won’t be an accurate picture.

        Not to mention it won’t cover every aspect of WordPress that could be tracked. Otherwise it would end up being a huge list of “Do you use X?” questions and such. As motivated as I am to help WP improve, a survey like that would tax my patience.

        J A Konrath touched on a similar idea in publishing if Amazon would give authors access to reading data for Kindle Unlimited. Since Amazon tracks readers to pay authors on a per-page basis, imagine if an author was able to see that most everyone stopped reading around page 105. That would tell them “Hey! Something’s wrong at about this point!” and they can make edits.

        Same principle applies here. Telemetry allows them to see which features are being used and which ones are not, which allows them to realign priorities and even move things out of core.



      5. “…it will also tell you what they’d want in terms of future development…”

        Users are notoriously bad at explaining this. As a long time product manager I’ve seen this over and over. I’ve had people ask for features that already existed. I’ve had people ask for features but, when we dove into what they were trying to do, we found that they could accomplish their task just fine with existing features.

        User surveys also require product staff to collect, analyze and summarize the results.



    3. Opt-in does not mean hidden. Apple, for example, asks you to allow it to gather data when you set up a new iOS device or Mac. They make the choice obvious, they explain what it will do in brief and they don’t bug you about it. Some people allow it, some disable it. Enough people allow data collection to happen that they get a representative sample.

      WordPress would too – you don’t need a majority of users to allow data collection in order to see valid results, you just need a sample that represents the various uses and for the sample to be large enough so that it’s statistically valid. Given the size of the WP community none of this will be a problem if it’s done right.



  8. As soon as I saw this post title I was all WTF (index finger left to right in slow motion), but there is a difference between User Tracking vs Anonymous Usage Tracking.



  9. ah matt…

    We need to build iPhones not Blackberries.

    Apple has opt-in data collection PRECISELY so that they can understand how people use their iPhones. Sometimes I wonder about Matt.

