Phoning Home To Plugin Authors

While reading through the WordPress Hackers mailing list, I came across a question from Nuno Morgadinho, who wanted to know how to track user engagement with a commercial plugin he is developing. The metrics he was most interested in were the following:

- how much time has the user spent playing with my plugin since plugin activation;
- what is the normal usage of the plugin (once a month? once a week? once a day?);
- while navigating through the plugin does the user go back and forth a lot or does he follow a certain pattern?;
- etc.
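
A minimal sketch of how those three metrics might be derived, assuming the plugin keeps its own local log of (timestamp, screen) events. The function name `summarize_usage`, the 30-minute session gap, and the screen names are all hypothetical, and Python is used purely for illustration. Nothing in this sketch leaves the site.

```python
from collections import Counter
from datetime import datetime, timedelta


def summarize_usage(events, session_gap=timedelta(minutes=30)):
    """Summarize a local event log of (timestamp, screen) tuples.

    Gaps longer than `session_gap` (an assumed cutoff) are treated as
    time away from the plugin and are not counted as active time.
    """
    events = sorted(events)
    active_time = timedelta()
    transitions = Counter()
    for (t_prev, s_prev), (t_next, s_next) in zip(events, events[1:]):
        gap = t_next - t_prev
        if gap <= session_gap:
            active_time += gap                     # time spent in the plugin
            transitions[(s_prev, s_next)] += 1     # navigation pattern
    active_days = len({t.date() for t, _ in events})  # usage frequency
    return {
        "active_time": active_time,
        "active_days": active_days,
        "transitions": transitions,
    }


# Hypothetical log: three page views in one sitting, one more two days later.
start = datetime(2012, 3, 1, 10, 0)
log = [
    (start, "settings"),
    (start + timedelta(minutes=5), "reports"),
    (start + timedelta(minutes=7), "settings"),
    (start + timedelta(days=2), "settings"),
]
summary = summarize_usage(log)
```

Whether any of this summary should ever be sent to the plugin author is a separate question, and it is the one the rest of this post is about.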

While the developer would like to use this information to improve the experience of using the plugin, I can already see the people with pitchforks lining up to take this developer out if it is not done correctly. Thankfully, Eric Mann has already chimed in with words of warning about how users do not like to find out about third-party tracking, especially after it has already occurred without their knowing about it up front. Personally, I have no problem with what the plugin author is trying to achieve as long as I have the option to say no, whether through opt-out or, preferably, opt-in. I’m willing to bet that most WordPress website owners feel the same way. If not, feel free to tell me in the comments of this post.

However, I have to point out that according to the WordPress Plugin Repository Guidelines, plugins are not allowed to “phone home” without the user’s informed consent.

No “phoning home” without user’s informed consent. This seemingly simple rule actually covers several different aspects:

No unauthorized collection of user data. For example, sending the admin’s email address back to your own servers without permission of the user is not allowed; but asking the user for an email address and collecting if they choose to submit it is fine. All actions taken in this respect MUST be of the user’s doing, not automatically done by the plugin.

All images and scripts shown should be part of the plugin. These should be loaded locally. If the plugin does require that data is loaded from an external site (such as blocklists) this should be made clear in the plugin’s admin screens or description. The point is that the user must be informed of what information is being sent where.

In general, things like banner or text link advertising should not be anywhere in a plugin, including on its settings screen. Advertising on settings screens is generally ineffective anyway, as ideally users rarely visit these screens, and the advertising is low quality because the advertising systems cannot see the page content to determine good ads. So they’re best just left off entirely. Putting links back to your own site or to your social-network of choice is fine. If the plugin does include advertising from a third party service, then it must default to completely disabled, in order to prevent tracking information from being collected from the user without their consent. This is the method commonly known as “opt-in”.

Note that if you do include what we consider to be “advertising spam”, or attempt to game somebody else’s advertising system, then we will not only remove your plugin, but also report your code to the advertising system’s abuse mechanism as well. We do not react kindly to spam. Don’t try it.
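
To make the “default to completely disabled” idea concrete, here is a minimal sketch of an opt-in gate. It is written in Python for illustration only; `TrackingSettings`, `UsageReporter`, and the `send` callable are hypothetical names, not part of WordPress or of any plugin mentioned here.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class TrackingSettings:
    # Opt-in: reporting stays off until the user explicitly enables it.
    enabled: bool = False


@dataclass
class UsageReporter:
    settings: TrackingSettings
    send: Callable[[str], None]  # hypothetical transport, e.g. an HTTP POST

    def describe(self, payload: dict) -> str:
        """Return exactly what would be sent, so it can be shown to the user."""
        return json.dumps(payload, indent=2, sort_keys=True)

    def report(self, payload: dict) -> bool:
        """Send the payload only if the user has opted in."""
        if not self.settings.enabled:
            return False
        self.send(self.describe(payload))
        return True


# Usage: nothing is sent until the setting is switched on.
outbox = []
reporter = UsageReporter(TrackingSettings(), outbox.append)
first = reporter.report({"plugin_version": "1.0"})   # dropped
reporter.settings.enabled = True                     # user opts in
second = reporter.report({"plugin_version": "1.0"})  # sent
```

The `describe` method returns the same string that `report` sends, so a settings screen could show the user precisely what they are agreeing to.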

After reading those guidelines concerning phoning home, consider that WordPress itself phones home data without the user ever having a chance to make an informed decision on whether to allow it or not. If you have time and want to read a passionate and heated discussion centered around this very topic, I encourage you to read the following forum thread: WordPress And Phone Home, started in 2009 by Elpie. Within the thread are arguments on what should and shouldn’t be collected, how disclosure should be handled, what is and is not publicly available information, and, last but not least, reasons why what WordPress does and how it does it is OK. While I’m a big fan of the repository guidelines, I don’t understand why plugin authors need informed user consent to phone home while WordPress can phone home without it. What’s the difference between the two?

If you’re interested in knowing what data is sent from a WordPress installation back to the mothership, Elpie has laid out a detailed post showing exactly what is sent.

*UPDATE* According to Otto, Core, Theme, and Plugin update checks do not phone home to WordPress.org.

12 Comments


  1. Note: *In my opinion*, they’re not “phoning home”. Information like PHP and MySQL version is indeed being sent back as part of the core update call, but the information is necessary information to get the proper core update response. Similarly, plugin and theme api checks send information like the plugin and theme headers, in order for the api code to be able to find the right plugins and themes to check updates for.

    It is perfectly reasonable for other people to see these API calls differently and say that this is “phoning home”, and I can understand their point of view. For this reason, plugins exist to disable those update checks if somebody wants to do so.



  2. Once upon a time, all of my plugins called home. I added an XML-RPC hook to my server to capture the calls that told me which plugin was installed, which version, what versions of PHP/WP/MySQL were installed, and the email address of the site administrator. They would also immediately send me any reports of crashes or errors so I could debug things proactively and release updates that targeted specific, observed user issues.

    The information was invaluable.

    But I gradually realized the problem with my system. I was, essentially, farming a huge email list on my site. I had contact information for thousands of users, the names of sites they administered, and detailed diagnostic information about their servers.

    And they had no idea.

    Then someone using a plugin I wrote in a tutorial found the code, loved it, forked it, and added it to a few of their own systems. Suddenly I had a flood of data from other plugins (they forgot to change the XML-RPC server endpoint) … again, without user input or permission.

    The thought that someone other than me could be collecting this data scared me to death. I trust me. But I also know some of the people who use my code, and I DON’T trust them.

    So, until I rewrite my API to allow anonymous, opt-in communication, I’ve pulled it from every system I write. I’ve also deleted any records I have on my end because, really, I don’t want to be responsible for holding on to an extensive list of other people’s email addresses.



  3. I’ve often thought that the best approach to handling this sort of thing, is to simply replicate what WordPress core does. If you set your plugins up to auto-update from YOUR site instead of WordPress.org, then you would have access to that data too. That would not require an opt-in, but would allow you to analyse data that you happened to access via the update API.

    Of course, this then means you lose out on “downloads” on your plugin or theme page on WordPress.org.

    FYI, I have not done this to any of my own plugins hosted at WordPress.org. It’s just an idea I had a while back.



  4. Ubuntu’s system is the best around.

    Mozilla is walking itself off a cliff.

    WordPress is well-loved, and has an inner circle that used to be so small as to be a singularity. So like your mother, it could get away with crap.

    Just modify the current Update menu item in Admin so that it is the user’s de facto “Opt in” agreement. It works mostly this way now … except for that nagging issue of it taking it upon itself to check certain ‘vital’ WP info, just because you happened to log in to Admin.

    WP gets away with it because they’re everybody’s trusted sweetheart.

    As WordPress grows, as inner responsibilities & authority are distributed and delegated, practices like these become greater liabilities.

    Facebook is sleazy. They’re successful, and powerful, and impressive. But they suck. Nobody feels sweet for Facebook; nobody trusts them to take out the garbage.

    Downside, yes, the installed WordPress base ages & decays. There will be exploits, using stuff that has been fixed in recent updates … which some people can’t be bothered to check for on their own. That’s why WP does it for them … to protect WP’s good name, to minimize those painful security traumas, of which there have been a few.

    Goose-stepping release cycles are a big part of the problem. Recently, we have watched it morph into an outright frenzy. Google goaded Mozilla into clinical hysteria, and they’re defending it to the hilt. Release-madness pushes coding, and deprecates testing. “Upgrades” replace validation.



  5. @Otto: I agree with Otto and would take it a step further. While, as a website owner, I may feel a little concerned about a plugin “phoning home” reporting how I use it, frankly if that’s the cost I have to pay for a free plugin that allows me or even my clients to extend the core’s functionality, then it’s a small price to pay and I’m willing to ante up. Frankly, I assumed that in this day and age all free plugins or themes reported back to some extent, and I am (pleasantly) surprised to learn that WP holds plugin developers to such a high standard.

    Ted’s reply is typical whining from people who gladly take free software, complain when it’s not perfect, and expect not to have to contribute.

    I’m not a developer, I can’t contribute to the WP core or to a plugin project in that way, so if they need to track my usage, I’m happy to oblige.



  6. @Daniel

    “Ted’s reply is typical whining from people who gladly take free software, complain when it’s not perfect, and expect not to have to contribute.”

    Free Open Source Software is not the wilted lettuce and day-old bread that they sit out on the loading dock in the alley, for the unfortunate to pick through.

    There is still a lively debate, whether closed commercial paradigms are inherently handicapped, in comparison to open & free … but GNU/GPL projects have stayed right at the front of the pack for long enough now that nobody mistakes it for second-rate.

    No, the implication that of course one will pare off the moldy spots and pretend to prefer soft, rubbery greens, when he accepts free programs, misses the main facts about FOSS.

    Matt Mullenweg gives the product away, not just to crummy people and ingrates, but to everyone. And inspecting it for bugs, rot and suitability to purpose certainly is an important role of those who put it to use.



  7. @Daniel – Degrading privacy concerns to whining is just lame and misses the point completely. I feel sorry for you.

    Phoning home private things like my URLs and e-mail addresses is not okay in my books.



  8. We did some analysis on this a while back: http://interconnectit.com/1722/who-is-wordpress-talking-to/ – some seems not too sinister, but you do have to remember that there’s little in the way of protection of the data sent, and that somewhere there is possibly a big goldmine of install data that isn’t being shared with anybody.

    Due to the relative opacity of the organisations behind WP there’s no real easy answer – we can’t see in very easily to anything other than the code.

    Somebody will fork WP one day or another, if only to address the concerns that are required for enterprise use. That sensitive content posted to an Intranet site could find itself elsewhere (for example when using the Google API for spell checking) is a big concern.

    It’s perfectly valid to question these things and to look into the potential impacts for different groups of users. It may not matter if you’re a blogger in NYC but it may matter a lot to an activist in Cuba or an internal communications team at a government agency.



  9. @David Coveney

    The article on interconnect/it that you link to, “What Exactly Does WordPress Tell The World?”, describes the concerns/motivations, and the particulars of the technical tests selected to investigate those questions, very well.

    The website itself, its homepage, has more in this general vein.

    Those who provide services to ‘non-IT’ businesses, who want to use IT but are not especially ‘into’ IT themselves, are both somewhat obliged and so-to-speak ‘given permission’ to leave behind a lot of the usual ‘postures & dance’ of Computerdom, which themselves can be persistent issues.

    I am not a business-person, and am not into IT business-services, but I am an experienced technical/systems person, and the way these tests were set up & described in the article is very good.

    Thanks!



  10. What has always bugged me is the installed plugins/theme info sent to wp.org. WordPress.org can uniquely identify each site by URL, along with the plugins/themes installed and/or activated on it and their full plugin headers (author, name, author site, etc). And most importantly they do that for ALL plugins/themes, whether or not they came from the repo and have a valid reason for being sent (checking for updates).

    Most worrying is the conflict of interest when it comes to WordPress.org vs Automattic. What kind of privacy policy covers that data? Does Automattic the for-profit corporation have access to this valuable information on their potential plugin, theme, and service competitors? Even if there are strict protocols in place there, the same guy runs both so there is information overlap. If I were building a competing plugin or service to one of Automattic’s (Akismet, VaultPress, etc), they would have valuable data I don’t for things like gauging potential demand or keeping an eye on competitors.

    So why is WordPress sending info on things it doesn’t need to? It should only send:
    – Installed themes/plugins FROM THE REPO
    – Their active status (useful for repo stats)
    – Only their slug + version number (all header data is extraneous)

    When I designed my own update notifications plugin (http://premium.wpmudev.org/project/update-notifications), I made sure it only sent the list of installed plugins of ours, as that’s all that’s needed and all that we have a right to.
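
The minimal payload proposed in this comment can be sketched roughly as follows. This is an illustration in Python, not how WordPress or any update plugin actually builds its request; `build_update_payload`, the dictionary keys, and the idea that the site already knows which slugs come from the repo are all assumptions.

```python
def build_update_payload(installed, repo_slugs):
    """Keep only repo-hosted plugins, and only slug, version, and active status.

    `installed` is a list of dicts holding full header data (hypothetical shape);
    `repo_slugs` is the set of slugs assumed to be hosted in the repo.
    """
    return [
        {"slug": p["slug"], "version": p["version"], "active": p["active"]}
        for p in installed
        if p["slug"] in repo_slugs
    ]


# Hypothetical installation: one repo plugin, one private plugin.
installed = [
    {"slug": "example-gallery", "version": "2.1", "active": True,
     "author": "Jane Doe", "author_uri": "https://example.com"},
    {"slug": "private-intranet-tool", "version": "0.3", "active": True,
     "author": "Internal IT", "author_uri": "https://intranet.example"},
]
payload = build_update_payload(installed, repo_slugs={"example-gallery"})
```

The private plugin and all author details are dropped before anything is sent.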



  11. We just went through quite a bit of discussion about this issue internally. The end result was not to do anything for the moment, except we included a “submit a support ticket” system in our plugin.

    Initially, I had gathered some site meta data to include with the ticket; but I realized that I was breaking a cardinal rule with interacting with customers, so now we include a “Here is what will be included in the support ticket” box with *exactly* what we’re going to poll. On support tickets especially, we’re virtually always going to ask the same handful of things: What version of WP are you running, what’s your current theme, and what plugins are active? Now we can just gather those facts with the request.

    Of course, we’ve tried to further make it clear that if the customer doesn’t want to include that information, they can email us directly for support. Unfortunately, those are the exact things that we need for virtually any support request!

    Good post though; I think as plugins & services for WP get more sophisticated it’s important that the plugin & theme development communities stay in front of this. Watching the current crop of stuff coming up around this whole iPhone “address-book-gate” I think respecting our customer / user privacy & data is more important than ever. Even things that we might interpret as trivial or aggregate and unthreatening.



  12. Dave’s Dashter certainly does it right, prefilling a form that clearly indicates what will be sent and sending it. Kudos for that (and the very awesome Dashter too).

    I do wish someone would come up with a chunk of code that ALL plugin authors not using the wp.org repo could use to check for updates and auto-install. In fact I really think this should be in core, although I suspect the powers that be would strongly object to that. Having a system built into core would at least standardize the data policy surrounding updates and potentially around bug reports too.

    Aaron does raise a good point about data sharing between wp.org and Automattic… would be interested in learning the answer to that.

