4 Comments

  1. thejeffmatson

    My biggest concern is how many of those subdomains exist with fresh, ready-to-install copies of WordPress. I could potentially scrape the subdomains that are indexed, find what I’m guessing would be hundreds to thousands of installs in that same state, and go ballistic.

    Something certainly needs to be changed here. A staging area is good, but exposing all of those is a serious potential security risk.

    1. New installs on WP Engine ship fully configured. The only time there’s a fresh install of WordPress is if the customer decides to clear out the database themselves and reinstall their environment.

      So don’t worry, there’s not a fleet of WP Engine installs waiting to be taken over by the first person who comes along and completes the install process. :)

  2. Thanks for the follow-up Sarah. I had some reservations about rolling with Jacob’s post because I knew there would be an inevitable fallout of some sort. What really pushed us to move forward with it was the fact that pushing it out would mean helping a ton of people. To me, that meant the benefits outweighed any potential backlash.

    No matter what Jason, or Joost, or anyone says about “fallacies”, the truth is that the issues Jacob pointed out are real. Just like anything in SEO, the severity can be debated but in my experience duplicate content can really screw you, especially if you’re a relatively new site. A huge site might be able to afford leaking out a bit of their authority here and there. On the other hand, someone just getting off the ground needs every advantage they can get. Hanging on to every ounce of domain equity can really help in that department and sometimes can be a make-or-break factor.

    I’m really happy to hear that WP Engine is going to be working on implementing some changes to correct this stuff. The primary goal of this post was to help WP Engine’s customers take care of this issue themselves in case WP Engine decided it wasn’t enough of a problem to change their system. The secondary reason was to push WP Engine to make some real core-level changes to their server setup and it appears that we’ve managed to do that. Seems like a win-win situation to me. :)

  3. I think the “duplicate content” issue is overblown. There’s no such thing as a “duplicate content penalty”; otherwise, thousands of WordPress sites would get penalized right out of the box. By default, the same content is displayed on archive pages, index pages, and search results, in addition to the main article page.

    I generally see this manifest itself in Google results when things like /page/12/ are ranked ahead of the main article. It’s probably best to use an SEO plugin to clean that sort of stuff up. Canonical URLs (WordPress default) help too, but they’re not a guarantee.

    While this is a separate issue since we’re talking about content on separate domains, the alleged Matt Cutts conversation about a list of “root domains” for sites like WP Engine is interesting, as I’ve never heard of anything like that before, but it makes sense.

    Just because it’s indexed doesn’t mean it ranks for any meaningful keywords. I see a lot of WP Engine-hosted sites in search results, and I don’t think I’ve ever seen the .wpengine.com version rank at all. So I tend to believe something is going on to prevent this, whether it be the “root domain” thing or some other algorithmic check to make sure main sites rank and staging sites don’t.

    On the privacy issues, yeah, it’s definitely a little too easy to grab all those *.wpengine.com staging URLs with a simple Google query. There’s no reason for those to be indexed at all, as support can always access the subdomain directly.

    But it’s also worth noting you can do reverse IP checks and find out the URLs hosted on ANY server pretty easily as well. Is it really WP Engine’s fault that some unlaunched Harvard website exposed itself to the entire internet? It’s the developer’s responsibility to keep things under wraps, not the host’s.

    I’m not even going to comment on how the whole #FailboatGate situation played out on Twitter and blog comments elsewhere, but both sides could’ve probably handled things a bit better.
