What’s Going On With The WordPress Importer?

A few days ago, I received an email informing me that the WordPress Importer plugin was having issues and was on the brink of having as many 1-star ratings as 5-star ratings. It’s never a good sign when you see support forum stats like this: 0 of 10 support threads in the last two months have been resolved. But considering that the topic creator has to mark the topic as resolved, this is not a foolproof way of determining whether a plugin is broken. From what I can tell, most of the issues surround partial imports and the importing of images.

I used the importer to move all the content from WPTavern.com to a local install of WordPress. I chose to do a full export/import. The WXR file was around 17 MB, but WordPress seemed to take forever to process it, even on my local machine. While all the images were imported, they were not attached to the proper posts. The images displayed on the local site were still linked to the copies in the media library on WPTavern.com. After dropping all the tables from the database and starting over, I tried to import my files again with the same results. Just to see what would happen, I exported only my posts and chose to download attachments, to see if both would be imported correctly. While the posts were imported correctly, none of the media attachments were brought over into the media library. As illustrated in the support forum post I linked above, the only way to move media attachments from one site to another is to export/import EVERYTHING. If I select the option to download attachments, then that’s what should happen no matter which way I export the content.

No Media Attachments

It’s been a long time since I’ve exported/imported content from one WordPress site to another, so I don’t remember how the process is supposed to play out. But if the behavior I described above is considered normal, the import process is definitely not as painless as it should be. After all, exporting/importing from WordPress to WordPress should be as easy as counting 1, 2, 3. Unlike other plugins that have a specific author, this one is developed by the WordPressdotorg team, so it’s not clear whom to contact.

If you have used the WordPress export/import plugin within the past few weeks and have also run into problems, I’d love for you to chime in on the comments. I’d especially like to hear from those that export/import sites on a routine basis.

40 responses to “What’s Going On With The WordPress Importer?”

  1. I’ve been seeing lots of issues with a category such as “cats” being duplicated over and over when importing, so there’s cat-1, cat-2, cat-3, etc. listed as categories and you have to clean them up …. just one issue I’ve seen ….

  2. I think this attachments problem is a lot older than the past few weeks. Even when importing from WordPress.com I’ve run into the issue. I almost always end up doing some sort of partial import, but I’ve consistently had this problem since at least six months ago. When I brought the issue up with the WordPress.com support they just threw up their hands and blamed the importing server but that never quite felt right.

    (Also, as a side note, the plugin author can also mark topics as resolved.)

  3. The WP importer definitely has problems, especially with larger import files. Although I haven’t tried to grok the issue fully, I think at least one fairly common scenario goes basically something like this:

    1. Exports that include attachments tend to put those first in the file iirc.
    2. The handling/backfilling of attachment/post relationships happens after everything’s imported.
    3. Different servers have different limits for things like memory use and timeouts.
    4. So you get a site with a bunch of attachments, or perhaps with attachments being retrieved from a slow server, and the attachments start processing, but the importer winds up dying basically silently. This can keep the backfilling from ever happening, since it happens near the end. Even if you restart the import, the relationships will never be created, because the relationships between media and posts were tracked only during the prior run (and not persisted for use in the current run).

    There are probably all sorts of other failure scenarios too. This is just one that I’ve seen. It’s definitely not a new problem. And it’s hard to fix on self-hosted environments where there’s not great control over things like timeouts and retries. I’d personally like to see a Jetpackesque service that will broker imports to and from various platforms and that’s more robust as a result, but of course such a thing has plenty of problems of its own.

    There are a couple of plugins that’ll do image attachment on demand for you should you get a dud import. It’s not optimal, I know, but then I’m not at all sure it’s possible to optimize the current importer for use with big imports on the varying hosting setups. And there are only so many hours in the day for rewriting the importer, etc. I’d be curious whether the import works better if you bump your local max_execution_time and memory_limit settings to something absurdly high. I predict that with the limits bumped enough, the import’ll work much better (again, I know, not a great solution; just, if I’m correct, validation that we’re dealing with a resource issue as the current importer stands).
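The failure scenario in points 1 to 4 above can be sketched roughly as follows. This is a minimal illustration in Python, not the importer's actual PHP code: the item layout, the `1000 + n` ID scheme, the JSON map file, and every function name are assumptions made up for the example. The idea is simply that if the old-ID to new-ID map is written to disk after every item, a restarted import can still backfill the attachment/post relationships.

```python
import json
import os
import tempfile


def load_map(path):
    """Return the saved old-ID -> new-ID map, or an empty map on a first run."""
    if os.path.exists(path):
        with open(path) as fh:
            # JSON object keys are strings, so convert them back to ints.
            return {int(k): v for k, v in json.load(fh).items()}
    return {}


def save_map(path, id_map):
    """Write the map to a temp file, then swap it in with a single rename."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump(id_map, fh)
    os.replace(tmp, path)


def import_items(items, id_map, map_path, fail_after=None):
    """Import items in file order, persisting the ID map after each one.

    fail_after simulates the host killing the request after N items.
    """
    imported = 0
    for item in items:
        if item["id"] in id_map:
            continue  # already imported by an earlier run
        if fail_after is not None and imported >= fail_after:
            raise TimeoutError("simulated host timeout")
        id_map[item["id"]] = 1000 + len(id_map)  # hypothetical new ID
        save_map(map_path, id_map)
        imported += 1


def backfill_parents(items, id_map):
    """Translate each item's old parent ID into the new site's post ID."""
    links = {}
    for item in items:
        parent = item.get("parent")
        if parent in id_map:
            links[id_map[item["id"]]] = id_map[parent]
    return links


if __name__ == "__main__":
    # Attachments come first in the file; their parent post comes last.
    items = [
        {"id": 1, "parent": 3},
        {"id": 2, "parent": 3},
        {"id": 3, "parent": None},
    ]
    with tempfile.TemporaryDirectory() as workdir:
        map_path = os.path.join(workdir, "id_map.json")
        try:
            import_items(items, load_map(map_path), map_path, fail_after=2)
        except TimeoutError:
            pass  # first run died before the post was imported
        id_map = load_map(map_path)  # second run starts from the saved map
        import_items(items, id_map, map_path)
        print(backfill_parents(items, id_map))
```

Without the saved map, the second run would have no record of which new attachments belonged to which old post, which matches the "relationships will never be created" outcome described above.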

  4. I notice that the importers on WordPress.com are a lot flashier than those made available on WordPress.org. I don’t know if this is just cosmetic or if the internals have been updated too.

    The importers are based on a shared PHP class, but that class does not do any of the hard stuff, such as moving images.

  5. I’ve had the same issue, this time with featured images. None of my imported posts had a featured image attached. And I can’t take each post one by one and set the featured image manually.

    It’s been a while since I saw an update for the WordPress Importer and I think, after the release of 3.6, they should focus a bit on the external plugins, such as this one.

  6. Importing large sites via the import/export tool has never worked very well. It would be great if this could be improved to make it a truly useful product. The ability to export complete sites including images etc. (in a zip file perhaps) would be nice too.

  7. I’ve experienced several issues with the importer across different installs and servers, which ultimately prompted me to become the “whistle blower”. A failing importer impedes users’ freedom by making it much less feasible for them to move away from services like WordPress.com to their own, self-hosted environments.

    What’s also troubling is how little attention the plugin seems to be getting from its developers. Developers can’t be expected to check in on every support ticket, especially not on the most popular plugins, doubly so where customization requests and such are concerned. But I consider it any good developer’s duty to maintain at the very least a high-level overview of their users’ experience. Watch the trends. In the case of the WP importer, the trend is obvious; it’s broken.

    That said, it could actually be made clearer. Though it should definitely be easier to use (e.g. directly from your dashboard), more people need to take advantage of the “broken/works” feature. It’s this lovely little idea that never properly came to fruition. Almost 5 million downloads, with a grand total of 11 reports, 4 of which say “broken”. I’ve already seen 4 more would-be “broken” reports in the comments of this post.

  8. Obviously it’s not a help if you are not a coder (with loads of spare time) but when I found a bunch of issues with the blogger importer, I set about fixing them.

    That was about a year and a half ago. I managed to consolidate a bunch of changes and fixes in 6 months of odd evenings and get the 0.5 version out, which was 1 year ago today. The 0.6 version that handles the images has taken a bit longer, but it has been in beta testing for quite a few months now. I’ve one more key issue to fix with regard to refreshing the screen, which I believe is a combination of how the importer stores data and how the Ajax is being handled (both inherited design decisions).

    You can read about some of the issues and see the latest code here, if that’s any help.

    http://core.trac.wordpress.org/ticket/4010

    p.s. Yes I appreciate that the WordPress data files are a lot different but you might be able to get something useful.

  9. I’ve recently changed hosts and thought the easy option was to just export and import my blog.
    But then I ran into the images problem… I’ve had to manually upload and reattach nearly all the media files to the corresponding posts and pages, which was not a process which I would like to repeat.

    I have successfully exported and imported my site in the past, so I didn’t expect this problem at all. I do hope the importer will be fixed, because it is really annoying not to be able to do this without problems.

  10. I routinely need to take a live site down to a local server, modify it, push it to a staging server, then to a live site. Often the live site is running along in the meantime, and new content is being posted while the new version is being developed.

    So before the new version of the site is finally moved to the live server, the latest content needs to be merged from the ‘old’ live site.

    This process is always a pain, because the WP Importer is not solid. Imported posts lose their featured images, and images in the content link to the old site. The media library doesn’t show the images that the importer actually does import into the uploads folder.

    We need a better export/import plugin, and I’d be happy to pay for it.

  11. @Frederick Ding – Hey there, nice to see there is someone slated to spend a bunch of time on the importer and the other portability issues of WordPress. The issues addressed in this post and on the support forum are not recent, but have existed for quite a while.

    A few suggestions I had:

    It doesn’t matter what information is exported, whether it’s everything or just posts or pages. If the person chooses to download attachments, those should be moved over with the content. Currently, the only way to get attachments from one place to another is to select the option to export EVERYTHING.

    Images should be properly attached to their corresponding posts.

    Image source URLs should be changed to reflect their new location.

    I’d like to see some sort of progress or information bar that shows what is happening during the import process. It sure is a drag to just sit on a page and watch a spinning circle without knowing if the import broke or not.

    If I think of any more, I’ll publish them in this comment thread.

    Good luck Frederick, you have your hands full.

  12. @speedyk – my initial thoughts on doing migrations are based on file & SQL (similar to, but not identical to, some of the existing plugin solutions out there).

    Attachments are a huge, huge part of what can be improved — but so too are settings, users, plugins & themes. It occurred to me when planning the project that these are areas where native file archives (.tar.gz, for example) and SQL dumps have served developers well for years. The challenge is figuring out how to avoid downloading and reuploading 1.5 GB media libraries, for one, and how to make it easier to clone an installation without having to set up a new WordPress installation first to be able to import.

    However, the exporter/importer remains an important part of WordPress, and I’m looking at what I can do to make it better. It’s a really tough problem on the development side!

  13. The underlying problem with the importer is that it doesn’t utilize the XML-RPC interface of WordPress. Importing using XML-RPC should solve all the problems related to the export file size and should give better incremental import functionality.

    AFAICT this is how the blogger importer works.

  14. The blogger importer calls the gdata API and gets posts in batches, then repeats that for the comments. In the beta version it then iterates through all the posts and downloads the images, and iterates again to relink internal links. Status is displayed in a wp_table.
    Progress is displayed via a jQuery progress bar.
    The XML is processed with SimplePie and regular expressions for the images.
    The posts are added via the WordPress functions.

    What would be good would be an importer framework that you could easily customise for different sources, and documentation too.

  15. I have used the importer three times in the last week to migrate content from one WP site to another. All content for pages, posts, custom content types, categories and comments came in well except for images. The images were in the posts but still pointing to the site I migrated from. Nothing was imported into the media library.

    It’s quite a pain to edit content to adjust for the images. It’s the most time-consuming part of the migration. Images in the content are one thing, but attached featured images should be imported without issue. This was not the case with the last series of imports.

    Am I not understanding how it works? Should it include all images into the media library or just featured images and image gallery attachments?

  16. @Mark k. – The blogger importer is a shambles too. It routinely brings in just titles with no post bodies, for example, and because you’re working with an external API, it can be pretty much impossible to debug issues like that, especially when it reproduces only intermittently with other users’ accounts.

    Another big hurdle for importers — and especially ones that poll APIs over and over to fetch data — is that they can be long-running processes, and the WP cron system isn’t ideal. If nobody’s visiting your site, your job won’t run and your import won’t finish reliably or quickly. Even if you use an ajax request to keep the import job alive from the import page, what if the browser crashes or the tab gets closed and you lose state?

    I don’t mean to throw up my hands helplessly or anything. I mean only to say that the problems around importing are complex; there’s a reason these things don’t work optimally, and there’s unfortunately not currently a real silver bullet. I’m definitely glad to see that there’s a GSOC project to try to make things better, though. :)

  17. Daryl, that’s the first I’ve heard of that particular issue with posts coming in with no body, and I’ve been watching the support forum for over 2 years now. Feel free to drop a message on there detailing the issue.

    http://wordpress.org/support/plugin/blogger-importer

    You are right, though, that the API is a double-edged sword: it means no exporting of large files is needed, but on the other hand debugging can be harder. Frederick, perhaps you can architect a solution that decouples the steps somehow so that the importers can be unit tested.

    The current solution for the long-running process with the blogger importer is to download in batches and store the state at the end of each batch. This means that if it stops in the middle, you can start off from where you left off.
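A minimal sketch of that batch-and-checkpoint idea, in Python rather than the blogger importer's PHP. The state file layout, the `fetch_page` callback, and the function names are all hypothetical; the real importer stores its state differently. Each call imports one batch, saves the offset, and reports progress, so whatever drives it (an Ajax request from the import page, for example) can fire the next batch when the previous one finishes.

```python
import json
import os
import tempfile


def load_state(path):
    """Return the saved import state, or a fresh state on a first run."""
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    return {"offset": 0, "imported": []}


def run_batch(state_path, fetch_page, batch_size=2):
    """Import one batch, save the state, and report where the import stands."""
    state = load_state(state_path)
    page = fetch_page(state["offset"], batch_size)
    state["imported"].extend(page)  # stand-in for actually creating posts
    state["offset"] += len(page)
    with open(state_path, "w") as fh:
        json.dump(state, fh)
    # A short page means the source has nothing more to give.
    return {"offset": state["offset"], "done": len(page) < batch_size}


if __name__ == "__main__":
    posts = ["post-a", "post-b", "post-c", "post-d", "post-e"]

    def fetch_page(offset, count):
        # Stand-in for a remote API call that returns one page of posts.
        return posts[offset:offset + count]

    with tempfile.TemporaryDirectory() as workdir:
        state_path = os.path.join(workdir, "import_state.json")
        progress = {"done": False}
        while not progress["done"]:
            progress = run_batch(state_path, fetch_page)
            print(progress)
```

Because nothing lives in memory between calls, closing the browser tab in the middle only pauses the import; the next call picks up from the saved offset.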

    One thing I have seen is that there are two types of users for these importers. The first are migrating a site and plan to decommission the old one once they’re done. The other type of users are using the system to mirror sites or periodically update a site.

    On a separate note, the issue with large XML files is that often parsers need to load the whole file and model it as an object internally. One solution to that problem is to use a SAX style parser instead.
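To illustrate the SAX-style approach, here is a small sketch using Python's standard `xml.sax` module. The sample document is a made-up, stripped-down stand-in for an export file (a real WXR file has many more elements and namespaces), and the handler class name is an assumption for the example. The parser fires a callback per element as it reads, so only the current title is held in memory rather than the whole document tree.

```python
import xml.sax


class ItemTitleHandler(xml.sax.ContentHandler):
    """Collect the <title> of every <item> without building a document tree."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_item = False
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
        elif name == "title" and self._in_item:
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text can arrive in several chunks, so buffer it until the tag closes.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._in_title:
            self.titles.append("".join(self._buffer))
            self._in_title = False
        elif name == "item":
            self._in_item = False


# Simplified stand-in for an export file; the channel title must be skipped.
SAMPLE = b"""<rss><channel><title>Example Site</title>
<item><title>First post</title></item>
<item><title>Second post</title></item>
</channel></rss>"""

handler = ItemTitleHandler()
xml.sax.parseString(SAMPLE, handler)
print(handler.titles)  # ['First post', 'Second post']
```

For a file on disk, `xml.sax.parse(path, handler)` reads the input incrementally with the same handler.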

  18. I’ve run into some weird problem in my last import.

    I was doing a full export/import including the image download. What happened was that when importing images the plugin entered an infinite loop, and the same “package” of images attached to the first post being imported was duplicated hundreds of times.

    I haven’t managed to discover what the problem was and ended up doing the operation by hand. Thankfully, I had an older “export” backup that had most of the posts.

  19. @Daryl – all importers need more love….

    As for the WP importer, IMO there should be two variants: the current one, to be used for exporting from a local PC, and an API-based one for big sites. You just can’t import a big site to a shared host because of file upload limitations, memory limitations and execution time limitations, which usually are not under the control of the site owner.

    WP cron is not the right tool in any case, because the site owner will want to see the import progress and check the site when it is finished. This needs to be done with an Ajax request that triggers the next batch of the import when the previous one finishes.

  20. @Frederick Ding – I read your proposal, but there is no place to comment there, so it comes here…

    You are crossing the line between export and backup. Export is about content not site functionality. In my experience in most cases when you move a big site it involves redesign of the site, so worrying about keeping the old theme/plugins/settings is kind of pointless for those people.
    And then there is the possibility of partial content export and import, like if you have an author on your site and at some point he wants to migrate to another site which is not his own. Here he can’t force his old look on the new site.

    As for the usage of XML-RPC, it is on in 3.5 and you need to use a plugin to turn it off.

    In any case, I think that migration needs much more love than the importer. If you can add migration options like changing “local” links to the new URL scheme, that will be cool by itself even without fixing the other importer issues.
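As a rough sketch of what changing “local” links could look like, here is a small Python function that rewrites `src` and `href` attributes pointing at the old site. The function name and the example domains are hypothetical, and a regular expression over post content is only an approximation; it will not catch URLs stored in serialized settings or custom fields.

```python
import re


def rewrite_local_urls(html, old_base, new_base):
    """Point src/href attributes at new_base when they start with old_base."""
    old = re.escape(old_base.rstrip("/"))
    new = new_base.rstrip("/")
    # Group 1 keeps the attribute name and opening quote; the lookahead makes
    # sure the host really ends here (so old.example.evil is not matched).
    pattern = re.compile(r'((?:src|href)=["\'])' + old + r'(?=[/"\'])')
    return pattern.sub(lambda match: match.group(1) + new, html)


if __name__ == "__main__":
    content = (
        '<a href="http://old.example/hello-world/">'
        '<img src="http://old.example/wp-content/uploads/cat.png"></a> '
        '<a href="http://other.example/">elsewhere</a>'
    )
    print(rewrite_local_urls(content, "http://old.example", "http://new.example"))
```

Links to other sites are left untouched, which is the point of restricting the rewrite to the old base URL.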

  21. I have been using the import function for theme testing content. Two issues that I came across were author descriptions were not exported and custom menu URLs did not change. The home link would be a classic example.

  22. This issue has been ongoing for a long time. It seems to either fail completely, only import media or only import posts depending on the export. As the images are not attached to the correct posts, it causes the WordPress SEO plugin to be unable to redirect visitors to the attached page as there is no attached page. The images inserted into the posts and pages are directly linked from the source and the media is not correctly imported and attached as it should be.
    I fail to understand why WordPress would state that it works with 3.6 when almost all of the support tickets are stating that media import is failing.
    Surely a bug report has been opened? Could WP be wary of removing the plugin because it forms part of WordPress and removal would cause a lot of issues?

  23. Just used the importer again to move a website from one server to another. The problem has been the same for years: users, pages, posts, media – everything moves over smoothly, BUT the pictures are not showing up on posts and pages even though the media library has the correct attachment information.

  24. ive attempted to export/import the XML file repeatedly with NO success. initially tried it thru godaddy – page kept timing out. repeated attempts never achieved anything. then i went to hostgator – no success there. neither of these two will “service” wordpress issues, so I had to cancel both.

    I was then prompted to head over to the forum section of wordpress.org & look for resolutions to my error messages – “failed to import media” & “media post already exists”.

    the closest i came was being instructed to use a file splitter – this i located & repeatedly attempted to split my 15 meg file into multiple files (as far as down to .5 meg files.) & NOTHING.

    I am presently attempting the migration with dreamhost but have already encountered an error message: Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 71 bytes) in /home/ariram5/rantsthoughtsmerde.com/wp-includes/plugin.php on line 192

    dreamhost today advised I “need to setup a custom php.ini or phprc and increase the php memory limit.” but when i go to their link on “how to” it clearly states that its an advanced action. I am not advanced.

    I am at my wits end with this. I was told to see about getting an SQL Dump file from wordpress.com but they wont even reply to my request on this file.

    any resolutions y’all could recommend would be greatly appreciated.

    Native NYker

    (PS: sorry about the previous message & the wrong error message included – my control C function didnt work…)
