1. Brian

    This is an excellent point and an issue that is pretty pervasive, both for hosted and self-hosted blogs.

    I don’t know what the answer is. I think it’s unlikely that people will back up their audio / video. But archive.org is pretty incredible, and I also like that a project BDFL like Matt has acquired some WordPress properties for the purpose of keeping them online as well.

    I know Siobhan McKeown has struggled with this issue quite a bit while she’s been working on her book. She’d be a really good person to ask about it.


    • Jeff Chandler

      Which reminds me, gotta get her on the show. At least the WP Daily archives were saved through TorqueMag but you’re right, manual archiving for audio and video just doesn’t cut it. And yeah, Weblogtoolscollection.com is a property Matt acquired with at least one of the primary purposes being to archive the site.


  2. Ryan Hellyer

    I think it is an extremely important thing to do. There’s a lot of useful information and history in various WordPress sites, and letting them disappear permanently would be very sad.

    Speaking of old content, it would be nice if this page reappeared, even if just as a read-only archive … https://wptavern.com/forum


    • Jeff Chandler

      The irony is not lost on me, knowing that page is missing from the web as I wrote this post. I can always count on you to bring this stuff up :). I have the content archived, just not anywhere that is available to the public. I want to bring it back somehow, either as part of the site again or at the very least, a public archive in read-only mode.


      • Ryan Hellyer

        I’ve done a similar thing myself. I have an old forum which I accidentally let drop offline (I let the domain expire), but I still have the old database stashed away for the rainy day when I can be bothered turfing it back online for historical purposes :)


  3. Tom

    Do you know what happened with WPCandy? There have been no new posts for a year. A few months ago I asked one of their editors on Twitter and she told me they would come back soon.


  4. Deborah Edwards-Onoro

    Another option is for podcasters to post transcripts for their shows. Those transcripts would be indexed and archived. Same thing is true for video. Add the transcript and the content will be archived.


  5. David Peralty

    Yeah, I’d love to have an archive of all the podcasts I’ve been on… from the WordPress Podcast with Charles Stricklin, the TechCanuck Podcast with James Cogan, PerfCast with Jeff, and of course WP Weekly… I think the ones with Charles would be the hardest to find/get at this point.


  6. Siobhan

    Archive.org is fantastic for text – I’ve used it extensively in my research. Unfortunately podcasts don’t archive so well :( I’ve been trying to get this podcast: https://web.archive.org/web/20091210030233/http://bitwiremedia.com/wordcast/wordcast-special-edition-live-may-12th-at-6pm-eastern/ but having no luck. I’ve been in touch with one of the publishers of the podcast and he doesn’t have it any more.

    Here’s a good one that I did find via archive.org though: https://web.archive.org/web/20080427183149/http://www.revolutionizeyourblog.com/askthanks.php


  7. Alex O'Brien

    I started podcasting just two months ago, using the Internet Archive to store and host my 100% Royalty-Free “Eclectic Music” Podcast, Amateur Zen. I learned from some web how-tos (I’ll try to find and credit those sources) that one way to podcast for zero cost (as in beer) is to use the ‘Internet Archive – Feedburner – WordPress’ triumvirate. I’ve been wholly satisfied with this method, and my handful of listeners have been too. Although the learning curve is steep-ish, submitting audio content to the archive is extremely easy and reliable. I’ve come across plenty of podcasts hosted there in their comprehensive 100+ “episode” glory (Note: Creating playlists of episodic ‘casts in sequence MAY entice enforcement of payment of fees to ‘Pro Audio’).

    The issue tackled in this article truly doesn’t compute with me. As long as podcasters are independent creators, they can’t expect a free automated platform to pick up where their own laziness or lack of spare time leave off. WordPress is certainly not that platform. The various podcast plug-ins don’t bother to help producers get their content submitted to archive.org either. What do they do? I’m getting snarky so..’nuff for now.

    If a podcaster intends for his/her works to be archived, that is their own responsibility – and hopefully a responsibility shared with one’s eager-to-contribute audience. If podcasters who are unable to do this make up a significant portion of the WP user-base then I’d work on a solution prioritizing the creation of a simple automation script by WP: a script automating the upload of podcasts containing a given tag (‘archive’, perhaps?) to archive.org under an auto-generated account there. Perhaps we put a limit of X gigabytes on this process, after which a podcaster must go and actually interact with archive.org manually to prove his/her interest/sentience. I’m no developer, so I’ll stop there.

    Suffice it to say, this is a solution in search of a problem.


  8. Justin Kerk

    Great post, and I love anything that raises awareness about the free resources provided by archive.org. Some of the information is inaccurate though: Wayback Machine crawls do include MP3s and any other file types, provided they are served up by normal HTTP or HTTPS download links and not hidden behind Flash players or MediaFire-type download sites. As with any Wayback Machine content, it can be hit and miss as to whether any particular link gets archived, although they have been improving over time. I highly recommend just using the archive.org file area to host the MP3s in the first place – free bandwidth!

    Submitting your site to be archived is another very useful feature; however, it will only archive the specific URL you give it and does not actually trigger a crawl of the entire site. The six months blurb you quote was in reference to the previous situation where URLs archived by Wayback would not go live until months later, when the indexing and such had been completed. This is no longer the case, and archived URLs are now available within seconds.

    If you have a full site that needs crawling you can bring it to the attention of the Archive Team (http://archiveteam.org/) and we have tools that can make that happen and import the results into Wayback.
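
Editor's sketch, not part of the original comments: since a Wayback submission only captures the one URL it is given, a podcaster who wants several pages or MP3 links captured would need one submission per URL. The snippet below only builds those request URLs and makes no network calls. It assumes the Wayback Machine's "Save Page Now" address form (`https://web.archive.org/save/` followed by the target URL); the function name and the `example.com` addresses are hypothetical.

```python
# Assumed prefix for Wayback Machine "Save Page Now" requests.
SAVE_PREFIX = "https://web.archive.org/save/"


def save_page_now_urls(page_urls):
    """Return one save-request URL per distinct, non-empty input URL.

    Hypothetical helper for illustration only: it builds the URLs
    and leaves the actual requests (and any rate limiting) to the caller.
    """
    seen = set()
    requests = []
    for url in page_urls:
        url = url.strip()
        # Skip blank entries and URLs that were already queued.
        if not url or url in seen:
            continue
        seen.add(url)
        requests.append(SAVE_PREFIX + url)
    return requests


if __name__ == "__main__":
    # Hypothetical episode page and its directly linked MP3.
    for request_url in save_page_now_urls([
        "https://example.com/podcast/episode-1/",
        "https://example.com/audio/episode-1.mp3",
    ]):
        print(request_url)
```

Listing each episode page and each direct MP3 link separately reflects the point above that only the specific URL submitted is archived.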


Comments are closed.
