Podcast feeds for LibriVox

Yesterday I interviewed Hugh McGuire about LibriVox for next week’s ITConversations podcast. In the course of our conversation I was reminded that LibriVox catalog pages — like this one for White Fang — include MP3s and Oggs for individual chapters, plus a zip file containing the whole book, but not an RSS feed suitable for automatic downloading into a podcatcher. And as Hugh and I discussed, a typical reaction to the zip file is: “Now what?”

So I’ve written a little script to produce RSS feeds. It seems useful to me, and I hope it’ll prove useful to the LibriVox community, but before I release it I’d like to check my assumptions.

Here are three sample feeds.

William Blake, Songs of Innocence and Experience

Arthur Conan Doyle, The Hound of the Baskervilles

Jack London, White Fang

I mostly use a Creative MuVo, and sometimes an iPod, so these are the two scenarios I’ve tested. For my purposes, the requirements in both cases are:

  • The files display and play in order in iTunes and Windows Media Player¹
  • The files display and play in order on the player
  • Both the name and index of each file are easily legible on the player

The flash-memory-based MuVo seems to need the filenames shortened to 28 characters. And as I realized on a long bike ride last summer, when a book was playing out of order, it also seems to want the generated index numbers before, rather than after, the filenames. So I think the format should be:

01_hound-of-the-baskerv.mp3
02_hound-of-the-baskerv.mp3
...
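Here's a rough Python sketch of that naming scheme. The function name, the `slug` argument, and the assumption that the 28-character limit applies to the whole filename (index, underscore, and extension included) are mine for illustration; the exact truncation rule the MuVo wants is something I'm still guessing at, so the cut point here may differ from the examples above.

```python
def short_names(slug, count, max_len=28, ext=".mp3"):
    """Return index-prefixed filenames no longer than max_len characters.

    Assumption: max_len covers the entire name, including the numeric
    prefix, the underscore, and the extension.
    """
    # Zero-pad the index so that a plain alphabetical sort is also play order.
    width = max(2, len(str(count)))
    # Characters left for the title once the prefix and extension are counted.
    stem_budget = max_len - (width + 1) - len(ext)
    if stem_budget < 1:
        raise ValueError("max_len is too small for the index and extension")
    stem = slug[:stem_budget]
    return [f"{i:0{width}d}_{stem}{ext}" for i in range(1, count + 1)]


if __name__ == "__main__":
    for name in short_names("hound-of-the-baskervilles", 15)[:2]:
        print(name)
```

Putting the index first means even a player that only sorts by filename should get the chapters in the right order.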

I’m assuming this format will work for other flash-memory-based non-iPod players out there, but that’s something I’d like to check. If you have one of those players, I’d be curious to know how it handles these feeds.

For the iPod and iTunes, a different hack was required. A podcast feed is not the natural format for a multi-chapter book. The software expects to display and play items in reverse chronological order. I thought that if everything had the same pubDate the secondary sort would ascend by name, but that didn’t seem to work. So in these feeds, the (arbitrary) pubDate decrements by seconds as the index counter increments. You wind up with a format like this:

file: 01_hound-of-the-baskerv.mp3 pubdate: Sat, 14 Apr 2007 05:00:15 -0000
file: 02_hound-of-the-baskerv.mp3 pubdate: Sat, 14 Apr 2007 05:00:14 -0000
...
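A minimal Python sketch of the decrementing pubDate trick follows. The function name and the base time are assumptions chosen for illustration (the date itself is arbitrary); the only point is that the first chapter gets the latest timestamp, so a reverse-chronological sort shows chapter 1 first.

```python
from datetime import datetime, timedelta
from email.utils import format_datetime


def descending_pubdates(count, base):
    """Return RFC 2822 date strings, one per item, newest first.

    Item 1 gets base + count seconds, item 2 gets one second less, and so on.
    `base` is a naive datetime, which format_datetime renders with "-0000".
    """
    return [
        format_datetime(base + timedelta(seconds=count - i))
        for i in range(count)
    ]


if __name__ == "__main__":
    # Hypothetical base time; any fixed moment would do.
    base = datetime(2007, 4, 14, 5, 0, 0)
    dates = descending_pubdates(15, base)
    for index, pubdate in enumerate(dates[:2], start=1):
        print(f"item {index:02d} pubdate: {pubdate}")
```

Each value would go into the corresponding item's pubDate element in the feed.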

Kinda hokey, but it seems to work for me. See what you think. I’d like to be able to give something back to LibriVox. I haven’t gotten around to recording any chapters, but maybe this will help the cause.


1 Forgot WMP’s not a podcatcher yet, alas.

17 Comments

  1. As long as you’re altering the stated pubDate for ordering purposes, you could also offer the ability to alter the actual date a chapter is added to the feed, so listeners could easily subscribe and receive a chapter per day. I’ve done episodic RSS feeds in the past by adding the current date as a parameter to the RSS feed link, and then creating dynamic feeds with items based on an offset from this date.

  2. jon this looks great… tested all three with success in iTunes.

    a couple of notes:

    -the id3 tags of the mp3s seem to get changed in your feed compared with downloading straight…i suspect this is because of how iTunes treats podcasts

    -probably in our case the preferred feed format would be itpc:// rather than http://…in safari & firefox, (not sure about IE). then this would be a one-click-direct import into itunes. I have some problems with itunes/ipod (their closed podcast formats), but it’s definitely the de facto tool for podcasts, and ipods are still the de facto portable player… and the objective here (or at least for librivox) is to help people get stuff into their players, without making them worry about concepts they don’t get (eg, “what do I do with this strange xml page?”)

  3. “you could also offer the ability to alter the actual date a chapter is added to the feed, so listeners could easily subscribe and receive a chapter per day.”

    Could do. I’m not sure that’s wanted, though. My guess is that most people will want to have all the chapters available at one go. It’d be interesting to hear from someone who does prefer the episodic experience, though.

  4. “the ability to alter the actual date a chapter is added to the feed”

    the good people over at http://podibooks.com offer you the choice of how often you get your chapters delivered – a customizable audiobook podcast schedule – and we have some books on their system. but i think the preference will likely be to get it all onto your own machine, and then decide when you wish to listen.

  5. Over in the LibriVox forum, kayray raises a key point:

    “I wonder if people who don’t know what to do with a zip will know what to do with a feed…”

    (http://librivox.org/forum/viewtopic.php?p=115204)

    I replied there but will echo it here because it’s of general interest. iTunes does not make its podcatching feature easy to discover outside the context of the iTunes Music Store.

    I whipped up this little screencast to show how you use that “advanced” feature:

    http://jonudell.net/movies/librivox.mov

    I bet there are a whole lot of folks who might want to make use of that feature but don’t know that it exists.

    “probably in our case the preferred feed format would be itpc:// rather than http://…in safari & firefox, (not sure about IE).”

    You are certainly right about that, Hugh. I just checked and IE works as well. Interesting point: I just tried to insert those versions of the feed URLs into this entry, and WordPress changed itpc back to http!

    You could of course also let FeedBurner handle the feeds, if you don’t mind taking a dependency on an external service. Though if you want to do a feed per book that’d be a lot of registration overhead.

    Incidentally I just checked my pre-ITConversations podcast feed:

    http://feeds.feedburner.com/JonUdellFridayPodcasts

    As I’d forgotten, it uses pcast:// rather than itpc:// to invoke iTunes. What a tangled web we weave.

  7. hmm just thinking more. ideally this would work as a scraper, workflow is:
    -scrapes the catalog page (or looks at the dir in internet archive)
    -pick out all files ending 64kb.mp3
    -generates a podcasty xml using all those 64kb.mp3 files (ignoring ogg and 128kb.mp3 files)
    -orders by date as you’ve done
    -then publishes a link using itpc:// on the same catalog page.

  8. LibriVox now provides a podcast feed, but it is only 64kbps… Just for kicks, here’s a Yahoo Pipe! that builds a podcast feed for the higher quality mp3 source.
