How to Get ALL of Your Data Out of Google Reader

How would you like to download a copy of all of the posts from all the feeds you used to follow in Google Reader? As crazy as that may sound, it's not outside the realm of possibility.

If you're reading this post then you are probably like me, mourning the imminent Readerpocalypse. By this point I would bet that you have already used Google Takeout to export your data and moved to another service (I am going with Feedly for the time being), and many of you have probably minimized the Google Reader tab for the final time.

As useful as Google Takeout may have been, it didn't actually get all of the available data. It turns out that Google Reader archived a copy of every post from every feed followed by a user, and in some cases GR has archived a feed going as far back as when it launched in 2005.

And now you can download that data. It's going to take a little work and a little tech savvy to get the rest of your data, but it is still possible. A former Google Reader developer by the name of Mihai Parparita has written a few Python scripts that will download everything from your Google Reader account:

Reader has Takeout support, but it's incomplete. I've therefore built the reader_archive tool that dumps everything related to your account in Reader via the "API". This means every read item, every tagged item, every comment, every like, every bundle, etc. There's also a companion site that explains how to use the tool, provides pointers to the archive format and collects related tools.

Additionally, Reader is for better or worse the site of record for public feed content on the internet. Beyond my 545 subscriptions, there are millions of feeds whose histories are best preserved in Reader. Thankfully, ArchiveTeam has stepped up. I've also provided a feed_archive tool that lets you dump Reader's full history for feeds for your own use.

Here's what is meant by everything:

  • All your read items
  • All your starred items
  • All your tagged items
  • All your shared items
  • All the shared items from the people you were following
  • All the comments on shared items
  • All your liked items
  • All items you've kept unread, emailed, read on your phone, clicked on or otherwise interacted with
  • All items that have appeared in one of your subscriptions
  • All items that were recommended to you
  • All items in the (English) "Explore" section
  • All the profiles of the people you were following before the sharepocalypse
  • All your preferences

We're talking about GBs of data here, but given the size of the average hard disk these days that isn't such a big deal. Downloading could be an issue, though.

If you're wondering why this would be useful, please allow this level 42 digital pack rat to explain.

Not only will this enable you to save a copy just in case you might need it again, it's also an insurance policy against the demise of a useful blog. And it will give you a clean copy of the posts from blogs that don't bother to maintain an archive. You'd be surprised how often I have looked for posts from as recently as 2010 only to find that some major gadget blogs don't bother to keep an archive.

And to top it off, one of the Python scripts will enable you to browse the content you downloaded. I'm not sure how well it works yet, but it has to be better than trying to search manually.
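For a sense of what searching manually would look like, here is a rough Python sketch of my own. It is not one of Mihai's scripts, and it assumes nothing about the real archive format beyond "a folder full of text-based files"; the function name `search_archive` and the folder name in the usage note are made up for illustration.

```python
import os


def search_archive(root, keyword, max_hits=20):
    """Return paths of files under `root` that mention `keyword`.

    Assumption: the archive is a directory tree of text-based files.
    This is a plain, case-insensitive substring search, nothing smarter.
    """
    needle = keyword.lower()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # Read line by line so large files are never loaded whole.
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    found = any(needle in line.lower() for line in handle)
            except OSError:
                continue  # unreadable file, skip it
            if found:
                hits.append(path)
                if len(hits) >= max_hits:
                    return hits
    return hits
```

You would call it with something like `search_archive("my_reader_archive", "kindle")`, where the folder is wherever you saved the download. It only tells you which files mention the word, and it will miss text that is stored in an escaped or encoded form, so a proper browsing tool should still beat it.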

For more detail, and to download the scripts, check out the website Reader is Dead.


image by WordRidden

About Nate Hoffelder (11473 Articles)
Nate Hoffelder is the founder and editor of The Digital Reader: "I've been into reading ebooks since forever, but I only got my first ereader in July 2007. Everything quickly spiraled out of control from there. Before I started this blog in January 2010 I covered ebooks, ebook readers, and digital publishing for about 2 years as a part of MobileRead Forums. It's a great community, and being a member is a joy. But I thought I could make something out of how I covered the news for MobileRead, so I started this blog."
