How to Get ALL of Your Data Out of Google Reader

29.06.2013 Nate Hoffelder

How would you like to download a copy of all of the posts from all the feeds you used to follow in Google Reader?

As crazy as that may sound, it’s not outside the realm of possibility.

If you’re reading this post then you are probably like me, mourning the imminent Readerpocalypse. By this point I would bet that you have already used Google Takeout to export your data and move to another service (I am going to Feedly for the time being), and many of you have probably minimized the Google Reader tab for the final time.

As useful as Google Takeout may have been, it didn’t actually get all of the available data. It turns out that Google Reader archived a copy of every post from every feed followed by a user, and in some cases GR has archived a feed going as far back as when it launched in 2005.

And now you can download that data. It’s going to take a little work and a little tech savvy to get the rest of your data, but it is still possible. A former Google Reader developer by the name of Mihai Parparita has written a few Python scripts that will download everything from your Google Reader account:

Reader has Takeout support, but it’s incomplete. I’ve therefore built the reader_archive tool that dumps everything related to your account in Reader via the "API". This means every read item, every tagged item, every comment, every like, every bundle, etc. There’s also a companion site at readerisdead.com that explains how to use the tool, provides pointers to the archive format and collects related tools.

Additionally, Reader is for better or worse the site of record for public feed content on the internet. Beyond my 545 subscriptions, there are millions of feeds whose histories are best preserved in Reader. Thankfully, ArchiveTeam has stepped up. I’ve also provided a feed_archive tool that lets you dump Reader’s full history for feeds for your own use.

Here’s what is meant by everything:

All your read items
All your starred items
All your tagged items
All your shared items
All the shared items from the people you were following.
All the comments on shared items
All your liked items
All items you’ve kept unread, emailed, read on your phone, clicked on or otherwise interacted with.
All items that have appeared in one of your subscriptions
All items that were recommended to you
All items in the (English) "Explore" section
All the profiles of the people you were following before the sharepocalypse.
All your preferences.

We’re talking about GBs of data here, but given the size of the average hard disk these days that isn’t such a big deal. Downloading could be an issue, though.
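To get a rough feel for how long the download might take, here is a back-of-the-envelope sketch in Python. The archive size and connection speeds are made-up figures for illustration, not measurements, and the real bottleneck may well be how quickly Reader answers requests rather than your connection.

```python
def download_hours(size_gb: float, speed_mbps: float) -> float:
    """Estimate hours needed to transfer size_gb gigabytes at speed_mbps megabits per second."""
    if speed_mbps <= 0:
        raise ValueError("speed_mbps must be positive")
    megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB, 1 byte = 8 bits
    seconds = megabits / speed_mbps
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical figures: a 5 GB archive over three connection speeds.
    for speed in (2, 10, 50):
        print(f"5 GB at {speed} Mbps: about {download_hours(5, speed):.1f} hours")
```

This only counts raw transfer time at full line speed, so treat the result as a best case.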

If you’re wondering why this would be useful, please allow this level 42 digital pack rat to explain.

Not only will this enable you to save a copy just in case you might need it again, it’s also an insurance policy against the demise of a useful blog. And it will give you a clean copy of the posts from blogs that don’t bother to maintain an archive. You’d be surprised how often I have looked for posts from as recently as 2010 only to find that some major gadget blogs don’t bother to keep an archive.

And to top it off, one of the Python scripts will enable you to browse the content you downloaded. I’m not sure how well it works yet but it has to be better than trying to search manually.
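If you would rather search the downloaded files yourself, something like the following Python sketch might be a starting point. It is not one of the Reader is Dead scripts, and it assumes the data ends up as a folder of JSON files, which is my assumption and not something confirmed here (check the archive format notes on readerisdead.com). The function names `iter_strings` and `search_archive` are made up for this example. Because the exact layout is unknown, it simply looks at every text value in each file.

```python
import json
from pathlib import Path
from typing import Iterator, List


def iter_strings(node) -> Iterator[str]:
    """Yield every string value found anywhere inside a parsed JSON structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_strings(value)


def search_archive(root: str, keyword: str) -> List[str]:
    """Return sorted paths of .json files under root that mention keyword (case-insensitive)."""
    needle = keyword.lower()
    matches = []
    for path in Path(root).rglob("*.json"):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # skip unreadable or malformed files
        if any(needle in text.lower() for text in iter_strings(data)):
            matches.append(str(path))
    return sorted(matches)
```

Calling `search_archive("my_reader_archive", "kindle")` would return the files that mention the word; the folder name is a placeholder for wherever you saved the download.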

For more detail, and to download the scripts, check out the website Reader is Dead.
