Calibre is a widely used ebook library tool that can convert ebooks to and from just about any format and help you load those ebooks onto just about any device. One of its lesser-known features is the ability to act as a content server, either on a local network or on the web.
I've tested using calibre as an online content server a few times, but I never left it running because I didn't want to find out the hard way that a hacker could exploit calibre and gain access to the rest of the files on my system. But according to one computer security expert, around 1,800 other people aren't as cautious.
I love reading, and I especially like my e-readers. They allow you to carry and travel with hundreds of books. Calibre is an open source e-book management application, and probably one of the most popular. It's capable of running a server to allow remote users to browse and download books. Knowing this and being a pentester by trade, I became quite curious if there was any notable presence of Calibre on the internet. In its default configuration, Calibre does not require any authentication to access the web interface. Using Shodan.io, we can search for the keyword Calibre in the server HTTP header.
They did indeed find calibre content servers running on the open web; according to the results page on Shodan, there are around 1,863 instances running right now. (I say "around" because some results are false positives.)
We can see all sorts of interesting details from the Shodan results. For example, the IP addresses for these content servers are concentrated in the biggest ebook markets (US, UK, Germany, France), and the most commonly used OSes were Windows 7, Windows 8, and a really old version of Windows 7.
Or so the results suggest; most of the data is behind a paywall.
One thing I did notice is that a number of the calibre content servers that I could access were completely open. They didn't require a password or any other type of authentication. (Other content servers did require you to know the password.) A number of the unprotected content servers had copies of copyrighted ebooks, making them de facto pirate servers (essentially mini Internet Archives).
To be clear, these open content servers are not easy to find - you have to know a little about the web and know about Shodan or you may never find them. But once you learn those two nuggets of info you will have access to a large and illicit library of ebooks.
My source noted that they could identify over 10,000 titles from the server manifests alone:
Of the original 1,800 or so servers from Shodan, we were able to download the manifest file from 225 Calibre servers. Note this doesn't include unauthenticated servers which don't offer the manifest file. I didn't write a crawler to parse individual titles and request potentially 100s of pages from a single host.
From the 225 Calibre servers, I was able to identify about 10,000 unique titles. Some interesting observations:
- Ironically, there's a number of "cybersecurity" titles.
- I tried searching through the titles for sensitive documents such as "receipt" or "invoice" or "tax". Nothing.
- Unsurprisingly given world demographics, a large number of titles are not English. This might have hamstrung my manual analysis.
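My source didn't say how they de-duplicated the titles or ran the keyword search, so the following is only a minimal sketch of how that kind of tally could be done. It assumes the titles have already been pulled out of each manifest into plain lists of strings (the manifest format itself isn't described here), and the function names are my own, not anything from calibre.

```python
import re


def normalize_title(title):
    """Lower-case a title and strip punctuation and extra whitespace."""
    # Replace anything that is not a letter, digit, underscore, or space.
    cleaned = re.sub(r"[^\w\s]", " ", title.casefold())
    return " ".join(cleaned.split())


def unique_titles(title_lists):
    """Return the set of normalized titles seen across all servers.

    `title_lists` is an assumed input shape: one list of title strings
    per server.
    """
    seen = set()
    for titles in title_lists:
        for title in titles:
            key = normalize_title(title)
            if key:  # skip titles that were empty or pure punctuation
                seen.add(key)
    return seen


def keyword_hits(titles, keywords=("receipt", "invoice", "tax")):
    """Return the sorted titles that contain any keyword as a whole word."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, keywords)) + r")\b")
    return sorted(t for t in titles if pattern.search(t))
```

Matching on whole words keeps a title like "Syntax Primer" from counting as a "tax" document, at the cost of missing plurals such as "taxes". The normalization is also naive: two editions or translations of the same book would still count as separate titles.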
If you are running a calibre content server, you might want to check the security just in case.
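For anyone who wants a quick way to run that check against their own server, here is a minimal sketch using only the Python standard library. It makes one anonymous request and reports whether the server refused it. I'm assuming that a password-protected server answers an anonymous request with HTTP 401 (or 403); the function name and the example address are mine, so substitute the address of your own server.

```python
import urllib.error
import urllib.request


def requires_auth(url, timeout=5.0):
    """Return True if an anonymous GET to `url` is refused with 401 or 403.

    Assumption: a protected server rejects requests that carry no
    credentials with one of those two status codes.
    """
    # Talk to the server directly, ignoring any system proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            # The server answered without asking for a password.
            return False
    except urllib.error.HTTPError as error:
        if error.code in (401, 403):
            return True
        raise  # some other HTTP error; let the caller see it
```

Calling `requires_auth("http://localhost:8080/")` on the machine running the server (8080 is an example port) should return `True` if a password is set. A result of `False` means anyone who can reach that address can browse the library. This only tests for a password prompt; it says nothing about whether the server is reachable from outside your network.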