Scribd, Piracy, and Why You Can’t Always Believe What You Read Online
When Scribd launched their ebook subscription service last fall, many in the book community faulted Scribd for their history of piracy issues, and it seems that Scribd still hasn’t managed to escape that specter.
Rich Meyer, writing over at Indies Unlimited, is advising authors to pull their Smashwords titles from Scribd because users might pirate them. (The fact that Smashwords sells the same titles as easily piratable DRM-free ebooks seems to have escaped him.)
In a post that mixes equal parts factual errors, fear, and a misunderstanding of the tech involved, Rich writes:
I really had no problem with the service, as long as it was limited to the iPhones and Google Android stuff. I don’t have a smartphone myself, and don’t care to get one. People aren’t going to go out of their way to monkey with that sort of stuff, so they’d be using it and poof, book gone. But Scribd now has the ability to be used on a Kindle Fire, and that’s a whole different kettle of fish when it comes to illegal access and piracy.
The problem with Scribd’s view on piracy is that they are working on the WRONG bloody end of the stick. Pirates aren’t going to be UPLOADING books – they’re going to be DOWNLOADING CONTENT. I just can’t believe they’ve not figured this out: Any subscription service for e-books is basically a smorgasbord for piracy. There is absolutely nothing I can see that stops a person from just downloading books wantonly and copying them to another source (laptop, memory card, thumbdrive) and then cracking the pointless DRM on them and having a field day with them. Believe me, it’s very, very easy, and once you know how, you can set up your system to do it automatically – all you have to do is drop your book into a folder and *POOF* it’s as free as a bird. And if they didn’t bother to even open the books before they return them, the author gets ZILCH.
As Juli Monroe pointed out to me this morning, that bolded section is simply not true. I had linked to Rich’s post in the morning coffee post, and Juli called me on it because she didn’t think the section quoted above was accurate.
She had already looked for the ebooks she had downloaded from Scribd, and she couldn’t find them (not as ebooks, per se). She’s not interested in pirating the ebooks, obviously, but she was curious about the technical details and went looking. If she can’t find a recognizable ebook file, do you really think it will be easy for the average user to strip the DRM?
I don’t think so, and so far as I know there’s no easy DRM-stripping tool for Scribd. But just to sate my curiosity I looked into the matter myself. I didn’t have a Scribd account at the time, so I took advantage of the free trial. After I downloaded a few ebooks, I went looking and eventually found what I think is the correct folder.
Juli and I both think that Scribd stores their ebooks in a folder called documents_cache. We came to this conclusion independently, and if that is where Scribd puts the ebooks then I seriously doubt the average user will be able to strip the DRM.
The ebooks aren’t stored as ebooks; instead they are stored as collections of JSON, CSS, and image files. And while I can’t speak for the JSON files, the image files have DRM of some kind.
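If you want to repeat this check on your own device, here is a minimal sketch in Python that tallies the file types in a folder. It only counts files by extension; it does not open, decode, or convert anything. The function name summarize_cache is my own invention for illustration, and the folder path is an assumption: point it at wherever you think the documents_cache folder lives on your system.

```python
from collections import Counter
from pathlib import Path


def summarize_cache(folder):
    """Count the files under `folder` (recursively) by lowercase extension.

    `folder` is whatever path you believe holds the cached books; the
    location of documents_cache on any given device is an assumption.
    """
    counts = Counter()
    for path in Path(folder).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under a placeholder label.
            counts[path.suffix.lower() or "(no extension)"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical relative path; replace with the real location on your device.
    for ext, n in summarize_cache("documents_cache").most_common():
        print(f"{ext}\t{n}")
```

If the folder really does hold Scribd's downloads, I would expect the tally to be dominated by JSON, CSS, and image extensions rather than anything like .epub or .pdf.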
I wouldn’t know where to start to convert this format – but I can guess. Scribd’s ebook format may or may not have something to do with the HPub standard. That is one of the lesser-known ebook file formats, and it can be used for sending rich format ebooks over the web. The spec mentions JSON files, but it also mentions certain requirements not met by the Scribd pseudo-ebook format.
I won’t go into the full details here; if you’re interested you can check for yourself. But the short version is that I wouldn’t worry too much about someone stripping the DRM from a Scribd ebook; it’s going to take a real hacker to pull it off, not your average user.