Scribd has had a bad reputation as a piracy haven since its early days as a document hosting site, and to combat piracy it has adopted an automated system that checks user uploads for pirated content.
Unfortunately, that system, like a similar system at YouTube, works a little too well. There are several reports over on the Smashwords blog from Scribd users whose documents were removed from the site by Scribd’s automated system.
When Smashwords signed up to distribute ebooks to Scribd’s ebook subscription service last fall, one of the side effects of the deal was that ebooks from Smashwords were treated by Scribd’s system as original sources.
This has led to more than a few problems. Whenever an author quotes a court document, public domain work, or other legitimately copyright-free document in their book, Scribd logs the quoted text as belonging to the author, and its automated system flags and removes any user-uploaded documents that contain the same text.
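To make the failure mode concrete, here is a minimal sketch of how a text-matching filter could produce this kind of false positive. Scribd has not published how its system works, so everything here is an assumption for illustration: the word-shingle approach, the function names, the shingle size, and the 50% threshold are all hypothetical.

```python
import re

SHINGLE_SIZE = 5  # words per shingle; an arbitrary illustrative choice


def shingles(text, size=SHINGLE_SIZE):
    """Return the set of overlapping word n-grams found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_ratio(upload, trusted):
    """Fraction of the upload's shingles that also appear in the trusted work."""
    upload_shingles = shingles(upload)
    if not upload_shingles:
        return 0.0
    return len(upload_shingles & shingles(trusted)) / len(upload_shingles)


def is_flagged(upload, trusted_works, threshold=0.5):
    """Flag an upload if enough of it matches any work treated as an original source."""
    return any(overlap_ratio(upload, work) >= threshold for work in trusted_works)


# A public domain text (the opening of the Gettysburg Address).
PUBLIC_DOMAIN = (
    "Four score and seven years ago our fathers brought forth on this "
    "continent, a new nation, conceived in Liberty, and dedicated to the "
    "proposition that all men are created equal."
)

# A hypothetical ebook that quotes the public domain text in full.
BOOK = (
    "In my view the most famous opening in American oratory is this one. "
    + PUBLIC_DOMAIN
    + " I return to it often."
)

# The filter treats the ebook as the original source, so a user who uploads
# only the public domain text is flagged for copying the ebook.
flagged = is_flagged(PUBLIC_DOMAIN, [BOOK])
```

The filter in this sketch has no notion of who wrote the matching passage first or whether anyone holds a copyright on it. It only sees that the upload overlaps with a work it trusts, which is the same blind spot the complaints describe.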
Several complaints have surfaced on the Smashwords blog over the past couple of weeks. For example:
I have a slightly different purview on this whole subject. Many of my document posts on Scribd are 1)historical documents, long out of copyright protection, 2) government or legal documents, not protected by copyright laws, 3) public domain documents in which the author has granted free copy rights to all. I find many times these documents will get taken down by Scribd, and I suspect that is because some author has included quotes from the documents within their copyrighted books. So, even in the world of copyright protection, there are improvements that must be considered. Not everything is black and white.
This type of criticism should sound familiar to anyone who follows copyright news, and in particular news about YouTube, whose ContentID system is often the focal point of complaints from both media companies and uploaders.
ContentID has had its own share of mistakenly removed videos, and in fact this kind of error gets news coverage about once every other month. The most recent one to cross my news feeds was a report about a video of US Congressional committee hearings being removed as a result of copyright claims by Telemundo and Univision. And only a few months before that, YouTube started taking down user-uploaded gameplay videos en masse, even though the videos were arguably fair use and were often encouraged by the game developers.
And those are just the most recent cases; I have been reading about similar issues for over four years.
Today’s news about Scribd has only reinforced this blogger’s negative opinion of automated systems like ContentID. They make it far too easy for legit content to be blithely removed without involving even a single person in the decision.