The Biggest Plagiarism Scandal in the History of eBooks Slipped by Amazon Unnoticed

01.05.2019 Nate Hoffelder 10 Comments

When news broke a little over two months ago that the Brazilian ~~author~~ person Cristiane Serruya had copied parts of Courtney Milan’s book, The Duchess War, little did we know that this one example of plagiarism and copyright infringement would quickly snowball into what is now the biggest ebook news story of the year.

CopyPasteCris, as the scandal has been dubbed, now includes no fewer than 95 books by 43 authors as well as articles and other content from six websites (and two recipes). Numerous passages have been copied from those books and websites into one or more of Serruya’s published works.

Yes, ninety-five books. You can find a running tally of the affected works on Caffeinated Fae, and you can find commentary on Pajiba and SBTB. And if you can stomach it, go read the interview where Serruya pins the blame on her untrustworthy ghostwriters. I wouldn’t believe it, however. First, there is the sheer number of plagiarized passages. Second, a person who identified as one of Serruya’s ghostwriters left a comment on Courtney Milan’s blog disputing Serruya’s claims and adding one of her own, that she hadn’t been paid for her work:

I am a ghostwriter who worked with the person in question in 2017 and early 2018 on 2 books. I do not work on Fiverr but was contacted by her personally. I think I can provide insight to whether the ghostwriter was to blame, as she contends. Her work, when given to me, was a number of mishmashed scenes that needed “expanding”, as she said. I took for granted that these were her own words, and embellished as she requested, as this is how I work–I often help authors who are “too close” to their own book to get it in shape for publication. Now I can see that it’s very possible those were plagiarized scenes that she was hoping a ghostwriter would change enough to make unrecognizable. I did cut off ties with her after she gave me a sob story about her daughter being sick and told me she couldn’t pay me for work already done. I did not work on the above book, but just knowing the way she works, it seems much more possible to me that she cobbled scenes together via other people’s published works and gave them to a ghostwriter to smooth over…. than for a ghostwriter to be entirely responsible for this.

This story is bound to grow over time as more authors discover they too were plagiarized. A lawsuit has even been filed over CopyPasteCris, which comes as a surprise. Few authors have the money or the energy to go after Serruya, but Nora Roberts has both. She filed suit last week over six of her works that were copied by Serruya.

It is curious that Roberts is suing over only six books, because according to Claire Ryan, Serruya copied from nine of Roberts’s works.

Ryan is a fantasy author, but in her day job she is a senior web programmer. She started asking herself how she would build a system to detect the plagiarism.

The answer became the core of what eventually turned into the algorithm – a program that could find similar text between two ebooks, even if the text had been paraphrased or the names changed.

There were limitations. Too much paraphrasing meant it wouldn’t recognize similarity, and it would probably come back with complete nonsense sometimes. But it just might work.

So on that Wednesday night, I started to write some code. It was just a PHP script, nothing special, but I had a feeling that it would work pretty well. Then I found a copy of The Duchess War on Smashwords, and after a few tweets, one of my followers sent me a link to a copy of Royal Love.

I did the first run on those two books, and the results looked pretty good.
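Ryan's actual PHP script isn't reproduced here, but the general idea she describes can be sketched in a few lines. What follows is only a rough illustration in Python, not her algorithm: it assumes sentence-by-sentence comparison, masks any capitalized word in mid-sentence as a probable character name, and scores each pair with the standard library's `difflib.SequenceMatcher`. The function names, the 0.7 threshold, and the sample sentences are all made up for this example.

```python
import re
from difflib import SequenceMatcher

# Crude sentence splitter: break on whitespace that follows ., ! or ?
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")
WORD = re.compile(r"[A-Za-z']+")


def split_sentences(text):
    """Split text into sentences, dropping empty pieces."""
    return [s.strip() for s in SENTENCE_BREAK.split(text) if s.strip()]


def normalize(sentence):
    """Lowercase the words and mask likely names.

    Assumption: a capitalized word that is not the first word of the
    sentence (and is not "I") is a name, so swapping "Anna" for "Maria"
    does not hide a copied sentence.
    """
    tokens = []
    for position, word in enumerate(WORD.findall(sentence)):
        if position > 0 and word[0].isupper() and word != "I":
            tokens.append("<name>")
        else:
            tokens.append(word.lower())
    return tokens


def find_similar(text_a, text_b, threshold=0.7, min_words=5):
    """Return (score, sentence_a, sentence_b) for every pair of sentences
    whose word sequences are at least `threshold` similar.

    Every sentence is compared with every other, which is slow for whole
    novels; a real tool would need some kind of index.
    """
    sentences_b = [(s, normalize(s)) for s in split_sentences(text_b)]
    matches = []
    for sent_a in split_sentences(text_a):
        tokens_a = normalize(sent_a)
        if len(tokens_a) < min_words:
            continue  # very short sentences match by accident
        for sent_b, tokens_b in sentences_b:
            if len(tokens_b) < min_words:
                continue
            score = SequenceMatcher(None, tokens_a, tokens_b).ratio()
            if score >= threshold:
                matches.append((score, sent_a, sent_b))
    return sorted(matches, reverse=True)


# Invented sample text, not taken from any of the books involved.
original = ("Across the crowded library, Anna stared at the duke. "
            "The rain fell on the quiet town all night.")
suspect = ("Nobody in the village had expected a visitor so late. "
           "Across the crowded library, Maria stared at the prince.")

for score, a, b in find_similar(original, suspect):
    print(f"{score:.2f}  {a!r}  ~  {b!r}")
```

The limitations Ryan mentions show up here too: rewrite a sentence heavily enough and the score drops below the threshold, and a stock phrase that appears in many books can produce a false hit. That is why a human still has to double-check the results.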

Ryan went on to compare Serruya’s books and as many of the original copied books as she could get her hands on. Some were provided by the affected authors, while others were crowd-sourced from readers, and all were fed into Ryan’s algorithm.

While some of the plagiarism was spotted by readers and authors, much of the work to document the plagiarism was done by Ryan. She wrote the algorithm, she supplied the computer time to run it, and she double-checked the results.

Isn’t it funny how one programmer could find all this and Amazon did not?

I mean, Amazon sold all the ebooks in question, and it employs how many tens of thousands of software engineers, and yet they couldn’t find this massive example of plagiarism.

Yes, I know, Amazon sells millions of ebooks, but this kind of project doesn’t require infinite resources. It’s the kind of thing that startups like the late BookLamp can do with limited resources, so there’s no reason that Amazon, with its $11 billion in profit last fiscal year, couldn’t have found this issue if they were so inclined.

image by spiffie via Flickr

Comments

Jim Heskett May 1, 2019 at 2:03 pm

Dear God, I hope Amazon does not introduce an automated plagiarism check. Can you imagine how many books will be wrongly pulled down due to it?

Nate Hoffelder May 1, 2019 at 2:50 pm

There’s a difference between having a plagiarism checker and taking action on said checker. Yes, I agree Amazon would screw it up, but it is still possible to have the checker and a good policy for using it.

Richard Hershberger May 2, 2019 at 8:57 am

But is it possible to have a good policy that doesn’t involve a human who is paid enough to be empowered by Amazon to make a decision? The business model of the modern tech giant is to pay programmers to write algorithms, not customer service representatives to make judgment calls. The wackiness we see with random take-downs and the like is because the algorithms aren’t actually up to the task, and there is no one with both the responsibility and the authority to deal with the results.

I am with Jim Heskett. If Amazon introduced a plagiarism checker, the fallout from the false positives would be brutal. I have absolutely no faith in Amazon’s willingness to commit to the expense of dealing fairly with these false positives.

Nate Hoffelder May 2, 2019 at 9:16 am

I didn’t think of this until after I published the post, but what’s interesting about this is that Amazon did have a plagiarism checker of sorts. They have (or had) rules against recycling web content and against publishing PD books, and they scanned books to check the content:
https://the-digital-reader.com/2012/09/11/amazon-now-rejecting-gpl-licensed-ebooks/
https://the-digital-reader.com/2012/05/24/amazon-banning-junk-pd-ebooks-from-the-kindle-store-what-again/

Now that I recall, Amazon wasn’t actually able to enforce that policy very well.

Pfaithfull May 2, 2019 at 8:53 am

Amazon has an obligation as a book seller to ensure plagiarism is flagged and managed because they KNOW they are profiting from it via the vanity press market (now called self-publishing).

Mike Cane May 2, 2019 at 9:56 am

I should not laugh, but I can’t help it. It seems to me someone took — ahem — "inspiration" from Patricia Melo’s brilliant book, In Praise of Lies.

https://www.amazon.com/Praise-Lies-Patricia-Melo/dp/1582340587

>>Melissa becomes rather too interested in ["writer"] Jose’s dastardly plots–unashamedly plagiarized from the classics: Chesterton, Poe, Dostoevsky, his editor none the wiser

The novel is originally BRAZILIAN!

I really did wonder if this would happen.

BTW, Melo’s book is great. She’s a fantastic writer and the translations capture her spirited writing. HIGHLY recommended.

Lily May 2, 2019 at 9:05 pm

My good friend is an author and found out a year ago that a relatively famous author has been plagiarizing her work…to the point of not even changing names or locations. She’s discussed it with lawyers and no one will help her. It sucks.

Richard Hershberger May 3, 2019 at 8:56 am

Just spitballing here, but did she register her work with the US Copyright Office? If not, then the plagiarist is only liable for actual damages, which likely are low and in any case difficult to prove. This would explain the lack of interest by lawyers. She could likely find one happy to take the case on an hourly basis, but she also likely can’t afford this. She needs someone to take it on a contingency basis, and any likely award is too small to merit this. If she registers her work with the Copyright Office, then the rules change. She would be eligible for statutory damages, which would be substantially higher. It is probably too late to fix what has already happened, but she can prepare with her future work.

Marilynn Byerly May 3, 2019 at 11:21 am

That was my thought.

Writer friends discovered that people on eBay were selling collections of copyrighted ebooks without the authors’ permission. When the authors complained, THEY were banned from eBay, not the illegal sellers. How dare they interfere with eBay’s profits!

Posy Mauve August 27, 2020 at 2:42 am

My book, Annaldra, has recently been stolen from Inlitt’s site and published word for word on Amazon Kindle. The thief didn’t even bother to change the title, which in itself is unique.
