Wikipedia, Turnitin Team Up on Copyright

Turnitin has long used Wikipedia as a data source when scouring students' papers for plagiarism, and now it is returning the favor. eSchoolNews reports that Turnitin and Wikipedia have collaborated on a new bot designed to identify text in Wikipedia articles that has been copied from elsewhere.

Launched in April 2015, EranBot draws on Turnitin's archive of millions of papers, including academic publications and journals, and scans each new edit or addition to Wikipedia for hints of copied text. Should it find a questionable edit, the bot flags the edit so it can be double-checked by a human editor.
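To make the scan-and-flag idea concrete, here is a minimal sketch of one common way to spot copied text: comparing overlapping word sequences ("shingles") between an edit and a set of known sources. This is not how EranBot or Turnitin actually works (neither the press release nor this post describes their matching method); the function names, the 5-word shingle size, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import re


def shingles(text, n=5):
    """Return the set of n-word sequences (shingles) in text, lowercased."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(edit_text, source_text, n=5):
    """Fraction of the edit's shingles that also appear in the source."""
    edit = shingles(edit_text, n)
    if not edit:
        # Edit is shorter than n words, so there is nothing to compare.
        return 0.0
    return len(edit & shingles(source_text, n)) / len(edit)


def flag_for_review(edit_text, sources, threshold=0.5, n=5):
    """Return names of sources whose overlap with the edit meets the threshold.

    `sources` is a hypothetical dict mapping a source name to its full text.
    A non-empty result means a human editor should take a look.
    """
    return [name for name, text in sources.items()
            if overlap_ratio(edit_text, text, n) >= threshold]
```

A flagged edit is only a candidate: properly attributed quotations and freely licensed text will also match, which is why the final call goes to a human editor.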

"As an openly licensed free encyclopedia, Wikipedia respects copyright the same way traditional publishers do," said Jake Orlowitz, head of the Wikipedia Library, the program dedicated to helping editors access reliable sources to improve Wikipedia. "Turnitin now gives us access to a more sophisticated system for flagging potential copyright violations."

And they mean it. EranBot is only the latest tool Wikipedia editors use to check for plagiarism and copyright infringement. Wikipedia also has its own duplication detector, as well as more focused piracy detection tools that check a single article against the web, or give the gimlet eye to the contributions of a single editor who is suspected of infringement.

Wikipedia also already had an autonomous bot like EranBot; it was called CorenSearchBot (and later, MadmanBot). According to the Wikipedia entry, it is less sophisticated than EranBot.

Turnitin boasts that their bot is more capable, and that it has the unique ability to learn (so it will only become more accurate over time). According to the press release, EranBot checks thousands of new edits every day. Around 100 edits are flagged for Wikipedia editors to review.

In addition to scanning Wikipedia, Turnitin is also working with the Wiki Education Foundation to check edits made by students participating in Wiki Ed’s Classroom Program. Unlike EranBot, which was built to detect infringement, this project tries to teach students the difference between citation and quotation, and how to appropriately paraphrase and use source material.


About Nate Hoffelder
Nate Hoffelder is the founder and editor of The Digital Reader: "I've been into reading ebooks since forever, but I only got my first ereader in July 2007. Everything quickly spiraled out of control from there. Before I started this blog in January 2010 I covered ebooks, ebook readers, and digital publishing for about 2 years as a part of MobileRead Forums. It's a great community, and being a member is a joy. But I thought I could make something out of how I covered the news for MobileRead, so I started this blog."
