Today I ran across a story entitled “Amazon Pulls Thousands of E-Books in Dispute”, from a blog entitled “ebook-reader-vergleich.org”. (“Vergleich” is German for “comparison”.) The story talked about Amazon pulling thousands of independently published books from the Independent Publishers Group over a contract dispute. Since I was watching when it happened, I know that this actually took place months ago and was resolved a couple of months later. But here is Zite, presenting it as fresh news.
The reason, of course, is that this ebook-reader-vergleich blog is simply a plagiarist content farm, scooping up articles and using them as search-engine optimization link fodder. (Which is why I’m not linking directly to the site here.) In this case, it snagged a six-month-old Associated Press article and republished it as if it were current: the dateline at the bottom specifically says August 21st, 2012. And whatever algorithms Zite uses for assigning relevance and importance to articles latched onto this one.
If I hadn’t known this story was old, I might have been tempted to believe it was new, and blog it accordingly. In fact, I have blogged old stories from Zite as new in the past, not knowing they were old. That was not so much because they came from content farms, but because they had been posted at their original source one, two, or more years earlier without the year in their dateline (for example, a post dated “August 21” instead of “August 21, 2011”), so Zite assumed they were recent.
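I have no idea how Zite actually handles dates, but the yearless-dateline trap is easy to reproduce. Here is a minimal Python sketch of it, under my own assumptions: the function names `parse_dateline` and `looks_recent`, the `trust_yearless` flag, and the seven-day freshness window are all hypothetical, invented for illustration. A naive aggregator that fills in the current year treats “August 21” as fresh, while one that refuses to trust a dateline without a year does not.

```python
from datetime import date, datetime
from typing import Tuple


def parse_dateline(text: str, today: date) -> Tuple[date, bool]:
    """Parse a dateline like "August 21, 2011" or "August 21".

    Returns (parsed_date, year_was_explicit). Assumes English month
    names and this exact layout; real datelines are far messier.
    """
    text = text.strip()
    try:
        return datetime.strptime(text, "%B %d, %Y").date(), True
    except ValueError:
        pass
    # No year given: the naive guess is "this year". This is the
    # (assumed) step that makes an old post look new.
    guessed = datetime.strptime(f"{text}, {today.year}", "%B %d, %Y").date()
    return guessed, False


def looks_recent(text: str, today: date, max_age_days: int = 7,
                 trust_yearless: bool = True) -> bool:
    """Decide whether a dateline counts as recent news."""
    parsed, year_was_explicit = parse_dateline(text, today)
    if not year_was_explicit and not trust_yearless:
        # Safer policy: a guessed year is not evidence of freshness.
        return False
    age_days = (today - parsed).days
    return 0 <= age_days <= max_age_days


today = date(2012, 8, 23)
print(looks_recent("August 21, 2011", today))                  # explicit year
print(looks_recent("August 21", today))                        # naive guess
print(looks_recent("August 21", today, trust_yearless=False))  # cautious
```

With the year spelled out, the year-old post is correctly rejected. Without it, the naive policy accepts the very same post as two days old. The cautious policy errs the other way and drops genuinely new posts that omit the year, so a real aggregator would presumably want a second signal, such as when it first saw the URL.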
A human might have caught these glitches, or noticed that the “vergleich” article’s source had very little credibility. But Zite’s algorithm doesn’t have that kind of judgment. So it’s worth remembering that when we place our trust in completely automatic news aggregators, we open the door for false positives. We need to bear this in mind all the more as more of these aggregators pop up and become the go-to sources for online news reading.