The AP is the Latest to Join the Robot Reporter Brigade

Twenty years ago the idea that robots would one day write the news we read every day belonged to the domain of SF writers and madmen, but today it is a simple fact. Following in the footsteps of many other news organizations, including Forbes, the Associated Press announced today that they would soon be automating the reporting of corporate earnings stories.

The AP sees this as a way to make better use of their trained journalists as journalists and not transcribers of statistical data. Lou Ferrara, the AP managing editor who oversees business news, explained the decision in a statement from the AP:

For many years, we have been spending a lot of time crunching numbers and rewriting information from companies to publish approximately 300 earnings reports each quarter. We discovered that automation technology, from a company called Automated Insights, paired with data from Zacks Investment Research, would allow us to automate short stories – 150 to 300 words — about the earnings of companies in roughly the same time that it took our reporters.

And instead of providing 300 stories manually, we can provide up to 4,400 automatically for companies throughout the United States each quarter.

The AP is partnering with a North Carolina company called Automated Insights. This firm, which just raised $5.5 million in a Series B funding round, has developed a patented natural language generation platform called Wordsmith. Like the many other bots working in journalism, Wordsmith spots patterns, correlations, and insights in large data sets and then describes them in plain English.

It's by no means the first bot to get a press card; perhaps the best known are the ones developed by the four-year-old startup Narrative Science, which says it counts many news organizations among its customers. There's also BookStats, which has been offering a similar sports-focused bot since at least 2010.

And in addition to the startups, a number of newspapers have been experimenting internally. The AP, for example, also reports that they "have been automating a good chunk of AP’s sports agate report for several years." And the LA Times has been using a couple of bots to generate seeds for stories, including one focused on police reports and another called Quakebot.

The fact of the matter is that the bots are here to stay, but they are not without their issues. A couple of months ago one Twitterbot famously got a story wrong when it tweeted a little too quickly:

Created by Google engineer Thomas Steiner, Wikipedia Live Monitor is a news bot designed to detect breaking news events. It does this by listening to the velocity and concurrent edits across 287 language versions of Wikipedia. The theory is that if lots of people are editing Wikipedia pages in different languages about the same event and at the same time, then chances are something big and breaking is going on.

At 3:09 p.m. the bot recognized the apparent death of Quinton Ross (the basketball player) as a breaking news event—there had been eight edits by five editors in three languages. The bot sent a tweet. Twelve minutes later, the page’s information was corrected. But the bot remained silent. No correction. It had shared what it thought was breaking news, and that was that. Like any journalist, these bots can make mistakes.

This shows that no matter how much automation is added to the process, a person still needs to be in the loop.
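The detection idea quoted above (many editors touching the same topic in several languages at about the same time) can be illustrated with a rough Python sketch. This is not Wikipedia Live Monitor's actual code: the `Edit` record, the `is_breaking` function, the five-minute window, and the default thresholds are all assumptions made for illustration, with the thresholds simply mirroring the "eight edits by five editors in three languages" from the Quinton Ross example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """One edit event. All field names here are hypothetical."""
    topic: str        # identifier shared by all language versions of a page
    editor: str       # user name or IP of the person who made the edit
    language: str     # language edition, e.g. "en" or "de"
    timestamp: float  # time of the edit, in seconds


def is_breaking(edits, topic, now, window=300.0,
                min_edits=8, min_editors=5, min_languages=3):
    """Return True if recent activity on `topic` crosses every threshold.

    The window and thresholds are assumed values, not the real bot's settings.
    """
    # Keep only edits on this topic that fall inside the time window.
    recent = [e for e in edits
              if e.topic == topic and 0 <= now - e.timestamp <= window]
    editors = {e.editor for e in recent}
    languages = {e.language for e in recent}
    return (len(recent) >= min_edits
            and len(editors) >= min_editors
            and len(languages) >= min_languages)
```

A check like this only counts activity; it has no way of knowing whether the edits are true. That is the gap the Quinton Ross tweet fell into, and it is why a sketch like this would still need a person (or at least a follow-up check for reverted edits) before anything is published.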

While the number of bots in news organizations is increasing day by day, that's not where they got their start. The earliest text-generating bot I know of was released in 1983. Racter (short for raconteur) was intended to write stories, but the tech just wasn't up to the task 30 years ago. Three decades later, that is no longer true.

I don't know when or if these bots will ever make their way into authoring, but I for one look forward to the day that they are cheap and easy to use. Far from being afraid that I will lose out, I know that I already use a lot of automation on a daily basis - and so do you. My RSS feeds are gathered automatically by BazQux, and much of the work in running my blog is automated in WordPress.

None of my current bots generate content, but I expect that they will one day.

image by striatic

About Nate Hoffelder (11479 Articles)
Nate Hoffelder is the founder and editor of The Digital Reader: "I've been into reading ebooks since forever, but I only got my first ereader in July 2007. Everything quickly spiraled out of control from there. Before I started this blog in January 2010 I covered ebooks, ebook readers, and digital publishing for about 2 years as a part of MobileRead Forums. It's a great community, and being a member is a joy. But I thought I could make something out of how I covered the news for MobileRead, so I started this blog."

9 Comments on The AP is the Latest to Join the Robot Reporter Brigade

  1. Quakebot is less robot reporter, more public service. Using USGS data, it instantly feeds a quake’s preliminary magnitude and location – something that would take time even with the fastest journalist. It doesn’t replace reporting and editing but speaking as a former Angeleno, it’s a valuable community information source.

  2. So do you think that if they ever get to the point where they are used for actual authoring, especially of longer fictional pieces like novels (of whatever genre), that human authors will go the way of the artisans? In other words that they won’t disappear but that they’ll be somewhat rare, but still be there to be found if one looks hard enough? Hey, it could even become a novelty factor: “New X movie coming out, by the HUMAN writer John Blah!”

    • Only for the worst written stories. While it is possible to compute the mechanics of writing, the art of writing rises above the simple mechanics and adds an element of creativity that computers lack.

      But even though I don’t see authors being replaced I do see them using these bots as tools. Once they get smart enough the bots could be useful at fixing grammar, finding plot holes and conflicting details in a story, and keeping track of continuity in both a single story and a series.

      • I agree on the creativity front. I had this one really smart classmate who wrote some story for one of her classes from a mechanical standpoint. I thought it lacked, as they say, “soul”.

        I hadn’t thought about that, using them as cheap, low-level editors. Kind of like CAT for translators (just post-work rather than pre).

        • Now that I think about it, I bet this kind of tech is going to be showing up in Scrivener and other author tools. It is already kinda there in MS Word; you just have to program it yourself.

          • I actually hate the grammar tool in Microsoft Office. More often than not it tells me to “correct” things that are correct (although it does occasionally help me with punctuation) so it’s more of a hassle than it’s worth for me.

            I also think that the functions of finding plot holes and contradicting details are a long way off. For that, in a literary text, it would have to be able to TRULY understand what the text says, something which we know computers aren’t capable of whenever we see wholly computer-generated translations like those from Google (be it from English to Chinese or vice versa, or something “simpler” like between English and Spanish).

            After all, it would have to be able to truly understand in order to not mark flashbacks as continuity errors (can you imagine a flashing warning “DEAD CHARACTER! DEAD CHARACTER!” while the author is writing a flashback to express another character’s sense of loss?). Personally I’d feel like drop-kicking the device out the window.

      • Shhh…don’t say that too loud. Editors may not like it. I’ve actually thought this is a pretty obvious advancement that I’m kind of surprised hasn’t happened already. I suspect most people’s experience being with the trainwreck that is MS Word’s grammar check may have soured many on the concept. Someone will do it right though, and probably soon. I expect pretty much everything in the process will become automated except for original creative expression, bringing the costs in time and treasure of publishing down even further. I look at the supplemental publishing services like I looked at the plethora of office workers everywhere 15 years ago. There’s no way 90% of them are employed a couple decades from now.

  3. The best they can do, to give journalism an edge, is to make this bot a tool that gathers and aggregates data. Then let the reporter create his own rulesets to extract data.
    Programmers, specialists, would know how to write those things easy, top to bottom, but wouldn’t know what to look for in the first place.

    There were some articles a while ago in favor of teaching programming to journalists; I always thought this is what they were thinking of.

    Think of it as an extension of their Google-Fu. A lot of collected and sorted data, just waiting for a human to filter them to a human-readable level.

