Twenty years ago, the idea that robots would one day write the news we read every day belonged to the domain of SF writers and madmen, but today it is a simple fact. Following in the footsteps of many other news organizations, including Forbes, the Associated Press announced today that it will soon automate the reporting of corporate earnings stories.
The AP sees this as a way to make better use of its trained journalists as journalists rather than as transcribers of statistical data. Lou Ferrara, the AP managing editor who oversees business news, explained the decision in a statement from the AP:
For many years, we have been spending a lot of time crunching numbers and rewriting information from companies to publish approximately 300 earnings reports each quarter. We discovered that automation technology, from a company called Automated Insights, paired with data from Zacks Investment Research, would allow us to automate short stories – 150 to 300 words — about the earnings of companies in roughly the same time that it took our reporters.
And instead of providing 300 stories manually, we can provide up to 4,400 automatically for companies throughout the United States each quarter.
The AP is partnering with a North Carolina company called Automated Insights. This firm, which just raised $5.5 million in a Series B funding round, has developed a patented natural language generation platform called Wordsmith. Like the many other bots working in journalism, Wordsmith spots patterns, correlations, and insights in large data sets and then describes them in plain English.
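To make the idea of turning data into plain English concrete, here is a minimal sketch of a template-based earnings summary. It is not how Wordsmith works internally (that is not described here); the `EarningsReport` fields, the `describe_earnings` function, and the sample figures are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class EarningsReport:
    """Hypothetical input record; the field names are assumptions."""
    company: str
    quarter: str
    eps: float               # reported earnings per share, in dollars
    eps_estimate: float      # analysts' consensus estimate, in dollars
    revenue_millions: float  # revenue, in millions of dollars


def describe_earnings(report: EarningsReport) -> str:
    """Turn one row of numbers into a one-paragraph plain-English summary."""
    diff = report.eps - report.eps_estimate
    # Treat differences of less than half a cent as a match.
    if abs(diff) < 0.005:
        verdict = "matched"
    elif diff > 0:
        verdict = "beat"
    else:
        verdict = "missed"
    return (
        f"{report.company} reported {report.quarter} earnings of "
        f"${report.eps:.2f} per share, which {verdict} analyst estimates of "
        f"${report.eps_estimate:.2f}. "
        f"Revenue was ${report.revenue_millions:,.1f} million."
    )


# Made-up figures for a made-up company.
print(describe_earnings(
    EarningsReport("Example Corp", "second-quarter", 1.25, 1.10, 842.3)
))
```

A real system would presumably pick from many templates and vary its wording, but the basic step of comparing numbers and choosing words to match is the same.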
Wordsmith is by no means the first bot to get a press card; perhaps the best known are the ones developed by the four-year-old startup Narrative Science, which says it counts many news organizations among its customers. There's also BookStats, which has been offering a similar sports-focused bot since at least 2010.
In addition to the startups, a number of newspapers have been experimenting internally. The AP, for example, also reports that they "have been automating a good chunk of AP’s sports agate report for several years." And the LA Times has been using a couple of bots to generate seeds for stories, including one focused on police reports and another called Quakebot.
The bots are here to stay, but they are not without their issues. A couple of months ago, one Twitter bot famously got a story wrong when it tweeted a little too quickly:
Created by Google engineer Thomas Steiner, Wikipedia Live Monitor is a news bot designed to detect breaking news events. It does this by listening to the velocity and concurrent edits across 287 language versions of Wikipedia. The theory is that if lots of people are editing Wikipedia pages in different languages about the same event and at the same time, then chances are something big and breaking is going on.
At 3:09 p.m. the bot recognized the apparent death of Quinton Ross (the basketball player) as a breaking news event—there had been eight edits by five editors in three languages. The bot sent a tweet. Twelve minutes later, the page’s information was corrected. But the bot remained silent. No correction. It had shared what it thought was breaking news, and that was that. Like any journalist, these bots can make mistakes.
This shows that no matter how much automation is added to the process, a person still needs to be in the loop.
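The detection idea in the quoted passage (many editors changing pages about the same event, in several languages, at about the same time) can be sketched in a few lines. This is only an illustration, not Wikipedia Live Monitor's actual code: the `Edit` record, the `looks_like_breaking_news` function, the five-minute window, and the thresholds are all assumptions of mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """Hypothetical record of one Wikipedia edit."""
    topic: str        # the event or article the edit is about
    language: str     # language edition, e.g. "en"
    editor: str       # user name or IP address
    timestamp: float  # seconds since some reference point


def looks_like_breaking_news(edits, topic, now,
                             window_seconds=300,  # assumed window
                             min_edits=5,         # assumed threshold
                             min_editors=2,       # assumed threshold
                             min_languages=2):    # assumed threshold
    """Return True if recent edits about `topic` cross all three thresholds."""
    recent = [
        e for e in edits
        if e.topic == topic and 0 <= now - e.timestamp <= window_seconds
    ]
    editors = {e.editor for e in recent}
    languages = {e.language for e in recent}
    return (len(recent) >= min_edits
            and len(editors) >= min_editors
            and len(languages) >= min_languages)
```

Note that nothing in a rule like this checks whether the edits are true, and nothing revisits the decision once the page is corrected. That is the gap a human in the loop has to fill.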
While the number of bots in news organizations is increasing day by day, that's not where they got their start. The earliest text-generating bot I know of was released in 1983. Racter (short for raconteur) was intended to write stories, but the tech just wasn't up to the task 30 years ago. Three decades later, that is no longer true.
I don't know when, or if, these bots will make their way into authoring, but I, for one, look forward to the day when they are cheap and easy to use. Far from being afraid that I will lose out, I know that I already use a lot of automation on a daily basis, and so do you. My RSS feeds are gathered automatically by BazQux, and much of the work of running my blog is automated in WordPress.
None of my current bots generate content, but I expect that they will one day.
image by striatic