Amazon Polly Could Lead to Better TTS in eBook Apps

Amazon announced several new AI features for AWS yesterday, including Amazon Polly, a text-to-speech platform:

Amazon Polly makes it easy for developers to add natural-sounding speech capabilities to existing applications like newsreaders and e-learning platforms, or create entirely new categories of speech-enabled products – from mobile apps to devices and appliances. Amazon Polly is easy to use; developers can send text to Amazon Polly using the SDK or from within the AWS Management Console and Polly immediately returns an audio stream that can be played directly or stored in a standard audio file format. With 47 lifelike voices and support for 24 languages, developers can choose from both male and female voices with a variety of accents to make applications for users around the globe. And Amazon Polly’s fluid pronunciation of text content means applications deliver high-quality voice output across a wide variety of text formats. Amazon Polly is scalable, returning high-quality speech fast, even when converting large volumes of text to speech. With Amazon Polly, developers pay only for the text they convert, and they can cache generated speech and replay it as many times as they like with no restrictions.
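The workflow Amazon describes (send text, get back an audio stream, store it, and replay it freely) can be sketched in a few lines of Python. This is a minimal sketch, not Amazon's reference code: the function name synthesize_cached, the cache layout, and the default voice "Joanna" are assumptions made for illustration. The client is expected to behave like boto3's Polly client, whose synthesize_speech call returns a response containing an AudioStream.

```python
import hashlib
from pathlib import Path


def synthesize_cached(client, text, cache_dir, voice_id="Joanna"):
    """Return the path of an MP3 for `text`, calling Polly only on a cache miss.

    `client` is assumed to expose synthesize_speech() the way boto3's Polly
    client does; the cache layout here is a hypothetical choice.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Key the cache on voice and text so a change to either yields a new file.
    key = hashlib.sha256(f"{voice_id}\n{text}".encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.mp3"

    if not path.exists():
        response = client.synthesize_speech(
            Text=text, OutputFormat="mp3", VoiceId=voice_id
        )
        # AudioStream is a file-like object holding the encoded audio.
        path.write_bytes(response["AudioStream"].read())

    return path
```

With boto3 installed and AWS credentials configured, the client would be created with boto3.client("polly") and passed in as the first argument. Because the audio is written to disk once and reused, repeated requests for the same passage cost nothing extra, which is the caching the terms allow.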

Amazon Polly is the other reason Amazon bought Ivona in 2013. In addition to using Ivona's tech for the text-to-speech voices on the Fire tablets and for the Alexa virtual assistant, Amazon has built it into a platform for third-party developers.

Edit: It's worth noting, though, that this feature is not free, only very cheap compared to pre-recorded audio. Amazon revealed on the AWS blog that:

You can use Polly to process 5 million characters per month at no charge. After that, you pay $0.000004 per character, or about $0.004 per minute of generated audio. That works out to about $0.018 for this blog post, or around $2.40 for the full text of Adventures of Huckleberry Finn.
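The arithmetic behind those figures is easy to reproduce. The sketch below uses only the prices quoted above. The character counts (about 600,000 for Huckleberry Finn and about 4,500 for the AWS blog post) are not stated by Amazon; they are back-calculated from the quoted dollar amounts, and the function name polly_cost is made up for illustration.

```python
# Prices as quoted from the AWS blog at launch; they may have changed since.
FREE_CHARS_PER_MONTH = 5_000_000
PRICE_PER_CHAR = 0.000004  # US dollars


def polly_cost(chars, free_chars=0):
    """Estimate the cost in dollars of converting `chars` characters.

    `free_chars` is how much of the monthly free allowance is still unused;
    it defaults to 0 to match the per-book figures in the quote.
    """
    billable = max(0, chars - free_chars)
    return billable * PRICE_PER_CHAR


# Character counts below are assumptions inferred from the quoted prices.
print(f"Huckleberry Finn: ${polly_cost(600_000):.2f}")
print(f"AWS blog post:    ${polly_cost(4_500):.3f}")
print(f"Within free tier: ${polly_cost(600_000, FREE_CHARS_PER_MONTH):.2f}")
```

The quoted $0.004 per minute also implies roughly 1,000 characters per minute of generated audio, since $0.004 divided by $0.000004 is 1,000.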

In any case, thanks to Amazon's efforts, when Skynet arrives it will be able to ask politely for your cooperation.

image by crimfants

About Nate Hoffelder (11579 Articles)
Nate Hoffelder is the founder and editor of The Digital Reader: "I've been into reading ebooks since forever, but I only got my first ereader in July 2007. Everything quickly spiraled out of control from there. Before I started this blog in January 2010 I covered ebooks, ebook readers, and digital publishing for about 2 years as a part of MobileRead Forums. It's a great community, and being a member is a joy. But I thought I could make something out of how I covered the news for MobileRead, so I started this blog."

1 Comment on Amazon Polly Could Lead to Better TTS in eBook Apps

  1. The two interesting aspects, compared with the competition, are:

    1. it is far cheaper;
    2. the terms of use allow caching.

    I think this service could replace services like ReadSpeaker (on-demand TTS for the Web) and enable TTS for apps.

    Despite what Amazon/Ivona (or, for that matter, any other TTS maker) claims, current TTS engines all offer only basic prosody support, which is still not comparable to a human narrator. In other words, TTS works for reading news or maybe non-fiction, but audiobook aficionados are unlikely to embrace it for literature.

    Still, this is a welcome jolt for the TTS industry, and for accessibility in general.
