This API uses the Amazon Polly text-to-speech service to convert text to speech quickly. The result is returned either as an mp3 audio file or as an audio buffer stream.
{
"Text": "Hello this is a test", // max 3000 characters
"TextType": "text", // "text" or "ssml"
"OutputType": "stream", // "file" or "stream"
"VoiceId": "Joanna", // any AWS Polly supported voice; defaults to Joanna
"LanguageCode": "en-GB" // the language for speech synthesis
}
{
"Text": "Mary had a little lamb Whose fleece was white as snow",
"TextType": "ssml",
"OutputType": "file",
"VoiceId": "Joanna",
"LanguageCode": "en-US"
}
Voice Types: https://docs.aws.amazon.com/polly/latest/dg/voicelist.html
SSML Tags: https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html
Language Codes: https://docs.aws.amazon.com/polly/latest/dg/SupportedLanguage.html
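The field constraints listed above can be checked on the client before a request is sent. The following is a minimal sketch and not part of the API itself: the helper name `buildTtsRequest` is hypothetical, and the defaults used for `TextType` and `OutputType` are assumptions made for illustration (only the `VoiceId` default of Joanna is documented above).

```javascript
// Hypothetical helper: validates the documented fields and returns a request body.
function buildTtsRequest({
  text,
  textType = "text",      // assumed default, not documented by the API
  outputType = "stream",  // assumed default, not documented by the API
  voiceId = "Joanna",     // documented default
  languageCode,
} = {}) {
  if (typeof text !== "string" || text.length === 0) {
    throw new Error("Text must be a non-empty string");
  }
  if (text.length > 3000) {
    throw new Error("Text exceeds the 3000 character limit");
  }
  if (!["text", "ssml"].includes(textType)) {
    throw new Error('TextType must be "text" or "ssml"');
  }
  if (!["file", "stream"].includes(outputType)) {
    throw new Error('OutputType must be "file" or "stream"');
  }
  const body = { Text: text, TextType: textType, OutputType: outputType, VoiceId: voiceId };
  // LanguageCode is only included when the caller supplies one.
  if (languageCode !== undefined) {
    body.LanguageCode = languageCode;
  }
  return body;
}
```

The returned object can be serialized with `JSON.stringify` and sent as the request body to wherever the API is deployed.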
{
"statusCode": 200,
"body": {"data":{"type":"Buffer","data":[73,68,51,4,0,0,0,0,0,35,84,83,83,69,0,0,0,15,0,0,....35]}},
"headers": {
"Content-Type": "application/json"
}
}
Here the speech data is returned as a serialized Buffer, that is, an array of byte values. No file is generated or stored on the server.
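Turning this response into raw bytes could look like the sketch below. The helper name `bytesFromStreamResponse` is hypothetical, and the code assumes the response has already been parsed into an object whose `body` is either a JSON string or an object of the shape shown above.

```javascript
// Hypothetical helper: extracts the audio bytes from a stream-type response.
function bytesFromStreamResponse(response) {
  // Assumption: body may arrive as a JSON string or as an already-parsed object.
  const body = typeof response.body === "string" ? JSON.parse(response.body) : response.body;
  const buf = body && body.data;
  if (!buf || buf.type !== "Buffer" || !Array.isArray(buf.data)) {
    throw new Error("Unexpected response shape: expected body.data to be a serialized Buffer");
  }
  // Each array entry is one byte of the mp3 data.
  return Uint8Array.from(buf.data);
}

// Shortened sample using the first bytes of the response shown above.
const sample = { statusCode: 200, body: { data: { type: "Buffer", data: [73, 68, 51, 4] } } };
const bytes = bytesFromStreamResponse(sample);
console.log(String.fromCharCode(...bytes.slice(0, 3))); // prints "ID3"
```

The leading bytes 73, 68, 51 spell "ID3", the tag header that mp3 files commonly begin with, which is a quick sanity check that the bytes were decoded in the right order.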
{
"statusCode": 200,
"body": {"data":"https://ttsapi-538587107323-audiobucket.s3.us-east-1.amazonaws.com/b586047b-dbce-44a9-9463-51c416a3a6d2.mp3?....&x-id=GetObject"},
"headers": {
"Content-Type": "application/json"
}
}
Here an mp3 file is generated and a pre-signed URL for it is returned. File generation is asynchronous, so the file might not be available immediately, but it is usually available within a few seconds of the query completing. The returned URL is valid for 80 minutes. The generated file is deleted from the server after 1 day.
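Because the file may lag the response by a few seconds, a client may want to retry before treating the URL as broken. The following is a rough sketch under stated assumptions: the helper name `waitForAudio`, the attempt count and the delay are all hypothetical, and it assumes either Node 18+ or a browser in which the bucket permits cross-origin GET requests.

```javascript
// Hypothetical helper: polls a pre-signed URL until the mp3 can be fetched.
// fetchImpl is injectable so the function can be exercised without a network.
async function waitForAudio(url, { attempts = 5, delayMs = 1000, fetchImpl = fetch } = {}) {
  for (let i = 0; i < attempts; i++) {
    const res = await fetchImpl(url);
    if (res.ok) {
      return true; // the file exists and the URL is still valid
    }
    // Wait before the next attempt, but not after the last one.
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false; // still unavailable after all attempts
}
```

A caller could `await waitForAudio(s3Url)` and only assign the URL to the player once it resolves to `true`. Each attempt is a full GET, which is acceptable for short clips but downloads the file once more than strictly necessary.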
HTML:
<audio id="myplayer" controls></audio>
JS:
var dataArr = new Uint8Array(bstream); // bstream is the byte array (body.data.data) from a response where OutputType is stream
var blob = new Blob([dataArr], { type: "audio/mpeg" }); // label the bytes as mp3 audio
var url = URL.createObjectURL(blob);
var audio = document.getElementById("myplayer");
audio.src = url;
JS:
var audio = document.getElementById("myplayer");
audio.src = s3Url; // s3Url is the pre-signed URL (body.data) from a response where OutputType is file