Open AI Text to Speech

Freemium
Popularity: 9.3 / 10
Latency: 3,711 ms
Service Level: 94%
Health Check: 100%

Introduction

The Audio API provides a text-to-speech endpoint, speech, based on our TTS (text-to-speech) model. It comes with 6 built-in voices and can be used to:

- Narrate a written blog post
- Produce spoken audio in multiple languages
- Give real-time audio output using streaming

Quick start

The speech endpoint takes in three key inputs: the model name, the text that should be turned into audio, and the voice to be used for the audio generation. A simple request would look like the following:

{
    "model": "tts-1",
    "input": "Today is a wonderful day to build something people love!",
    "voice": "alloy"
}
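
As a rough sketch, the request above could be assembled in Python as follows. The host name and the /speech path are placeholders (assumptions, not taken from this listing); the real values appear on the API's endpoint page. The X-RapidAPI-Key and X-RapidAPI-Host headers are the usual RapidAPI authentication headers. The request is only built here, not sent.

```python
import json
import urllib.request

# Hypothetical values: replace with the host and path shown on the listing.
API_HOST = "example-tts.p.rapidapi.com"
API_URL = f"https://{API_HOST}/speech"


def build_speech_request(text, voice="alloy", model="tts-1", api_key="YOUR_KEY"):
    """Build (but do not send) a POST request carrying the three key inputs."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-RapidAPI-Key": api_key,
            "X-RapidAPI-Host": API_HOST,
        },
        method="POST",
    )


request = build_speech_request(
    "Today is a wonderful day to build something people love!"
)

# To send it and save the audio (needs a valid key and the real URL):
# with urllib.request.urlopen(request) as resp, open("speech.mp3", "wb") as out:
#     out.write(resp.read())
```

Keeping the request construction separate from the network call makes it easy to inspect the body before spending quota.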

Voice options

Experiment with different voices (alloy, echo, fable, onyx, nova, and shimmer) to find one that matches your desired tone and audience. The current voices are optimized for English.

- Alloy
- Echo
- Fable
- Onyx
- Nova
- Shimmer
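
One way to compare the voices is to generate the same sentence once per voice. The sketch below only prepares the request bodies; the helper names are illustrative and not part of the API.

```python
# The six voice names listed above, in lowercase as used in the request body.
VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")


def check_voice(voice):
    """Normalize a voice name and reject anything outside the known six."""
    name = voice.strip().lower()
    if name not in VOICES:
        raise ValueError(f"unknown voice {voice!r}; expected one of {VOICES}")
    return name


def payloads_for_all_voices(text, model="tts-1"):
    """Return one request body per voice, all with the same input text."""
    return [{"model": model, "input": text, "voice": v} for v in VOICES]


bodies = payloads_for_all_voices("The quick brown fox jumps over the lazy dog.")
```

Sending each body in turn and saving the results under the voice name gives a side-by-side set of samples to listen to.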

Supported output formats

The default response format is “mp3”, but other formats like “opus”, “aac”, or “flac” are available.

  • Opus: For internet streaming and communication, low latency.
  • AAC: For digital audio compression, preferred by YouTube, Android, iOS.
  • FLAC: For lossless audio compression, favored by audio enthusiasts for archiving.
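
The following sketch shows one way to select a format and name the output file to match. It assumes the request body accepts a response_format field, as the upstream OpenAI speech endpoint does; whether this listing forwards that field unchanged is an assumption to verify on the endpoint page. The helper names are illustrative.

```python
# Formats named in this section, mapped to a conventional file extension.
FORMAT_EXTENSIONS = {
    "mp3": ".mp3",
    "opus": ".opus",
    "aac": ".aac",
    "flac": ".flac",
}


def with_response_format(payload, fmt="mp3"):
    """Return a copy of the request body with the output format set.

    The field name "response_format" is assumed from the upstream API.
    """
    if fmt not in FORMAT_EXTENSIONS:
        raise ValueError(f"unsupported format {fmt!r}")
    return {**payload, "response_format": fmt}


def output_filename(stem, fmt="mp3"):
    """Build a file name whose extension matches the requested format."""
    return stem + FORMAT_EXTENSIONS[fmt]


base = {"model": "tts-1", "input": "Hello!", "voice": "alloy"}
flac_body = with_response_format(base, "flac")
flac_name = output_filename("greeting", "flac")
```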

Supported languages

The TTS model generally follows the Whisper model in terms of language support. It performs well in the languages Whisper supports, listed below, even though the current voices are optimized for English:

Afrikaans, Arabic, Armenian, Azerbaijani, Belarusian, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Marathi, Maori, Nepali, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese, and Welsh.

You can generate spoken audio in these languages by providing the input text in the language of your choice.
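
As an illustration, a non-English request differs only in the input text. The sketch below encodes a Spanish sentence as a UTF-8 JSON body so that accented characters survive intact; the function name is illustrative, and the sentence is just sample input.

```python
import json


def encode_payload(text, voice="alloy", model="tts-1"):
    """Serialize the request body as UTF-8 JSON, keeping non-ASCII text readable."""
    payload = {"model": model, "input": text, "voice": voice}
    # ensure_ascii=False writes "í" directly instead of the \u00ed escape;
    # both forms are valid JSON, but this one is easier to inspect in logs.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


spanish_body = encode_payload(
    "Hoy es un día maravilloso para crear algo que la gente ame."
)
```

No language parameter is set here because the section above describes the language as being inferred from the input text itself.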

API Creator: Swift API (Rapid account: swift-api)