Text-to-Speech API
The Prosa Text-to-Speech (TTS) API lets you create synthetic speech. It converts your articles, scripts, and dialogues into audio files that you can embed directly in your applications, websites, or multimedia content. We support common audio formats, and you can choose the synthesis method: synchronous or asynchronous.
Use Cases
The Prosa TTS API supports 2 groups of use cases:
-
Instantaneous Synthesis
If you need to create short audio files from input text immediately, you can send the text to the API and wait for the audio to be returned. Our engine synthesizes the audio and sends it back to you synchronously. Use cases that may involve instantaneous synthesis include voice responses in telephony systems and question answering by virtual assistants.
-
Batch Synthesis
If you have collections of texts (e.g. articles, book chapters, or news) that you want synthesized for later use, you can submit them to the API asynchronously. Our system schedules them for synthesis, and you can retrieve the audio files later. Use cases that may involve batch synthesis include creating audiobooks and audio versions of news and articles.
Synthesis Methods
To support those use cases, Prosa TTS API provides 2 synthesis methods:
-
Synchronous: clients send the text through the REST API with the wait field in the request body set to true. Clients then wait for the synthesis process to finish and receive the audio data or audio URL immediately. The text for each request must not exceed 280 characters.
-
Asynchronous: clients send the text through the REST API with the wait field in the request body set to false. After submitting the request, clients receive the TTS job details, including the job ID. Using the job ID, clients can check the synthesis progress and retrieve the result. Clients can submit up to 5000 characters per synthesis request.
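The two methods can be sketched on the client side as follows. This is a minimal illustration, not the official client: only the wait field and the 280 and 5000 character limits come from the description above. The "text" key, the "status" key with its "complete" and "failed" values, and the fetch_job callable (a stand-in for whatever HTTP call retrieves the job details by job ID) are assumptions made for the example.

```python
import time
from typing import Callable

# Per-request character limits stated in the section above.
SYNC_MAX_CHARS = 280
ASYNC_MAX_CHARS = 5000


def build_tts_payload(text: str, wait: bool) -> dict:
    """Build a request body and enforce the limit for the chosen method.

    Only the "wait" field is documented above; the "text" key is an
    assumed name used for illustration.
    """
    limit = SYNC_MAX_CHARS if wait else ASYNC_MAX_CHARS
    if len(text) > limit:
        raise ValueError(
            f"text has {len(text)} characters; the limit is {limit} when wait={wait}"
        )
    return {"text": text, "wait": wait}


def poll_job(
    fetch_job: Callable[[str], dict],
    job_id: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> dict:
    """Check an asynchronous job until it reports a final state.

    fetch_job is a hypothetical callable that returns the job details
    for a job ID. The "status" key and its values are assumptions.
    """
    for _ in range(max_attempts):
        job = fetch_job(job_id)
        if job.get("status") in ("complete", "failed"):
            return job
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} checks")
```

Passing the job lookup in as a callable keeps the sketch independent of any particular HTTP library and of the real endpoint paths, which are not given here.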