STT Streaming

Speech-to-text streaming service over a websocket endpoint.

Security Requirements

Type	In	Name	Scheme	Format	Description
httpApiKey	header	x-api-key	-	-	API Key received from Prosa API Console

Endpoint

`wss://s-api.prosa.ai/v2/speech/stt`

Publish Operation

Client may send one of the following messages to the server

ApiKey

If passing the API key as an HTTP header is not feasible, send it as the first message instead.

Payload

Name	Type	Optional	Description
token	string	true	API Key received from Prosa API Console

Example

{
  "token": "string"
}
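
The two authentication options above can be sketched as follows. This is a minimal illustration, not official client code: the helper names `auth_headers` and `auth_message` are hypothetical, and sending the token as a JSON text message is an assumption based on the example payload.

```python
import json

# Endpoint taken from the Endpoint section of this page.
STT_URL = "wss://s-api.prosa.ai/v2/speech/stt"


def auth_headers(api_key: str) -> dict[str, str]:
    """Handshake headers for clients that can set custom HTTP headers."""
    return {"x-api-key": api_key}


def auth_message(api_key: str) -> str:
    """ApiKey payload to send as the first message when headers cannot be set.

    Assumption: the payload is serialized as JSON, as in the example above.
    """
    return json.dumps({"token": api_key})
```

Only one of the two is needed for a given connection: pass `auth_headers(...)` to the websocket library's handshake, or send `auth_message(...)` immediately after connecting.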

Configuration

The configuration for the transcription session. This message is sent first after authentication to configure the transcription process.

Payload

Name	Type	Optional	Description
label	string	true	The label to give to this transcript.
model	string	false	The model to use.
audio	object	true	Describes the incoming audio. Optional, as the audio format is generally detected automatically.
audio.format	string	false	The audio format.
audio.channels	integer	true	The number of audio channels.
audio.sample_rate	integer	true	The sample rate of the audio.
include_filler	boolean	true	Whether to include filler words in the transcription result.
include_partial	boolean	true	Whether to receive partial transcripts in addition to final transcripts.

Example

{
  "label": null,
  "model": "stt-general-online",
  "audio": {
    "format": "wav",
    "channels": 1,
    "sample_rate": 16000
  },
  "include_filler": false,
  "include_partial": true
}
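
A Configuration message can be assembled from the payload table above as in this sketch. The function name `build_configuration` and its keyword arguments are hypothetical; omitting optional fields when they are not set is a design choice of the sketch, not documented server behavior.

```python
import json
from typing import Optional


def build_configuration(
    model: str,
    label: Optional[str] = None,
    audio_format: Optional[str] = None,
    channels: Optional[int] = None,
    sample_rate: Optional[int] = None,
    include_filler: Optional[bool] = None,
    include_partial: Optional[bool] = None,
) -> str:
    """Serialize a Configuration payload, leaving out optional fields that are unset."""
    payload: dict = {"model": model}  # model is the only required top-level field
    if label is not None:
        payload["label"] = label
    if audio_format is not None:
        # audio.format is required whenever the audio object is present.
        audio: dict = {"format": audio_format}
        if channels is not None:
            audio["channels"] = channels
        if sample_rate is not None:
            audio["sample_rate"] = sample_rate
        payload["audio"] = audio
    elif channels is not None or sample_rate is not None:
        raise ValueError("audio_format is required when describing the audio")
    if include_filler is not None:
        payload["include_filler"] = include_filler
    if include_partial is not None:
        payload["include_partial"] = include_partial
    return json.dumps(payload)
```

For example, `build_configuration("stt-general-online", include_partial=True)` relies on automatic audio format detection and requests partial transcripts.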

AudioData

The audio data to transcribe, sent as bytes. The audio header is expected only in the first chunk. An empty byte is expected at the end of the audio stream.
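
The chunking rule can be sketched as a generator that splits a complete audio file into binary chunks and finishes with an end-of-stream marker. The name `iter_audio_chunks` and the default chunk size are hypothetical, and treating the "empty byte" as a zero-length bytes message is an assumption of this sketch.

```python
from typing import Iterator


def iter_audio_chunks(audio: bytes, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield consecutive chunks of `audio`, then an empty bytes object.

    Because the file is sliced in order, the audio header (for example the
    WAV header) naturally appears only in the first chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(audio), chunk_size):
        yield audio[offset:offset + chunk_size]
    # Assumption: end of stream is signalled by a zero-length binary message.
    yield b""
```

Each yielded value would be sent as one binary websocket message, in order.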

Publish Operation

Server may return one of the following messages

TranscriptionStart

Signals that the transcription is ready to accept AudioData.

Payload

Name	Type	Optional	Description
type	string	false	`created`
id	string	false	ID of the transcription. Use this ID to refer to this transcription in other operations.

Example

{
  "type": "created",
  "id": "string"
}

TranscriptionStatus

Status of the ongoing transcription process.

Payload

Name	Type	Optional	Description
type	string	false	`status`
status	string	false	Status of the transcription progress.

Example

{
  "type": "status",
  "status": "created"
}

PartialTranscript

Partial transcript of the ongoing speech.

Payload

Name	Type	Optional	Description
type	string	false	`partial`
transcript	string	false	The partial transcription.

Example

{
  "type": "partial",
  "transcript": "string"
}

FinalTranscript

Final transcript of a speech segment.

Payload

Name	Type	Optional	Description
type	string	false	`result`
transcript	string	false	The final transcription of a specific segment.
time_start	number	false	Start time of the segment, relative to the start of the audio.
time_end	number	false	End time of the segment, relative to the start of the audio.

Example

{
  "type": "result",
  "transcript": "string",
  "time_start": 0,
  "time_end": 0
}
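
One way to combine PartialTranscript and FinalTranscript messages into a running transcript is sketched below. The class `TranscriptBuffer` is hypothetical. The sketch assumes that each partial supersedes the previous one and that a final result replaces the pending partial for its segment; this page does not state those semantics explicitly.

```python
import json


class TranscriptBuffer:
    """Accumulates final segments and tracks the latest partial transcript."""

    def __init__(self) -> None:
        self.segments: list[tuple[float, float, str]] = []  # (time_start, time_end, transcript)
        self.partial: str = ""

    def feed(self, raw: str) -> None:
        """Consume one JSON message from the server; other types are ignored."""
        msg = json.loads(raw)
        kind = msg.get("type")
        if kind == "partial":
            self.partial = msg["transcript"]
        elif kind == "result":
            self.segments.append((msg["time_start"], msg["time_end"], msg["transcript"]))
            self.partial = ""  # assumption: the final result supersedes the pending partial

    def text(self) -> str:
        """Final segments followed by the current partial, if any."""
        parts = [transcript for _, _, transcript in self.segments]
        if self.partial:
            parts.append(self.partial)
        return " ".join(parts)
```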

Metadata

Metadata of the elapsed transcription process.

Payload

Name	Type	Optional	Description
type	string	false	`metadata`
duration	number	false	The total duration of the audio.
quota_used	integer	false	The total quota used for this transcription session.
max_reached	boolean	false	Whether the process was stopped because the maximum duration was reached.
max_duration	number	false	The maximum allowed duration of a streaming session.

Example

{
  "type": "metadata",
  "duration": 0,
  "quota_used": 0,
  "max_reached": true,
  "max_duration": 0
}

QuotaAlert

An alert sent when you run out of quota in the middle of the transcription process. The transcription process is stopped and any additional audio sent is not processed.

Payload

Name	Type	Optional	Description
type	string	false	`quota`
active	boolean	false	Whether or not the quota is still active.
timestamp	number	false	The timestamp, relative to the start of the audio, at which the quota ran out.
quota_used	integer	false	The total quota used for this transcription session.

Example

{
  "type": "quota",
  "active": false,
  "timestamp": 0,
  "quota_used": 0
}

Error

Sent when an error occurs.

Payload

Name	Type	Optional	Description
type	string	false	`error`
message	string	false	The error message.

Example

{
  "type": "error",
  "message": "Invalid audio configuration."
}
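
Since every server message carries a `type` field, incoming messages can be routed on it. The sketch below validates the type against the values listed on this page and raises on `error` messages. The names `parse_server_message` and `SttServerError` are hypothetical, and raising an exception is a design choice of the sketch rather than a required behavior.

```python
import json

# Values of the "type" field documented for server messages.
KNOWN_TYPES = {"created", "status", "partial", "result", "metadata", "quota", "error"}


class SttServerError(Exception):
    """Raised when the server sends a message of type "error"."""


def parse_server_message(raw: str) -> dict:
    """Decode one server message and check its type.

    Returns the decoded payload for non-error messages.
    """
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("expected a JSON object")
    kind = msg.get("type")
    if kind not in KNOWN_TYPES:
        raise ValueError(f"unknown message type: {kind!r}")
    if kind == "error":
        raise SttServerError(msg.get("message", ""))
    return msg
```

A receive loop would call `parse_server_message` on each text message and branch on the returned `type`, for example stopping the audio upload when a `quota` message arrives.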

Websocket Close Codes

The websocket close code indicates why the connection was closed.

Close Code	Description
1000	Success
1006	Uncaught Internal Error
4000	Invalid Auth
4001	Invalid Session Config
4002	Invalid Model
4005	Insufficient Quota
4029	Rate Limited
4500	Internal Error
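
For logging or error reporting, the table above can be kept as a lookup. This is a small sketch; the names `CLOSE_CODES` and `describe_close_code` are hypothetical, and the fallback text for unlisted codes is an assumption.

```python
# Close codes and descriptions copied from the table above.
CLOSE_CODES = {
    1000: "Success",
    1006: "Uncaught Internal Error",
    4000: "Invalid Auth",
    4001: "Invalid Session Config",
    4002: "Invalid Model",
    4005: "Insufficient Quota",
    4029: "Rate Limited",
    4500: "Internal Error",
}


def describe_close_code(code: int) -> str:
    """Return the documented description, or a generic label for unlisted codes."""
    return CLOSE_CODES.get(code, f"Unknown close code {code}")
```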
