STT Schemas

AsrConfig

Configuration for the job execution

Example

{
  "model": "stt-general",
  "wait": false,
  "speaker_count": 1,
  "include_filler": false,
  "include_partial_results": false,
  "auto_punctuation": false,
  "enable_spoken_numerals": false,
  "enable_speech_insights": false,
  "enable_voice_insights": false
}

Properties

Name Type Optional Description
model string false The name of the ASR model to use.
wait boolean true If set to true, the request blocks until execution finishes. Otherwise, the request returns a job_id that can be used to query the job. For short ASR requests, this is typically set to true, since the client is expected to wait for the result.
speaker_count integer true The number of expected speakers.
include_filler boolean true Include filler words returned by the engine in the result.
include_partial_results boolean true Include results that are only partially complete. This typically happens only when the audio is cut off in the middle of a sentence.
auto_punctuation boolean true Automatically add punctuation.
enable_spoken_numerals boolean true Automatically convert spoken numerals to digits.
enable_speech_insights boolean true Enable speech insight analytics.
enable_voice_insights boolean true Enable voice insight analytics.
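
As a rough illustration of how a client might assemble this object, the Python sketch below builds an AsrConfig dictionary from the fields in the table above. The helper name `build_asr_config` and the decision to reject unknown fields on the client side are assumptions made for illustration; they are not part of the API.

```python
import json

# Optional AsrConfig fields and their expected Python types (from the table above).
OPTIONAL_FIELDS = {
    "wait": bool,
    "speaker_count": int,
    "include_filler": bool,
    "include_partial_results": bool,
    "auto_punctuation": bool,
    "enable_spoken_numerals": bool,
    "enable_speech_insights": bool,
    "enable_voice_insights": bool,
}


def build_asr_config(model: str, **options) -> dict:
    """Hypothetical helper: build an AsrConfig dict and reject unknown or mistyped fields."""
    if not model:
        raise ValueError("model is required")
    config = {"model": model}
    for name, value in options.items():
        if name not in OPTIONAL_FIELDS:
            raise KeyError(f"unknown AsrConfig field: {name}")
        expected = OPTIONAL_FIELDS[name]
        # bool is a subclass of int in Python, so exclude it explicitly for integer fields.
        is_bool_for_int = expected is int and isinstance(value, bool)
        if is_bool_for_int or not isinstance(value, expected):
            raise TypeError(f"{name} must be of type {expected.__name__}")
        config[name] = value
    return config


config = build_asr_config("stt-general", wait=True, speaker_count=2)
print(json.dumps(config, indent=2))
```

Fields that are not passed are simply left out of the dictionary, which leaves the choice of defaults to the server.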

AsrJobRequest

Example

{
  "config": {
    "model": "stt-general",
    "wait": false,
    "speaker_count": 1,
    "include_filler": false,
    "include_partial_results": false,
    "auto_punctuation": false,
    "enable_spoken_numerals": false,
    "enable_speech_insights": false,
    "enable_voice_insights": false
  },
  "request": {
    "label": "Meeting Audio 2021-14-06",
    "data": "<base64-encoded audio data>"
  }
}

Properties

Name Type Optional Description
config AsrConfig false The configuration to run a job with.
request AsrRequest false The request payload to run the job with.
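
The sketch below shows one way a client could turn raw audio bytes into an AsrJobRequest body, with the audio base64-encoded into the `data` field. The function name `build_job_request` and the placeholder bytes are hypothetical, and how the body is sent (endpoint, authentication) is outside what this page documents.

```python
import base64
import json


def build_job_request(audio: bytes, model: str, label: str = "") -> dict:
    """Hypothetical helper: wrap audio bytes in an AsrJobRequest-shaped dict."""
    # The `data` field carries the audio as base64 text.
    request = {"data": base64.b64encode(audio).decode("ascii")}
    if label:
        request["label"] = label
    return {
        "config": {"model": model, "wait": False},
        "request": request,
    }


# Placeholder bytes standing in for real audio content.
fake_audio = b"RIFF\x00\x00\x00\x00WAVE"
body = build_job_request(fake_audio, "stt-general", label="Example audio")
print(json.dumps(body, indent=2))
```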

AsrJobsList

List of ASR jobs

Example

{
  "pagination": {
    "page": 1,
    "page_size": 10,
    "page_count": 1,
    "item_count": 1
  },
  "length": 1,
  "data": [
    {
      "job_id": "2fec34e1efb146f7a7431cb35b64550d",
      "status": "complete",
      "created_at": "2019-08-24T14:15:22Z",
      "modified_at": "2019-08-24T14:15:22Z",
      "request": {
        "label": "Meeting Audio 2021-14-06",
        "uri": "https://..."
      },
      "job_config": {
        "model": "stt-general",
        "wait": false,
        "include_filler": false,
        "include_partial_results": false
      }
    }
  ]
}

Properties

Name Type Optional Description
pagination PaginationInfo true -
length integer false -
data [AsrResponse] false -

AsrModel

Example

{
  "name": "string",
  "label": "string",
  "language": "string",
  "domain": "string",
  "acoustic": "string",
  "channels": 0,
  "samplerate": 0
}

Properties

Name Type Optional Description
name string false Name of the model.
label string true Human-readable name of the model.
language string false Human-readable language of the model.
domain string false The specific area or topic covered by the ASR model.
acoustic string false The audio source that gives the best recognition results.
channels integer false The optimal number of channels for the audio data.
samplerate integer false The optimal sample rate for the audio data.

AsrModelList

List of all available ASR models

Example

[
  {
    "name": "string",
    "label": "string",
    "language": "string",
    "domain": "string",
    "acoustic": "string",
    "channels": 0,
    "samplerate": 0
  }
]

Properties

Name Type Optional Description
AsrModelList [AsrModel] true List of all available ASR models

AsrRequest

Request payload for an ASR job

Example

{
  "label": "Meeting Audio 2021-14-06",
  "data": "<base64 encoded audio>"
}

Properties

Name Type Optional Description
label string true An optional label to give to the job.
uri string true A URI pointing to the request payload. Exactly one of uri or data must be present in the request.
data string true The audio data in base64 format. Exactly one of uri or data must be present in the request.
duration number true The duration of the audio in seconds. If available, this information is used for progress reporting.
mime_type string true MIME type of the audio.
sample_rate integer true Sample rate of the audio.
channels integer true Number of channels in the audio.
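
The rule that a request carries either `uri` or `data`, but not both, can be checked on the client before sending. The following is a minimal sketch; the function name `validate_asr_request` is hypothetical, and treating a `null` value as "absent" is an assumption.

```python
def validate_asr_request(request: dict) -> None:
    """Hypothetical check: exactly one of 'uri' or 'data' must be set."""
    # Assumption: a key that is missing or set to None counts as absent.
    has_uri = request.get("uri") is not None
    has_data = request.get("data") is not None
    if has_uri == has_data:
        raise ValueError("exactly one of 'uri' or 'data' must be present")


# Valid: only `data` is set.
validate_asr_request({"label": "Example audio", "data": "<base64 encoded audio>"})

# Invalid: neither field is set.
try:
    validate_asr_request({"label": "Example audio"})
except ValueError as exc:
    print(f"rejected: {exc}")
```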

AsrResponse

Example

{
  "job_id": "2fec34e1efb146f7a7431cb35b64550d",
  "status": "complete",
  "created_at": "2019-08-24T14:15:22Z",
  "modified_at": "2019-08-24T14:15:22Z",
  "request": {
    "label": "Meeting Audio 2021-14-06",
    "data": null
  },
  "result": {
    "data": [
      {
        "transcript": "hasil akhir dari pekerjaan ini cukup memuaskan",
        "final": true,
        "time_start": 0.0,
        "time_end": 3.6,
        "channel": 0
      }
    ]
  },
  "job_config": {
    "model": "stt-general",
    "wait": false,
    "include_filler": false,
    "include_partial_results": false
  },
  "progress": {
    "total": 0,
    "details": {
      "transfer": 0,
      "transcribe": 0
    }
  },
  "model": {
    "name": "string",
    "label": "string",
    "language": "string",
    "domain": "string",
    "acoustic": "string",
    "channels": 0,
    "samplerate": 0
  }
}

Properties

Name Type Optional Description
job_id string(uuid) false Unique identifier of a job
status JobStatus false Status of a job's overall progress
created_at string(date-time) false The time when the job was created
modified_at string(date-time) false The time when the job was last modified
request AsrRequest true Request submitted for the ASR job.
result any true One of the following:
» anonymous JobErrorResult true -
» anonymous AsrResult true Result of an ASR job
job_config AsrConfig true Configuration for the job execution.
progress LongAsrProgress true Progress of the ASR job.
model AsrModel true Selected ASR model.
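
Because `result` can hold either a JobErrorResult or an AsrResult, a client has to tell the two apart. The sketch below does so by checking for a non-empty `error` field, which both schemas define. The helper name `unpack_result` and this detection rule are assumptions, not documented behavior.

```python
def unpack_result(response: dict) -> tuple:
    """Hypothetical helper: return ("error", message) or ("segments", list)."""
    # `result` is optional, so fall back to an empty dict when it is missing or null.
    result = response.get("result") or {}
    if result.get("error"):
        return ("error", result["error"])
    return ("segments", result.get("data") or [])


ok_response = {
    "status": "complete",
    "result": {"data": [{"transcript": "hasil akhir", "final": True,
                         "time_start": 0.0, "time_end": 3.6, "channel": 0}]},
}
failed_response = {"status": "failed", "result": {"error": "string"}}

for response in (ok_response, failed_response):
    kind, payload = unpack_result(response)
    print(kind, payload)
```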

AsrResult

Result of an ASR job

Example

{
  "data": [
    {
      "transcript": "Hasil akhir dari pekerjaan ini cukup memuaskan",
      "final": true,
      "time_start": 0,
      "time_end": 3.6,
      "channel": 0
    },
    {
      "transcript": "Hanya saja",
      "final": false,
      "time_start": 0,
      "time_end": 0,
      "channel": 0
    }
  ]
}

Properties

Name Type Optional Description
data [TranscriptionResult] true Transcriptions of the audio
path string true Currently unused
error string true The error that occurred while working on this job.

AsrStatusResponse

Example

{
  "job_id": "2fec34e1efb146f7a7431cb35b64550d",
  "status": "complete",
  "created_at": "2019-08-24T14:15:22Z",
  "modified_at": "2019-08-24T14:15:22Z",
  "progress": {
    "total": 0,
    "details": {
      "transfer": 0,
      "transcribe": 0
    }
  }
}

Properties

Name Type Optional Description
job_id string(uuid) false The unique identifier of a job.
status JobStatus false Status of a job's overall progress.
created_at string(date-time) false The time when the job was created.
modified_at string(date-time) false The time when the job was last modified.
progress LongAsrProgress true Progress of the ASR job.

JobErrorResult

Example

{
  "error": "string"
}

Properties

Name Type Optional Description
error string false The error that occurred

JobStatus

Status of a job's overall progress

Example

"complete"

Properties

Name Type Optional Description
JobStatus string true Status of a job's overall progress
Enumerated Values
Property Value
JobStatus complete
JobStatus created
JobStatus queued
JobStatus in_progress
JobStatus failed
JobStatus cancelled
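
A client that polls a job usually needs to know when to stop. The sketch below models the enumerated values as a Python enum and adds a check for statuses that are assumed to be terminal. This page does not state which statuses are terminal, so the set used here (complete, failed, cancelled) is an assumption, as is the helper name `is_terminal`.

```python
from enum import Enum


class JobStatus(str, Enum):
    """The enumerated JobStatus values listed above."""
    COMPLETE = "complete"
    CREATED = "created"
    QUEUED = "queued"
    IN_PROGRESS = "in_progress"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Assumption: a job in one of these statuses will not change again.
TERMINAL_STATUSES = frozenset(
    {JobStatus.COMPLETE, JobStatus.FAILED, JobStatus.CANCELLED}
)


def is_terminal(status: str) -> bool:
    """Hypothetical helper: True if polling can stop for this status."""
    # JobStatus(status) raises ValueError for values outside the enumeration.
    return JobStatus(status) in TERMINAL_STATUSES


for status in ("queued", "in_progress", "complete"):
    print(status, is_terminal(status))
```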

LongAsrProgress

Progress of the ASR job.

Example

{
  "total": 0,
  "details": {
    "transfer": 0,
    "transcribe": 0
  }
}

Properties

Name Type Optional Description
total number false -
details LongAsrStages false Progress of each stage of the processing.

LongAsrStages

Progress of each stage of the processing.

Example

{
  "transfer": 0,
  "transcribe": 0
}

Properties

Name Type Optional Description
transfer number false -
transcribe number false -

PaginationInfo

Example

{
  "page": 0,
  "page_size": 0,
  "page_count": 0,
  "item_count": 0
}

Properties

Name Type Optional Description
page integer false -
page_size integer false -
page_count integer false -
item_count integer false -

SortableFieldAsr

An enumeration.

Example

"time"

Properties

Name Type Optional Description
SortableFieldAsr string true An enumeration.
Enumerated Values
Property Value
SortableFieldAsr time
SortableFieldAsr label

SttCompleted

Example

{
  "job_id": "063c64da-180d-731e-8000-d11a28529080",
  "created_at": "2023-01-17T14:26:25.505439",
  "modified_at": "2023-01-17T14:26:26.619771",
  "model": {
    "name": "stt-general",
    "label": "ASR General",
    "domain": "general",
    "acoustic": "recording",
    "channels": 1,
    "language": "Bahasa Indonesia",
    "samplerate": 16000
  },
  "result": {
    "data": [
      {
        "final": true,
        "channel": 0,
        "time_end": 6.470000009536743,
        "time_start": 1,
        "transcript": "hasil dari pekerjaan ini cukup memuaskan",
        "speaker_tag": 1
      }
    ]
  },
  "request": {
    "uri": "https://example.domain.name/media/example_audio.wav",
    "label": "Example audio",
    "channels": 1,
    "duration": 6.766625,
    "mime_type": "audio/wav",
    "sample_rate": 8000
  },
  "job_config": {
    "wait": false,
    "engine": "stt-general",
    "speaker_count": 1,
    "include_filler": false,
    "include_partial_results": false
  }
}

Properties

Name Type Optional Description
job_id string(uuid) false Unique Identifier of a job
created_at string(date-time) false The time when the job was created
modified_at string(date-time) false The time when the job was last modified
request AsrRequest true Request submitted for the ASR job.
result AsrResult true Result of the ASR job. Each segment in the result is one of two types:
- final: The result returned is final for the given segment.
- partial: The engine has not detected the end of the segment. For partial results, both time_start and time_end are 0.0.
job_config AsrConfig true Configuration for ASR job execution
model AsrModel true Selected ASR model.

SttFailed

Example

{
  "job_id": "063c64da-180d-731e-8000-d11a28529080",
  "created_at": "2019-08-24T14:15:22Z",
  "modified_at": "2019-08-24T14:15:22Z",
  "request": {
    "uri": "https://example.domain.name/media/example_audio.wav",
    "label": "Example audio",
    "channels": 1,
    "duration": 6.766625,
    "mime_type": "audio/wav",
    "sample_rate": 8000
  },
  "job_config": {
    "wait": false,
    "engine": "stt-general",
    "speaker_count": 1,
    "include_filler": false,
    "include_partial_results": false
  },
  "model": {
    "name": "stt-general",
    "label": "ASR General",
    "domain": "general",
    "acoustic": "recording",
    "channels": 1,
    "language": "Bahasa Indonesia",
    "samplerate": 16000
  },
  "result": {
    "error": "string"
  }
}

Properties

Name Type Optional Description
job_id string(uuid) false Unique Identifier of a job
created_at string(date-time) false The time when the job was created
modified_at string(date-time) false The time when the job was last modified
request AsrRequest false Request submitted for the ASR job.
job_config AsrConfig false The configuration of the ASR job.
model AsrModel false Selected ASR model.
result JobErrorResult false -

TranscriptionResult

Speech segment transcribed from the audio.

Example

{
  "transcript": "Hasil akhir dari pekerjaan ini cukup memuaskan",
  "final": true,
  "time_start": 0,
  "time_end": 3.6,
  "channel": 0
}

Properties

Name Type Optional Description
transcript string false Text resulting from the transcription process.
final boolean false Indicates that the piece of transcription is complete and is not cut in the middle of a sentence.
time_start number false Starting time relative to the start of the audio.
time_end number false Ending time relative to the start of the audio.
channel integer false Channel from which the result was transcribed.
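
For clients that prefer typed objects over raw dictionaries, a segment can be mapped onto a small dataclass. The sketch below is one possible shape. The class layout, the `duration` property, and the `parse_segment` helper are illustrative assumptions, and keys outside the documented schema are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptionResult:
    """Typed view of one transcribed speech segment."""
    transcript: str
    final: bool
    time_start: float
    time_end: float
    channel: int

    @property
    def duration(self) -> float:
        # Length of the segment, in the same unit as time_start and time_end.
        return self.time_end - self.time_start


def parse_segment(raw: dict) -> TranscriptionResult:
    """Hypothetical helper: build a TranscriptionResult from a decoded JSON object."""
    # Only the documented fields are read; extra keys are ignored.
    names = ("transcript", "final", "time_start", "time_end", "channel")
    return TranscriptionResult(**{name: raw[name] for name in names})


segment = parse_segment({
    "transcript": "Hasil akhir dari pekerjaan ini cukup memuaskan",
    "final": True,
    "time_start": 0,
    "time_end": 3.6,
    "channel": 0,
})
print(f"[{segment.time_start:.1f}-{segment.time_end:.1f}] "
      f"ch{segment.channel}: {segment.transcript}")
```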