WebSocket server providing various speech-related services.
Save a recording with no additional processing.
Available only on servers:
Accepts one of the following messages:
Start a recording.
{
  "text": "string",
  "language": "string",
  "age_group": "string",
  "prompt_id": 0,
  "text_index": 0,
  "asr": null
}
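As a minimal sketch, the start message could be built and serialized as shown below. The field names come from the sample payload; the `StartRecording` class, the inferred types, and the example values (`"en"`, `"adult"`) are assumptions for illustration, since the accepted values are not listed here.

```python
import json
from dataclasses import dataclass, asdict
from typing import Any, Optional


@dataclass
class StartRecording:
    """Hypothetical helper; fields mirror the sample payload, types are inferred."""
    text: str
    language: str
    age_group: str
    prompt_id: int = 0
    text_index: int = 0
    asr: Optional[Any] = None  # the sample shows null; its real shape is not documented here

    def to_json(self) -> str:
        # Serialize with the exact key names used in the sample payload.
        return json.dumps(asdict(self))


# Hypothetical values, for illustration only.
start = StartRecording(text="The quick brown fox", language="en", age_group="adult")
payload = start.to_json()
```

Keeping the defaults for `prompt_id`, `text_index`, and `asr` identical to the sample makes it easy to compare the serialized output with the documented shape.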
Write audio to a recording.
Send WAV data to the recording.
16-bit signed integer PCM WAV data. The minimum sample rate is 16 kHz. Higher sample rates are downsampled, but they are still preferred because the resulting quality can be higher. Stereo audio is downmixed to mono.
{
  "data": []
}
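The audio constraints above can be checked on the client before anything is sent. The sketch below uses Python's standard `wave` module to verify the sample width and sample rate, then splits the bytes into `"data"` messages. Encoding each chunk as a list of byte values is an assumption (the sample only shows an empty array), and `check_wav`, `audio_messages`, `make_silence`, and the chunk size are hypothetical names and values.

```python
import io
import json
import wave

MIN_SAMPLE_RATE = 16_000  # minimum sample rate stated above
SAMPLE_WIDTH_BYTES = 2    # 16-bit PCM


def check_wav(wav_bytes: bytes) -> dict:
    """Read the WAV header and verify it matches the documented constraints."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        info = {
            "channels": wav.getnchannels(),
            "sample_width": wav.getsampwidth(),
            "sample_rate": wav.getframerate(),
        }
    if info["sample_width"] != SAMPLE_WIDTH_BYTES:
        raise ValueError("expected 16-bit samples")
    if info["sample_rate"] < MIN_SAMPLE_RATE:
        raise ValueError("sample rate below 16 kHz")
    return info


def audio_messages(wav_bytes: bytes, chunk_size: int = 4096):
    """Yield JSON messages; sending bytes as a list of ints is an assumption."""
    for offset in range(0, len(wav_bytes), chunk_size):
        chunk = wav_bytes[offset:offset + chunk_size]
        yield json.dumps({"data": list(chunk)})


def make_silence(seconds: float = 0.1, rate: int = 16_000) -> bytes:
    """Build a small mono 16-bit WAV in memory, used here only as demo input."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buffer.getvalue()


demo = make_silence()
info = check_wav(demo)
messages = list(audio_messages(demo))
```

Rejecting unsupported audio locally avoids starting a recording that the server would not be able to use.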
End the recording
End and save the recording.
Get feedback on how well a given prompt was read.
Available only on servers:
Accepts one of the following messages:
Start a recording.
{
  "text": "string",
  "language": "string",
  "age_group": "string",
  "prompt_id": 0,
  "text_index": 0,
  "asr": null
}
Write audio to a recording.
Send WAV data to the recording.
16-bit signed integer PCM WAV data. The minimum sample rate is 16 kHz. Higher sample rates are downsampled, but they are still preferred because the resulting quality can be higher. Stereo audio is downmixed to mono.
{
  "data": []
}
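To illustrate what converting stereo to mono means for 16-bit PCM, the sketch below averages each interleaved left/right sample pair. This is one common approach, shown only as an assumption; how the server actually performs the conversion is not documented here, and `downmix_to_mono` is a hypothetical name.

```python
import struct


def downmix_to_mono(stereo_pcm: bytes) -> bytes:
    """Average interleaved left/right little-endian 16-bit samples into mono."""
    if len(stereo_pcm) % 4:
        raise ValueError("stereo 16-bit data must be a multiple of 4 bytes")
    count = len(stereo_pcm) // 2
    samples = struct.unpack(f"<{count}h", stereo_pcm)
    # Each frame holds two samples: left at even indices, right at odd indices.
    mono = [(samples[i] + samples[i + 1]) // 2 for i in range(0, count, 2)]
    return struct.pack(f"<{len(mono)}h", *mono)


# Two stereo frames: (100, 200) and (-300, -100).
stereo = struct.pack("<4h", 100, 200, -300, -100)
mono_samples = struct.unpack("<2h", downmix_to_mono(stereo))
```

The function operates on raw sample data only, so any WAV header would have to be stripped first.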
End the recording
End and save the recording.
Real-time feedback on the prompt.
Available only on servers:
Accepts the following message:
Prompt feedback
Feedback on how well the speaker matches the prompt. This can include omitted words and pronunciation changes.
{
  "sentenceConfidence": 0,
  "startTime": 0,
  "endTime": 0,
  "sentenceIndex": 0,
  "upsample_p": 0,
  "type": "EOS",
  "labels": {
    "word": "string",
    "start": 0,
    "end": 0,
    "conf": 0,
    "type": "string",
    "phones": "string",
    "textIndex": 0,
    "clipping": true
  }
}
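A client might turn a feedback message into typed objects before acting on it. The sketch below parses the sample payload and lists words whose `conf` falls below a threshold. The `WordLabel` class, the helper names, and the 0.5 threshold are assumptions, as is the handling of `labels` as either a single object (as in the sample) or a list; the scale of `conf` is not documented here.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class WordLabel:
    word: str
    start: float
    end: float
    conf: float
    type: str
    phones: str
    text_index: int
    clipping: bool


def parse_labels(message: dict) -> List[WordLabel]:
    """Accept "labels" as one object (as in the sample) or as a list of objects."""
    raw = message.get("labels") or []
    if isinstance(raw, dict):
        raw = [raw]
    return [
        WordLabel(
            word=item["word"], start=item["start"], end=item["end"],
            conf=item["conf"], type=item["type"], phones=item["phones"],
            text_index=item["textIndex"], clipping=item["clipping"],
        )
        for item in raw
    ]


def low_confidence_words(message: dict, threshold: float = 0.5) -> List[str]:
    """Return words whose "conf" is below a caller-chosen threshold."""
    return [label.word for label in parse_labels(message) if label.conf < threshold]


# The sample payload from above, parsed as-is.
sample = json.loads("""
{
  "sentenceConfidence": 0, "startTime": 0, "endTime": 0,
  "sentenceIndex": 0, "upsample_p": 0, "type": "EOS",
  "labels": {
    "word": "string", "start": 0, "end": 0, "conf": 0,
    "type": "string", "phones": "string", "textIndex": 0, "clipping": true
  }
}
""")
labels = parse_labels(sample)
flagged = low_confidence_words(sample)
```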
Get a transcription of the audio.
Available only on servers:
Accepts one of the following messages:
Start a recording.
{
  "text": "string",
  "language": "string",
  "age_group": "string",
  "prompt_id": 0,
  "text_index": 0,
  "asr": null
}
Start a diarization.
{
  "text": "string",
  "language": "string",
  "age_group": "string",
  "prompt_id": 0,
  "text_index": 0,
  "asr": null
}
Write audio to a recording.
Send WAV data to the recording.
16-bit signed integer PCM WAV data. The minimum sample rate is 16 kHz. Higher sample rates are downsampled, but they are still preferred because the resulting quality can be higher. Stereo audio is downmixed to mono.
{
  "data": []
}
End the recording
End and save the recording.
Real-time transcription of the recording.
Available only on servers:
Accepts the following message:
Transcription
Transcription of a given recording.