WebSocket server providing various speech-related services.
Save a recording with no additional processing.
Accepts one of the following messages:
Start a recording.
Start a recording.
{
"text": "string",
"language": "string",
"age_group": "string",
"prompt_id": 0,
"text_index": 0,
"asr": null
}
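As a minimal sketch, a client could assemble and serialize the start message like this. The helper name `build_start_message` and the example values for `language` and `age_group` are placeholders, and sending the payload as a JSON text frame is an assumption; this documentation does not list the accepted values or the framing.

```python
import json
from typing import Any, Optional


def build_start_message(
    text: str,
    language: str,
    age_group: str,
    prompt_id: int = 0,
    text_index: int = 0,
    asr: Optional[Any] = None,
) -> str:
    """Serialize a start message using the field names from the schema above."""
    payload = {
        "text": text,
        "language": language,
        "age_group": age_group,
        "prompt_id": prompt_id,
        "text_index": text_index,
        "asr": asr,  # serialized as JSON null when left as None
    }
    return json.dumps(payload)


# "en" and "adult" are placeholder values, not documented options.
message = build_start_message("The cat sat on the mat.", "en", "adult")
```

The resulting string would then be passed to whichever WebSocket client library is in use.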
Write audio to a recording.
Send WAV data to the recording.
16-bit signed integer PCM WAV data. The minimum sample rate is 16 kHz. Higher sample rates are downsampled, but are still preferred because the resulting quality can be higher. Stereo audio is downmixed to mono.
{
"data": []
}
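The audio constraints described above can be checked on the client before any data is sent. The following is a sketch using Python's standard `wave` module; the function names `check_wav` and `make_test_wav` are hypothetical, and how the audio is packed into the `data` array is not specified here, so that step is deliberately left out.

```python
import io
import wave

MIN_SAMPLE_RATE_HZ = 16_000  # minimum sample rate stated in the description
SAMPLE_WIDTH_BYTES = 2       # 16-bit samples


def check_wav(wav_bytes: bytes) -> dict:
    """Read the WAV header and verify it meets the documented constraints."""
    # wave.open only accepts PCM-encoded WAV and raises wave.Error otherwise.
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        info = {
            "channels": wav.getnchannels(),
            "sample_width": wav.getsampwidth(),
            "sample_rate": wav.getframerate(),
        }
    if info["sample_width"] != SAMPLE_WIDTH_BYTES:
        raise ValueError("expected 16-bit PCM samples")
    if info["sample_rate"] < MIN_SAMPLE_RATE_HZ:
        raise ValueError("sample rate must be at least 16 kHz")
    return info


def make_test_wav(sample_rate: int, channels: int = 1) -> bytes:
    """Build a short silent WAV in memory, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * channels * 160)
    return buffer.getvalue()


info = check_wav(make_test_wav(44_100, channels=2))
```

Stereo and higher-rate input passes this check because the server is described as converting both; only audio below 16 kHz or with a different sample width is rejected.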
End the recording.
End and save the recording.
Get feedback on how well a given prompt was read.
Accepts one of the following messages:
Start a recording.
Start a recording.
{
"text": "string",
"language": "string",
"age_group": "string",
"prompt_id": 0,
"text_index": 0,
"asr": null
}
Write audio to a recording.
Send WAV data to the recording.
16-bit signed integer PCM WAV data. The minimum sample rate is 16 kHz. Higher sample rates are downsampled, but are still preferred because the resulting quality can be higher. Stereo audio is downmixed to mono.
{
"data": []
}
End the recording.
End and save the recording.
Real-time feedback on the prompt.
Accepts the following message:
Prompt feedback
Feedback on how well the speaker matches the prompt. This can include omitted words and pronunciation changes.
{
"sentenceConfidence": 0,
"startTime": 0,
"endTime": 0,
"sentenceIndex": 0,
"upsample_p": 0,
"type": "EOS",
"labels": {
"word": "string",
"start": 0,
"end": 0,
"conf": 0,
"type": "string",
"phones": "string",
"textIndex": 0,
"clipping": true
}
}
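One way a client might consume a feedback message is to pick out the words the server was least confident about. This is only a sketch: the function name `low_confidence_words` and the 0.5 threshold are hypothetical, the scale of `conf` is not documented here, and although the schema shows `labels` as a single object, the code also accepts a list in case several words are reported.

```python
import json
from typing import Any, Dict, List

# Hypothetical cut-off; the range of "conf" is not stated in this documentation.
LOW_CONFIDENCE = 0.5


def low_confidence_words(raw_message: str, threshold: float = LOW_CONFIDENCE) -> List[str]:
    """Return the words in a feedback message whose confidence is below the threshold."""
    feedback: Dict[str, Any] = json.loads(raw_message)
    labels = feedback.get("labels") or []
    # Normalize a single label object into a one-element list.
    if isinstance(labels, dict):
        labels = [labels]
    return [label["word"] for label in labels if label.get("conf", 0) < threshold]


# Illustrative message; the values are made up for the example.
example = json.dumps({
    "sentenceConfidence": 0.8,
    "sentenceIndex": 0,
    "type": "EOS",
    "labels": [
        {"word": "cat", "conf": 0.9, "textIndex": 1},
        {"word": "sat", "conf": 0.2, "textIndex": 2},
    ],
})
flagged = low_confidence_words(example)
```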
Get a transcription of the audio.
Accepts one of the following messages:
Start a recording.
Start a recording.
{
"text": "string",
"language": "string",
"age_group": "string",
"prompt_id": 0,
"text_index": 0,
"asr": null
}
Start a diarization.
Start a diarization.
{
"text": "string",
"language": "string",
"age_group": "string",
"prompt_id": 0,
"text_index": 0,
"asr": null
}
Write audio to a recording.
Send WAV data to the recording.
16-bit signed integer PCM WAV data. The minimum sample rate is 16 kHz. Higher sample rates are downsampled, but are still preferred because the resulting quality can be higher. Stereo audio is downmixed to mono.
{
"data": []
}
End the recording.
End and save the recording.
Real-time transcription of the recording.
Accepts the following message:
Transcription
Transcription of a given recording.