`.`, `?`, `!`. This is most pertinent for the initial messages sent to the API, as synthesis won’t begin until there are sufficient tokens to generate audio with natural prosody. After the first synthesis of any given utterance, typically enough time has elapsed that subsequent audio contains multiple clauses, and the buffering becomes largely invisible.
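The buffering behaviour described above can be pictured with a small client-side simulation. This is only a sketch under stated assumptions, not Rime's server implementation: the function name `segment_stream` and the constant `TERMINATORS` are hypothetical, and the rule "flush at every `.`, `?`, or `!`" is a simplification (for example, it would also split on the period in a decimal number).

```python
# Hypothetical illustration of punctuation-based buffering; not Rime's actual code.
TERMINATORS = ".?!"  # characters assumed to end a synthesizable segment


def segment_stream(chunks):
    """Split streamed text chunks into complete segments.

    Returns (segments, pending): segments are the pieces that ended in a
    terminator, pending is the text still waiting for one.
    """
    segments = []
    pending = ""
    for chunk in chunks:
        for char in chunk:
            pending += char
            if char in TERMINATORS:
                # A terminator arrived: this segment could now be synthesized.
                segments.append(pending.strip())
                pending = ""
    return segments, pending


segments, pending = segment_stream(["Hello", " there", ". How are", " you? I am"])
print(segments)       # ['Hello there.', 'How are you?']
print(repr(pending))  # ' I am'
```

In this sketch the trailing `" I am"` stays buffered because no terminator has arrived yet, which mirrors why the first audio of an utterance can lag behind the first tokens sent.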
`contextId: null`, and the audio for the second will be tagged with its UUID.
`mistv2` for Rime’s fastest, most accurate, and most customizable model, or `mist` for Rime’s earlier model (default: `mist`)
`mp3`, `mulaw`, or `pcm`