The image above plots both response time and TTFB for requests to our streaming PCM endpoint, using the request body shown below.
The `mistv2` and `mist` models are faster across the board than our v1 models (which were deprecated in February 2025). Use the `modelId` parameter to select a Mist model for TTS inference. You can also set `reduceLatency`, which turns off text normalization and reduces the computation needed to prepare input text for TTS inference. This is safe to use when the input contains no digits, abbreviations, or tricky punctuation, e.g. "Yes, I grew up on one twenty-three Main Street in Oakland, California." instead of "Yes, I grew up on 123 Main St. in Oakland, CA."
You can also reduce latency by requesting a lower output sample rate via the `samplingRate` parameter.
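To make the options above concrete, here is a minimal sketch of a request body that combines them. The endpoint URL, the `speaker` field, and the exact accepted values are assumptions for illustration; the parameter names `modelId`, `reduceLatency`, and `samplingRate` come from the text above. Consult the API reference for the authoritative schema.

```python
import json

# Hypothetical request body for the streaming PCM TTS endpoint.
# Field names other than modelId, reduceLatency, and samplingRate
# are assumptions, not the documented schema.
payload = {
    "speaker": "example_speaker",  # assumed field name
    # Pre-normalized text: no digits or abbreviations, so skipping
    # text normalization is safe here.
    "text": "Yes, I grew up on one twenty-three Main Street "
            "in Oakland, California.",
    "modelId": "mistv2",       # select the newer, faster model family
    "reduceLatency": True,     # turn off text normalization
    "samplingRate": 22050,     # lower rates mean less audio data per second
}

print(json.dumps(payload, indent=2))
```

The body would then be POSTed to the streaming endpoint and the PCM audio consumed chunk by chunk as it arrives.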