Models are continually trained and fine-tuned based on user and customer feedback. Check back often, as we push changes frequently.

Rime currently has three models in production: arcana, mistv2, and mist.

Arcana, released April 2025, is Rime’s most expressive and lifelike TTS model to date. It pushes the boundary of naturalness and emotional depth in synthesized speech.

  • Highly expressive, natural-sounding speech with emotional nuance
  • Fine-grained control over prosody, pacing, and tone
  • Supports a wide range of vocal demographics, including different ages, accents, and cultural backgrounds
  • Enhanced realism for dynamic, conversational, and character-driven use cases
  • Available via modelId: arcana through Rime’s API endpoints

Mistv2, released February 2025, has the following features:

  • Multilingual: English and Spanish, with more languages coming soon
  • More realistic speech with natural and contextual nuances
  • Advanced pronunciation control
  • Ultra-fast on-prem latency of ~70ms, perfect for real-time applications
  • More accents, demographics, and speaking styles

Mist, released April 2023, is Rime’s next-generation TTS engine, capable of synthesizing conversational speech. To synthesize speech with this newer family of models, set the modelId parameter on Rime’s TTS endpoints to mistv2 or mist. As of February 2025, modelId defaults to mist when unspecified.
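
The following is a minimal sketch of how a client might select a model, not an official Rime client. It only builds a JSON request body and makes no network call. The modelId field, the three model names, and the mist default come from this page; the helper names (resolve_model_id, build_tts_payload) and the "text" field are assumptions made for illustration, so check them against the API reference before use.

```python
import json

# Model IDs this page lists as being in production.
PRODUCTION_MODELS = ("arcana", "mistv2", "mist")

# Per this page, the value used when modelId is unspecified (as of February 2025).
DEFAULT_MODEL = "mist"


def resolve_model_id(model_id=None):
    """Return the model ID to send, mirroring the documented default.

    Hypothetical helper: the validation happens client-side only.
    """
    if model_id is None:
        return DEFAULT_MODEL
    if model_id not in PRODUCTION_MODELS:
        raise ValueError(f"unknown modelId: {model_id!r}")
    return model_id


def build_tts_payload(text, model_id=None):
    """Build a JSON request body.

    The "text" field name is an assumption; only "modelId" is taken
    from this page.
    """
    return json.dumps({"text": text, "modelId": resolve_model_id(model_id)})
```

Calling build_tts_payload("Hello") yields a body that names mist explicitly, while build_tts_payload("Hello", "arcana") selects Arcana. Sending modelId explicitly rather than relying on the default keeps behavior stable if the default changes.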

Model v1 was released in April 2022 and has been deprecated.