Rime exposes a broad portfolio of production-ready voices across our models. Each voice is designed for real-time conversational performance, with distinct tonal identity, emotional range, and demographic diversity. As of 4 February 2026, Rime supports the following languages:
  • Arabic lang=ara or lang=ar (Arcana only)
  • English lang=eng or lang=en
  • French lang=fra or lang=fr
  • German lang=ger or lang=de
  • Hebrew lang=heb or lang=he (Arcana only)
  • Hindi lang=hin or lang=hi (Arcana only)
  • Japanese lang=jpn or lang=ja (Arcana only)
  • Portuguese lang=por or lang=pt (Arcana only)
  • Spanish lang=spa or lang=es
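
The language list above can be captured as a small lookup table, which is handy for validating a request before sending it. The following is a minimal sketch, not part of any Rime SDK: the `ARCANA` and `MIST` labels, the helper names, and the reading that languages not marked "Arcana only" are also available on other models are all assumptions made for illustration.

```python
# Language codes as listed above: (three-letter code, two-letter code, Arcana only?)
SUPPORTED_LANGUAGES = {
    "Arabic": ("ara", "ar", True),
    "English": ("eng", "en", False),
    "French": ("fra", "fr", False),
    "German": ("ger", "de", False),
    "Hebrew": ("heb", "he", True),
    "Hindi": ("hin", "hi", True),
    "Japanese": ("jpn", "ja", True),
    "Portuguese": ("por", "pt", True),
    "Spanish": ("spa", "es", False),
}

# Hypothetical model labels used only inside this sketch.
ARCANA = "arcana"
MIST = "mist"


def normalize_lang(code: str) -> str:
    """Map either accepted form (e.g. "de" or "ger") to the three-letter code."""
    cleaned = code.strip().lower()
    for three, two, _ in SUPPORTED_LANGUAGES.values():
        if cleaned in (three, two):
            return three
    raise ValueError(f"unsupported language code: {code!r}")


def is_supported(code: str, model: str) -> bool:
    """Return True if the language is usable with the given model label.

    Assumption: a language not marked "Arcana only" works on every model.
    """
    three = normalize_lang(code)
    for candidate, _, arcana_only in SUPPORTED_LANGUAGES.values():
        if candidate == three:
            return model == ARCANA or not arcana_only
    return False
```

For example, `is_supported("ja", MIST)` is false under these assumptions because Japanese is marked Arcana only, while `is_supported("en", MIST)` is true.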

Arcana v3 Voices (Flagship)

Arcana v3 is Rime’s most expressive and human-realistic text-to-speech model. The voices in Arcana are designed to sound natural in live conversation — capturing pacing, rhythm, breath, subtle emotional shifts, and multilingual fluency without sacrificing responsiveness. Arcana v3 supports native multilingual speech and code-switching.

The 94 Arcana Flagship Voices

Arcana v3 includes 94 flagship voices spanning a wide tonal and demographic spectrum. Rather than optimizing for a single “neutral assistant” sound, Arcana voices are intentionally distinct and characterful. At a high level, the range includes:
  • Age diversity — young adult through older adult voices
  • Regional variety — American regional accents (Southern, Midwestern, West Coast, East Coast), as well as international English and native speakers across supported languages
  • Cultural representation — African American, Latina/o, Asian American, and other globally representative identities
  • Gender spectrum — balanced representation across male and female voices, including stylistic variation within each
  • Energy profiles — from calm and grounded to upbeat, playful, warm, authoritative, or highly conversational
  • Use-case fit — voices well-suited for IVR, customer support, healthcare, fintech, education, media narration, and expressive AI companions
Arcana voices are built to maintain voice identity even when switching languages mid-utterance, making them especially powerful for global conversational applications.

Mist v2 Voices

Mist v2 is optimized for speed, precision, and control. While Arcana emphasizes expressive realism, Mist focuses on clarity, pronunciation accuracy, and high-throughput production environments. Mist voices are:
  • Highly intelligible and consistent
  • Well-suited for structured IVR systems and high-volume applications
  • Optimized for fast synthesis and predictable delivery
  • Designed to handle complex proper nouns, brand names, and domain-specific terminology cleanly
Mist v2 provides a curated set of dependable, production-stable voices that prioritize control and scalability.

Accessing Voices

Detailed voice metadata is available. Use the speaker parameter in the TTS API to select a voice.
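
As a rough illustration of voice selection, the sketch below builds a JSON request body that carries the speaker parameter. Only `speaker` and `lang` come from this page; the `text` and `modelId` field names, the placement of `lang` in the body, the `build_tts_payload` helper, and the voice name `example_voice` are hypothetical placeholders. Consult the TTS API reference for the actual request shape.

```python
import json


def build_tts_payload(text, speaker, lang="eng", model_id=None):
    """Assemble a request body that selects a voice via the speaker parameter.

    Field names other than "speaker" and "lang" are assumptions for this sketch.
    """
    payload = {"text": text, "speaker": speaker, "lang": lang}
    if model_id is not None:
        # Hypothetical field for choosing a model; omitted when not given.
        payload["modelId"] = model_id
    return payload


# "example_voice" is a placeholder, not a real voice name.
body = json.dumps(build_tts_payload("Hello from Rime.", speaker="example_voice"))
```

Keeping payload construction in one helper makes it easy to swap the speaker value per conversation without touching the rest of the request code.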