Rime provides regional API endpoints so you can route requests to the data center closest to your application’s deployment, reducing network latency.

HTTP endpoints

Use these endpoints for standard HTTP and SSE requests:
| Endpoint | Region |
| --- | --- |
| https://users.rime.ai | US West, default (alias for users-west) |
| https://users-west.rime.ai | US West (us-west-2) |
| https://users-east.rime.ai | US East (us-east-1) |
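
As a minimal sketch, an application could keep these base URLs in one lookup keyed by region. The function name `http_base_url` and the dictionary keys are illustrative assumptions, not part of any Rime SDK; only the URLs come from the table above.

```python
from typing import Optional

# Base URLs copied from the table above. The keys are illustrative labels
# chosen for this sketch, not identifiers defined by Rime.
HTTP_BASE_URLS = {
    "default": "https://users.rime.ai",  # alias for users-west
    "us-west-2": "https://users-west.rime.ai",
    "us-east-1": "https://users-east.rime.ai",
}


def http_base_url(region: Optional[str] = None) -> str:
    """Return the base URL for a region, or the default endpoint if the
    region is missing or not listed."""
    if region is None:
        return HTTP_BASE_URLS["default"]
    return HTTP_BASE_URLS.get(region, HTTP_BASE_URLS["default"])


print(http_base_url("us-east-1"))
print(http_base_url())
```

Falling back to the default endpoint for an unlisted region is a design choice made for this sketch; raising an error instead may be preferable if a silent fallback to US West would hide a misconfiguration.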

WebSocket endpoints

Use these endpoints for real-time streaming over WebSocket. The regional routing pattern applies to every WebSocket path (/ws, /ws2, /ws3). The examples below use /ws3 because it is Rime’s flagship WebSocket endpoint.
| Endpoint | Region |
| --- | --- |
| wss://users-ws.rime.ai/ws3 | US West (us-west-2) |
| wss://users-east-ws.rime.ai/ws3 | US East (us-east-1) |
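
Because the same regional hosts serve every WebSocket path, the URL can be assembled from a region and a path. The sketch below assumes a hypothetical helper named `ws_url`; the hosts and paths are the ones listed in this section, and nothing here opens a connection.

```python
# Hosts and paths taken from this section. The helper itself is a
# hypothetical convenience, not part of a Rime client library.
WS_HOSTS = {
    "us-west-2": "users-ws.rime.ai",
    "us-east-1": "users-east-ws.rime.ai",
}
WS_PATHS = ("/ws", "/ws2", "/ws3")


def ws_url(region: str, path: str = "/ws3") -> str:
    """Build the wss:// URL for a region and WebSocket path."""
    if region not in WS_HOSTS:
        raise ValueError(f"unknown region: {region!r}")
    if path not in WS_PATHS:
        raise ValueError(f"unknown WebSocket path: {path!r}")
    return f"wss://{WS_HOSTS[region]}{path}"


print(ws_url("us-east-1"))
print(ws_url("us-west-2", "/ws2"))
```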

Typical network latency

Use these rough round-trip times (RTT), the back-and-forth network delay between your application and Rime, to pick the closest endpoint.
| From metro | To US East | To US West |
| --- | --- | --- |
| East Coast (NYC, DC, Boston, Atlanta) | 5–25 ms | 60–85 ms |
| Midwest / South (Chicago, Dallas, Denver) | 25–55 ms | 35–65 ms |
| West Coast (SF, LA, Seattle) | 60–85 ms | 5–25 ms |

Rules of thumb:
  • Same region (your app and the Rime endpoint in the same AWS region): typically 1–10 ms.
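  • Coast-to-coast: ~60 ms is the practical floor. Light in fiber alone needs roughly 40 ms to cover ~2,500 miles and back in a straight line, and real fiber paths are longer and pass through routing equipment. 60–85 ms is normal.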
  • Above 90 ms between major US metros usually points to a suboptimal network route, not Rime.
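
The coast-to-coast figure can be sanity-checked with a little arithmetic. The sketch below assumes a refractive index of about 1.47 for silica fiber (a typical textbook value, not a number from this page) and a straight-line path. Under those assumptions the propagation delay alone comes to roughly 40 ms round trip; real fiber routes are longer than the straight-line distance and add switching delay, which is consistent with observed values of 60 ms and up.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.47      # assumed typical value for silica fiber
KM_PER_MILE = 1.609344


def fiber_rtt_ms(distance_miles: float) -> float:
    """Round-trip propagation delay over a straight fiber run, in ms.

    Ignores routing detours, switching, and queuing, so it is a lower
    bound rather than a prediction.
    """
    distance_km = distance_miles * KM_PER_MILE
    speed_km_s = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    one_way_s = distance_km / speed_km_s
    return 2 * one_way_s * 1000


print(f"{fiber_rtt_ms(2500):.1f} ms")
```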
For voice AI: if you serve users nationwide from a single region, far-coast users will see 60–85 ms of network round-trip time on top of Rime’s inference time. To stay comfortably under 200 ms end-to-end, route East Coast users to US East and West Coast users to US West. For the full latency picture, including Rime’s model-inference time, streaming time to first byte (TTFB), and the H100 benchmark numbers, see Latency.
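
To make the budget concrete, the sketch below subtracts the network round trip from the 200 ms target to show how much time is left for everything else. Treating end-to-end latency as network RTT plus a single remainder is a simplification made for this example, and the function name is hypothetical.

```python
END_TO_END_TARGET_MS = 200  # target named in the text above


def remaining_budget_ms(network_rtt_ms: float,
                        target_ms: float = END_TO_END_TARGET_MS) -> float:
    """Time left for inference and other processing after the network RTT."""
    return target_ms - network_rtt_ms


# RTT ranges taken from the latency table in this section.
rtt_ranges_ms = {"near coast": (5, 25), "far coast": (60, 85)}

for label, (low, high) in rtt_ranges_ms.items():
    worst = remaining_budget_ms(high)
    best = remaining_budget_ms(low)
    print(f"{label}: {worst} to {best} ms left")
```

Under these assumptions, a far-coast user leaves 115 to 140 ms for everything besides the network, compared with 175 to 195 ms for a near-coast user.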

Choosing a region

Choose the endpoint in the region closest to where your application is deployed. For example:
  • If your server runs in AWS us-west-2, use users-west.rime.ai / users-ws.rime.ai.
  • If your server runs in AWS us-east-1, use users-east.rime.ai / users-east-ws.rime.ai.
Use the rime speedtest command to measure TTFB against each endpoint from your machine.
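
If you also want a rough network-only comparison from inside application code, one option is to time a TCP handshake to each regional host. This is a sketch under stated assumptions: it measures connection setup on port 443 (and DNS resolution on the first attempt), not TTFB, so it does not replace rime speedtest. The helper names are hypothetical.

```python
import socket
import statistics
import time

# HTTP hosts from this page, keyed by AWS region.
HOSTS = {
    "us-west-2": "users-west.rime.ai",
    "us-east-1": "users-east.rime.ai",
}


def tcp_connect_ms(host: str, port: int = 443, timeout: float = 3.0) -> float:
    """Time one TCP connection to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000


def probe(hosts=HOSTS, attempts: int = 5) -> dict:
    """Collect several connect timings per region (makes real connections)."""
    return {
        region: [tcp_connect_ms(host) for _ in range(attempts)]
        for region, host in hosts.items()
    }


def fastest_region(samples: dict) -> str:
    """Pick the region with the lowest median timing."""
    return min(samples, key=lambda region: statistics.median(samples[region]))


# Example (performs real network connections):
#   print(fastest_region(probe()))
```

The median is used instead of the mean so that a single slow attempt, such as the first connection that includes a DNS lookup, does not decide the result.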