This document describes the Helm chart for deploying Rime Labs services on Kubernetes.

Chart Overview

The chart deploys a two-tier application: an API service and a model service. The API service sends inference requests to the model service over HTTP, at the URL given by its MODEL_URL environment variable.

Prerequisites

  - Kubernetes 1.19+
  - Helm 3.0+
  - NVIDIA GPU Operator installed (for GPU support)
  - PV provisioner support in the underlying infrastructure (if using persistent storage)

Chart Structure

rime-labs/
├── Chart.yaml
├── values.yaml
├── templates/
│   ├── _helpers.tpl
│   ├── deployment-api.yaml
│   ├── deployment-model.yaml
│   ├── service-api.yaml
│   ├── service-model.yaml
│   ├── configmap.yaml
│   ├── serviceaccount.yaml
│   └── NOTES.txt
└── charts/

Installation

helm install rime-labs ./rime-labs

Example values.yaml

api:
  image:
    repository: rime/api
    tag: 0a111d625e17
    pullPolicy: IfNotPresent
  service:
    type: ClusterIP
    port: 8000
  resources:
    limits:
      cpu: 1000m
      memory: 2Gi
    requests:
      cpu: 1000m
      memory: 2Gi
  env:
    - name: MODEL_URL
      value: "http://{{ .Release.Name }}-model:8080/invocations"

model:
  image:
    repository: rime/model
    tag: 7bd3a89c3b05
    pullPolicy: IfNotPresent
  service:
    type: ClusterIP
    port: 8080
  gpu:
    enabled: true
    count: all
  resources:
    limits:
      nvidia.com/gpu: 1
      cpu: 2000m
      memory: 10Gi
    requests:
      cpu: 2000m
      memory: 10Gi
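
Rather than editing values.yaml in place, overrides can be kept in a separate file and passed at install time with `helm install rime-labs ./rime-labs -f my-values.yaml`. The following is a minimal sketch: the file name `my-values.yaml` is hypothetical and the chosen values are illustrative only. Helm merges nested maps, so any key not listed here keeps the value from the chart's values.yaml.

```yaml
# my-values.yaml (hypothetical file name; values are illustrative)
api:
  # Read by the API deployment template; falls back to 1 when unset.
  replicaCount: 2
  image:
    # Overrides IfNotPresent from the example values.
    pullPolicy: Always
model:
  # Read by the model deployment template; falls back to 1 when unset.
  replicaCount: 1
```

Keeping overrides in their own file makes it easier to see how a given release differs from the chart defaults.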

Example Deployment Templates

Here’s a simplified example of what the deployment templates might look like:

API Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "rime-labs.fullname" . }}-api
spec:
  replicas: {{ .Values.api.replicaCount | default 1 }}
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "rime-labs.name" . }}-api
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "rime-labs.name" . }}-api
    spec:
      containers:
        - name: api
          image: "{{ .Values.api.image.repository }}:{{ .Values.api.image.tag }}"
          imagePullPolicy: {{ .Values.api.image.pullPolicy }}
          ports:
            - containerPort: 8000
          env:
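            # Helm does not render template expressions inside values.yaml, so each
            # value is passed through tpl; $ is the root context, because "." is
            # rebound to the current list item inside range.
            {{- range .Values.api.env }}
            - name: {{ .name }}
              value: {{ tpl .value $ | quote }}
            {{- end }}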
          resources:
            {{- toYaml .Values.api.resources | nindent 12 }}

Model Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "rime-labs.fullname" . }}-model
spec:
  replicas: {{ .Values.model.replicaCount | default 1 }}
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "rime-labs.name" . }}-model
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "rime-labs.name" . }}-model
    spec:
      containers:
        - name: model
          image: "{{ .Values.model.image.repository }}:{{ .Values.model.image.tag }}"
          imagePullPolicy: {{ .Values.model.image.pullPolicy }}
          ports:
            - containerPort: 8080
          resources:
            {{- toYaml .Values.model.resources | nindent 12 }}
      {{- if .Values.model.gpu.enabled }}
      nodeSelector:
        accelerator: nvidia-gpu
      {{- end }}
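
The API reaches the model through a Kubernetes Service, so the Service name, port, and selector have to line up with the model deployment above. The chart structure lists templates/service-model.yaml, but its contents are not shown in this document. The following is only a sketch of what such a template might contain, assuming the same helper names and value keys used above:

```yaml
# templates/service-model.yaml (sketch; the chart's actual file may differ)
apiVersion: v1
kind: Service
metadata:
  # This name is the DNS host that MODEL_URL has to reference.
  name: {{ include "rime-labs.fullname" . }}-model
spec:
  type: {{ .Values.model.service.type }}
  ports:
    - port: {{ .Values.model.service.port }}  # 8080 in the example values
      targetPort: 8080                         # containerPort of the model container
      protocol: TCP
  selector:
    # Must match the pod labels in the model deployment template.
    app.kubernetes.io/name: {{ include "rime-labs.name" . }}-model
```

The example MODEL_URL uses `{{ .Release.Name }}-model` as the host. That matches this Service name only if the `rime-labs.fullname` helper resolves to the release name, which depends on `_helpers.tpl` (not shown here).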

Troubleshooting

Common Issues

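  1. GPU not recognized: Ensure the NVIDIA GPU Operator is installed and healthy, and that GPU nodes advertise the nvidia.com/gpu resource requested by the model pod. When model.gpu.enabled is true, the model deployment template also restricts the pod to nodes labeled accelerator=nvidia-gpu.
  2. Services cannot communicate: Verify that MODEL_URL in api.env points at the model service's actual name and port (8080 in the example values).
  3. Resource constraints: If pods stay in the Pending state, check that the cluster has enough allocatable CPU, memory, and GPUs to satisfy the pods' resource requests.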
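
For the resource-constraint case, note that the scheduler places pods according to their requests, not their limits. On a small cluster, one option is to lower the CPU and memory requests while leaving the limits alone. The following is a sketch only: the file name is hypothetical, the numbers are illustrative, and whether the services run acceptably with less guaranteed capacity is not something this document establishes.

```yaml
# low-requests-values.yaml (hypothetical file name; numbers are illustrative)
api:
  resources:
    requests:
      cpu: 250m     # example values request 1000m
      memory: 1Gi   # example values request 2Gi
model:
  resources:
    requests:
      cpu: 1000m    # example values request 2000m
      memory: 8Gi   # example values request 10Gi
# limits are not listed here, so the chart's values (including
# nvidia.com/gpu: 1) still apply after Helm merges the maps.
```

Two caveats apply. The GPU count cannot be reduced this way, because Kubernetes requires the request for an extended resource such as nvidia.com/gpu to equal its limit. Also, once requests are lower than limits, the pods move from the Guaranteed to the Burstable QoS class.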