This document provides information about the Helm chart for deploying Rime Labs services on Kubernetes.

## Chart Overview

The Helm chart deploys a two-tier application consisting of an API service and a model service. The API service communicates with the model service for inference operations.
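Once deployed, the inference path can be sanity-checked locally with `kubectl port-forward`. The service names below assume the chart's default naming (`<release>-api` and `<release>-model`, with a release named `rime-labs`), and the JSON body is a placeholder, since the actual request schema is not documented here:

```shell
# Forward the API service to localhost (name assumes release "rime-labs")
kubectl port-forward svc/rime-labs-api 8000:8000 &

# The model service can be exercised directly on its /invocations endpoint
# (the same path the API uses via MODEL_URL); the body is a placeholder:
kubectl port-forward svc/rime-labs-model 8080:8080 &
curl -X POST http://localhost:8080/invocations \
  -H "Content-Type: application/json" \
  -d '{}'
```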
## Prerequisites

- Kubernetes 1.19+
- Helm 3.0+
- NVIDIA GPU Operator installed (for GPU support)
- PV provisioner support in the underlying infrastructure (if using persistent storage)
## Chart Structure

```text
rime-labs/
├── Chart.yaml
├── values.yaml
├── templates/
│   ├── _helpers.tpl
│   ├── deployment-api.yaml
│   ├── deployment-model.yaml
│   ├── service-api.yaml
│   ├── service-model.yaml
│   ├── configmap.yaml
│   ├── serviceaccount.yaml
│   └── NOTES.txt
└── charts/
```
## Installation

```shell
helm install rime-labs ./rime-labs
```
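Common variations on the basic install, sketched here with hypothetical file and namespace names (`my-values.yaml`, `rime`):

```shell
# Install into a dedicated namespace with overrides from a local file
helm install rime-labs ./rime-labs \
  --namespace rime --create-namespace \
  -f my-values.yaml

# Roll out changed values or a new chart version
helm upgrade rime-labs ./rime-labs -f my-values.yaml

# Render the manifests locally without installing (useful for review)
helm template rime-labs ./rime-labs

# Remove the release
helm uninstall rime-labs --namespace rime
```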
## Example values.yaml

```yaml
api:
  image:
    repository: rime/api
    tag: 0a111d625e17
    pullPolicy: IfNotPresent
  service:
    type: ClusterIP
    port: 8000
  resources:
    limits:
      cpu: 1000m
      memory: 2Gi
    requests:
      cpu: 1000m
      memory: 2Gi
  env:
    - name: MODEL_URL
      value: "http://{{ .Release.Name }}-model:8080/invocations"

model:
  image:
    repository: rime/model
    tag: 7bd3a89c3b05
    pullPolicy: IfNotPresent
  service:
    type: ClusterIP
    port: 8080
  gpu:
    enabled: true
    count: all
  resources:
    limits:
      nvidia.com/gpu: 1
      cpu: 2000m
      memory: 10Gi
    requests:
      cpu: 2000m
      memory: 10Gi
```
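Individual values can also be overridden on the command line. For example, to run the model tier without GPU scheduling and scale out the API (`replicaCount` is read by the deployment templates, defaulting to 1):

```shell
# CPU-only model tier, two API replicas
helm install rime-labs ./rime-labs \
  --set model.gpu.enabled=false \
  --set api.replicaCount=2
```

Note that `model.gpu.enabled` only controls the GPU node selector; for a truly CPU-only deployment, the `nvidia.com/gpu` limit under `model.resources` would also need to be removed in a values override file.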
## Example Deployment Templates

Here's a simplified example of what the deployment templates might look like:

### API Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "rime-labs.fullname" . }}-api
spec:
  replicas: {{ .Values.api.replicaCount | default 1 }}
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "rime-labs.name" . }}-api
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "rime-labs.name" . }}-api
    spec:
      containers:
        - name: api
          image: "{{ .Values.api.image.repository }}:{{ .Values.api.image.tag }}"
          imagePullPolicy: {{ .Values.api.image.pullPolicy }}
          ports:
            - containerPort: 8000
          env:
            {{- range .Values.api.env }}
            - name: {{ .name }}
              # tpl renders template expressions embedded in values,
              # such as the release name inside MODEL_URL
              value: {{ tpl .value $ | quote }}
            {{- end }}
          resources:
            {{- toYaml .Values.api.resources | nindent 12 }}
```
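The chart structure above also lists service templates, which are not shown here. A matching `service-api.yaml` might look like the following sketch; the selector labels mirror the deployment above, and the port values come from the example `values.yaml`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: {{ include "rime-labs.fullname" . }}-api
spec:
  type: {{ .Values.api.service.type }}
  ports:
    - port: {{ .Values.api.service.port }}
      targetPort: 8000
      protocol: TCP
  selector:
    app.kubernetes.io/name: {{ include "rime-labs.name" . }}-api
```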
### Model Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "rime-labs.fullname" . }}-model
spec:
  replicas: {{ .Values.model.replicaCount | default 1 }}
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "rime-labs.name" . }}-model
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "rime-labs.name" . }}-model
    spec:
      containers:
        - name: model
          image: "{{ .Values.model.image.repository }}:{{ .Values.model.image.tag }}"
          imagePullPolicy: {{ .Values.model.image.pullPolicy }}
          ports:
            - containerPort: 8080
          resources:
            {{- toYaml .Values.model.resources | nindent 12 }}
      {{- if .Values.model.gpu.enabled }}
      nodeSelector:
        accelerator: nvidia-gpu
      {{- end }}
```
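For the `nodeSelector` above to match, GPU nodes need the corresponding label. If the NVIDIA GPU Operator does not already apply it in your cluster, the label can be set manually (the key and value must agree with the template):

```shell
kubectl label nodes <gpu-node-name> accelerator=nvidia-gpu
```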
## Troubleshooting

### Common Issues
- GPU not recognized: Ensure the NVIDIA GPU Operator is installed correctly in your cluster.
- Services cannot communicate: Verify that service names are correctly referenced in environment variables.
- Resource constraints: If pods are in a pending state, check if you have sufficient resources (CPU, memory, GPUs) in your cluster.
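A few `kubectl` commands cover most of these cases. The resource names below assume a release named `rime-labs` and the chart's default naming:

```shell
# Inspect pod state and scheduling events (pending pods show the reason here)
kubectl get pods
kubectl describe pod <pod-name>

# Check container logs for startup errors
kubectl logs deploy/rime-labs-api
kubectl logs deploy/rime-labs-model

# Confirm the services exist and expose the expected ports
kubectl get svc
```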