Lambda Cloud

This guide walks through deploying AIBrix on a single-node Lambda Cloud instance for testing purposes. It covers installing dependencies, verifying the installation, setting up the cluster, and deploying the AIBrix components.

Prerequisites

1. Get a Lambda Cloud instance

Follow the Lambda Cloud documentation to launch an instance.

[Image: lambda-cloud-instance]

After launching the instance, look up its IP address and SSH into it.

[Image: lambda-cloud-ssh]

Alternatively, you can open the instance's Jupyter notebook, which does not require managing SSH keys.

2. Clone the AIBrix code base

Clone the AIBrix code base onto the instance:

git clone https://github.com/vllm-project/aibrix.git
cd aibrix

3. Install Dependencies

Run the following script to install the necessary dependencies, including nvkind, minikube, kubectl, Helm, Go, and the NVIDIA Container Toolkit.

bash hack/lambda-cloud/install.sh

Summary:

  • Installs required system packages (jq, Go, kubectl, kind, Helm)

  • Installs nvkind (Kubernetes-in-Docker with GPU support) and minikube

  • Configures the NVIDIA Container Toolkit

  • Updates Docker settings for GPU compatibility

Once completed, restart your terminal or run:

source ~/.bashrc
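To confirm that the tools are actually on your PATH after reloading the shell, a small check along these lines can help. This is only a sketch and not part of the AIBrix repository: the `missing_tools` helper is hypothetical, and the tool list is taken from the summary above (Go is assumed to be installed as `go`).

```shell
# Hypothetical helper: print the names of the given tools that are not on PATH.
missing_tools() {
  local missing="" tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Strip the leading space before printing.
  echo "${missing# }"
}

# Tool names assumed from the install summary in this guide.
missing="$(missing_tools jq go kubectl kind helm nvkind minikube docker)"
if [ -n "$missing" ]; then
  echo "Missing tools: $missing"
else
  echo "All tools found"
fi
```

If anything is reported as missing, re-running the install script or opening a new terminal is a reasonable first step.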

4. Verify GPU container runtime

The path for the following script assumes that you are in the root directory of the AIBrix repository.

Run the following script to ensure that the NVIDIA drivers and Docker integration are correctly configured:

bash ./hack/lambda-cloud/verify.sh

Summary:

  • Runs nvidia-smi to check GPU availability

  • Runs a Docker container with NVIDIA runtime to verify GPU detection

  • Ensures that GPU devices are accessible within containers
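If you want to spot-check the same things by hand, the following sketch runs the two checks directly. It is not the content of `verify.sh`, which may do more; the `check_gpu_runtime` name is hypothetical, and the `ubuntu` image is an assumption (it is the image commonly used for this check in the NVIDIA Container Toolkit documentation).

```shell
# Manual GPU check: host driver first, then GPU access from inside a container.
check_gpu_runtime() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found on the host" >&2
    return 1
  fi
  # Host-level check: lists the GPUs the driver can see.
  nvidia-smi || return 1
  # Container-level check: run the same tool inside a throwaway container.
  # The ubuntu image is an assumption; the NVIDIA runtime makes nvidia-smi
  # available inside the container.
  docker run --rm --gpus all ubuntu nvidia-smi
}
```

Call `check_gpu_runtime` on the instance; a non-zero exit status means one of the two checks failed.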

If all checks pass, proceed to the next step.

Set Up the Kubernetes Environment

We provide two ways to set up the environment:

  1. Minikube (Recommended): Minikube is the recommended option for local testing and development. It provides a stable, single-node Kubernetes cluster.

  2. Kind: Kind can also be used, but because Kind clusters run inside containers, we often observe that containers unexpectedly receive SIGTERM signals (see issue #683 and issue #684).

Attention

The root cause has not yet been fully determined. Therefore, we recommend using Minikube whenever possible for a more stable experience.

Minikube

1. Create a minikube Cluster

First, ensure that your non-root user has permission to run Docker:

sudo usermod -aG docker $USER
newgrp docker

Create a Kubernetes cluster using minikube:

minikube start --driver=docker --container-runtime=docker --gpus=all --cpus=8 --memory=16g
😄  minikube v1.35.0 on Ubuntu 22.04 (kvm/amd64)
✨  Using the docker driver based on user configuration
📌  Using Docker driver with root privileges
👍  Starting "minikube" primary control-plane node in "minikube" cluster
🚜  Pulling base image v0.0.46 ...
💾  Downloading Kubernetes v1.32.0 preload ...
    > preloaded-images-k8s-v18-v1...:  333.57 MiB / 333.57 MiB  100.00% 230.98
🔥  Creating docker container (CPUs=8, Memory=16384MB) ...
🐳  Preparing Kubernetes v1.32.0 on Docker 27.4.1 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
    ▪ Using image nvcr.io/nvidia/k8s-device-plugin:v0.17.0
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, nvidia-device-plugin, default-storageclass
💡  kubectl not found. If you need it, try: 'minikube kubectl -- get pods -A'
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

Note

  1. If you run into problems setting up the cluster or enabling GPU support, check Using NVIDIA GPUs with minikube for more details.

  2. GPU support is installed automatically during cluster setup (the output above lists nvidia-device-plugin among the enabled addons), so you do not need to install the GPU Operator separately.
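Once the cluster is up, one way to confirm that the node actually advertises GPUs to Kubernetes is to print its allocatable `nvidia.com/gpu` resource. The sketch below wraps that in a hypothetical `allocatable_gpus` helper; the column headers are arbitrary.

```shell
# Print each node's name and the number of GPUs it advertises as allocatable.
# A node without the nvidia.com/gpu resource shows <none> in the GPUS column.
allocatable_gpus() {
  kubectl get nodes \
    -o custom-columns='NODE:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'
}
```

Run `allocatable_gpus` after `minikube start` finishes. If `kubectl` is not on your PATH, the minikube output above suggests `minikube kubectl --` as a substitute.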

2. Enable LoadBalancer service in minikube

Services of type LoadBalancer can be exposed with the minikube tunnel command. See LoadBalancer access in the minikube documentation for more details.

Run the tunnel in a separate terminal and keep that window open for as long as you need the LoadBalancer:

minikube tunnel

Password:
 Status:
 machine: minikube
 pid: 39087
 route: 10.96.0.0/12 -> 192.168.64.194
 minikube: Running
 services: [hello-minikube]
     errors:
   minikube: no errors
   router: no errors
   loadbalancer emulator: no errors

3. Delete the minikube cluster

Once you have finished testing, you can delete the minikube cluster:

minikube delete

Kind

1. Create an nvkind Cluster

Create a Kubernetes cluster using nvkind:

nvkind cluster create --config-template=./hack/lambda-cloud/nvkind-cluster.yaml

This sets up a single-node cluster with GPU support. Make sure the node reports a Ready status:

kubectl get nodes
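If the node is still starting, you can block until it becomes Ready instead of polling by hand. This is a minimal sketch; the `wait_for_nodes` name and the 120-second default are assumptions rather than values from the AIBrix scripts.

```shell
# Block until every node reports the Ready condition, or fail after the timeout.
wait_for_nodes() {
  local timeout="${1:-120s}"   # assumed default; pass another value if needed
  kubectl wait --for=condition=Ready nodes --all --timeout="$timeout"
}
```

For example, `wait_for_nodes 300s` waits up to five minutes and exits non-zero if the node never becomes Ready.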

2. Set Up the NVIDIA GPU Operator

Run the following script to install the NVIDIA GPU Operator and configure the cloud provider:

bash ./hack/lambda-cloud/setup.sh

Summary:

  • Installs the NVIDIA GPU Operator using Helm

  • Installs the Cloud Provider Kind (cloud-provider-kind)

  • Runs cloud-provider-kind in the background for cloud integration

3. Delete the nvkind Cluster

Once you have finished testing, you can delete the nvkind cluster:

# get your cluster name
kind get clusters

kind delete clusters nvkind-7kx6v # nvkind-7kx6v is the cluster name in this example
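Because the cluster name has a random-looking suffix, it can be convenient to script the lookup and the deletion together. The sketch below assumes that clusters created by nvkind share the `nvkind-` prefix seen in the example name; the `delete_clusters_with_prefix` helper is hypothetical.

```shell
# Delete every kind cluster whose name starts with the given prefix.
delete_clusters_with_prefix() {
  local prefix="$1" name
  # "kind get clusters" prints one cluster name per line.
  for name in $(kind get clusters 2>/dev/null); do
    case "$name" in
      "$prefix"*) kind delete clusters "$name" ;;
    esac
  done
}
```

Calling `delete_clusters_with_prefix nvkind-` removes all matching clusters, so check the output of `kind get clusters` first if you run other clusters with similar names.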

Install AIBrix

Once the cluster is up and running, install AIBrix components:

Apply the dependency manifest first, then create the core components:

# install dependencies
kubectl apply -f "https://github.com/vllm-project/aibrix/releases/download/v0.4.1/aibrix-dependency-v0.4.1.yaml" --server-side

# install core components
kubectl create -f "https://github.com/vllm-project/aibrix/releases/download/v0.4.1/aibrix-core-v0.4.1.yaml"

Verify that the AIBrix components are installed successfully:

kubectl get pods -n aibrix-system
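On a busy listing it can be easier to show only the pods that still need attention. The sketch below filters on pod phase with a field selector; the `pods_not_running` name is hypothetical, and the namespace defaults to the `aibrix-system` namespace used above.

```shell
# List pods in the namespace whose phase is anything other than Running
# (for example Pending or Failed).
pods_not_running() {
  local namespace="${1:-aibrix-system}"
  kubectl get pods -n "$namespace" --field-selector='status.phase!=Running'
}
```

An empty result means every pod has reached the Running phase. Note that the phase alone does not guarantee readiness, so the READY column of the full listing is still worth a look.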

You can now follow the Quickstart guide to deploy your models.

Conclusion

You have successfully deployed AIBrix on a single-node Lambda instance. This setup lets you test and debug AIBrix components on a single machine.

If you encounter issues, ensure that:

  • GPU support is correctly installed in the cluster (the NVIDIA GPU Operator for Kind, or the nvidia-device-plugin addon for minikube)

  • The cluster has GPU resources available (kubectl describe nodes)

  • Docker and Kubernetes configurations match GPU compatibility requirements

Happy Testing!