Google Cloud Platform (GCP)#
Introduction#
This module deploys an AIBrix cluster in its entirety onto a Google Container Cluster. It is the quickest way to get up and running with AIBrix. The purpose of this module is to both allow developers to quickly spin up the stack, and allow for the team to test on the Google Cloud Platform.
Warning
This module was created to allow users to quickly spin up AIBrix on GCP. It is not currently built for production deployments. The user is responsible for any costs incurred by running this module.
This module use terraform as the infrastructure as code tool. If you are looking for other means, feel free to cut an issue.
Quickstart#
Prerequisites#
A quota of at least 1 GPU within your GCP project. More information can be found on the topic here.
Steps#
Change directory to the module location:
cd /deployment/terraform/gcpRun
gcloud auth application-default loginto setup credentials to Google.Install cluster auth plugin with
gcloud components install gke-gcloud-auth-plugin.Rename
terraform.tfvars.exampletoterraform.tfvarsand fill in the required variables. You can also add any optional overrides here as well.Run
terraform initto initialize the module.Run
terraform planto see details on the resources created by this module.When you are satisfied with the plan and want to create the resources, run
terraform apply.Note
If you receive
NodePool aibrix-gpu-nodes was created in the error state "ERROR"while running the script, check your quotas for GPUs and the specific instances you’re trying to deploy.Wait for module to complete running. It will output a command to receive the kubernetes config file and a public IP address.
Run a command against the public IP:
ENDPOINT="<YOUR PUBLIC IP>" curl http://${ENDPOINT}/v1/chat/completions \ -H "Content-Type: application/json" \ -d '{ "model": "deepseek-r1-distill-llama-8b", "messages": [ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "help me write a random generator in python"} ] }'
When you are finished testing and no longer want the resources, run
terraform destroy.
Warning
Ensure that you complete this step once you are done trying it out, as GPUs are expensive.
Inputs#
Name |
Description |
Type |
Default |
Required |
|---|---|---|---|---|
aibrix_release_version |
The version of AIBrix to deploy. |
string |
“v0.2.0” |
no |
cluster_name |
Name of the GKE cluster. |
string |
“aibrix-inference-cluster” |
no |
cluster_zone |
Zone to deploy cluster within. If not provided will be deployed to default region. |
string |
“” |
no |
default_region |
Default region to deploy resources within. |
string |
n/a |
yes |
deploy_example_model |
Whether to deploy the example model. |
bool |
true |
no |
node_pool_machine_count |
Machine count for the node pool. |
number |
1 |
no |
node_pool_machine_type |
Machine type for the node pool. Must be in the A3, A2, or G2 series. |
string |
“g2-standard-4” |
no |
node_pool_name |
Name of the GPU node pool. |
string |
“aibrix-gpu-nodes” |
no |
node_pool_zone |
Zone to deploy GPU node pool within. If not provided will be deployed to zone in default region which has capacity for machine type. |
string |
“” |
no |
project_id |
GCP project to deploy resources within. |
string |
n/a |
yes |
Outputs#
Name |
Description |
|---|---|
aibrix_service_public_ip |
Public IP address for AIBrix service. |
configure_kubectl_command |
Command to run which will allow kubectl access. |
Modules#
Name |
Source |
Version |
|---|---|---|
aibrix |
deployment/terraform/kubernetes |
n/a |
cluster |
deployment/terraform/gcp/cluster |
n/a |
Providers#
Name |
Version |
|---|---|
6.22.0 |
|
kubernetes |
2.36.0 |