Deploy on Google Kubernetes Engine
Learn how to deploy LLMstudio as a containerized application on Google Kubernetes Engine and make calls from a local repository.
Prerequisites
To follow this guide, you need the following set up:
- A project on Google Cloud Platform.
- The Kubernetes Engine API enabled on your project.
- The Kubernetes Engine Admin role for the user performing the guide.
Deploy LLMstudio
This example demonstrates a public deployment. For a private service accessible only within your enterprise infrastructure, deploy it within your own Virtual Private Cloud (VPC).
Navigate to Kubernetes Engine
Begin by navigating to the Kubernetes Engine page.
Select Deploy
Go to Workloads and Create a new Deployment.
Name Your Deployment
Name your deployment. In this guide, we will call it llmstudio-on-gcp.
Select Your Cluster
Choose between creating a new cluster or using an existing cluster. For this guide, we will create a new cluster and use the default region.
Proceed to Container Details
Once done with the Deployment configuration, proceed to Container details.
Set Image Path
In the new container section, select Existing container image.
Copy the path to LLMstudio’s image available on Docker Hub.
Set it as the Image path to your container.
Set Environment Variables
Configure the following mandatory environment variables:
| Environment Variable | Value |
| --- | --- |
| LLMSTUDIO_ENGINE_HOST | 0.0.0.0 |
| LLMSTUDIO_ENGINE_PORT | 8001 |
| LLMSTUDIO_TRACKING_HOST | 0.0.0.0 |
| LLMSTUDIO_TRACKING_PORT | 8002 |
Additionally, set the GOOGLE_API_KEY environment variable to enable calls to Google’s Gemini models.
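In manifest form, the container configuration above would look roughly like the sketch below. This is illustrative only: the image path, secret name, and key are placeholders, not values from the guide, and storing the API key in a Secret is an optional hardening step rather than part of the console flow.

```yaml
# Sketch of the container section of the generated Deployment manifest.
# The image path, secret name, and key below are placeholders.
containers:
  - name: llmstudio
    image: <llmstudio-image-from-docker-hub>  # placeholder: copy the path from Docker Hub
    env:
      - name: LLMSTUDIO_ENGINE_HOST
        value: "0.0.0.0"
      - name: LLMSTUDIO_ENGINE_PORT
        value: "8001"
      - name: LLMSTUDIO_TRACKING_HOST
        value: "0.0.0.0"
      - name: LLMSTUDIO_TRACKING_PORT
        value: "8002"
      - name: GOOGLE_API_KEY
        valueFrom:
          secretKeyRef:              # optional: keep the key in a Secret
            name: llmstudio-secrets  # hypothetical Secret name
            key: google-api-key      # hypothetical key name
```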
Proceed to Expose (Optional)
After configuring your container, proceed to Expose (Optional).
Expose Ports
Select Expose deployment as a new service and leave the first item as is.
Add two more items and expose the engine and tracking ports (8001 and 8002) defined in the Set Environment Variables step.
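The resulting Service would expose the three ports roughly as sketched below. The Service name, selector, and the values of the pre-filled first item are assumptions for illustration; only the 8001 and 8002 entries come from this guide.

```yaml
# Sketch of the Service the console creates when exposing the deployment.
# Names, selector, and the first port entry are illustrative placeholders.
apiVersion: v1
kind: Service
metadata:
  name: llmstudio-on-gcp-service   # hypothetical name
spec:
  type: LoadBalancer               # public deployment; use an internal type for VPC-only access
  selector:
    app: llmstudio-on-gcp
  ports:
    - name: default                # the pre-filled first item, left as is
      port: 80
      targetPort: 80
    - name: engine
      port: 8001
      targetPort: 8001
    - name: tracking
      port: 8002
      targetPort: 8002
```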
Deploy
After setting up and exposing the ports, press Deploy.
Make a Call
Now let’s make a call to our LLMstudio instance on GCP!
Set Up Project
Set up a simple project with these two files:
calls.ipynb
.env
Set Up Files
Go to your newly deployed Workload, scroll to the Exposing services section, and take note of the Host of your endpoint.
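In calls.ipynb, you can assemble the engine and tracking base URLs from the host you just noted and the ports from the Set Environment Variables step. The sketch below assumes plain HTTP; build_url is a small helper written for this guide, not part of LLMstudio, and the host value is a placeholder.

```python
# Build base URLs for the engine and tracking services from the Host
# shown in the GKE "Exposing services" section. build_url is a helper
# written for this guide, not part of LLMstudio.

def build_url(host: str, port: int, scheme: str = "http") -> str:
    """Return the base URL for a service exposed on the given port."""
    return f"{scheme}://{host}:{port}"

# Ports match the environment variables set during deployment.
LLMSTUDIO_ENGINE_PORT = 8001
LLMSTUDIO_TRACKING_PORT = 8002

host = "your-endpoint-host"  # placeholder: the Host of your endpoint
engine_url = build_url(host, LLMSTUDIO_ENGINE_PORT)
tracking_url = build_url(host, LLMSTUDIO_TRACKING_PORT)
```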
Create your .env file with the following: