Gen AI Demo with Kubernetes, Istio Ambient, Prometheus, Kiali, etc.
We have crafted a few scripts to make this demo run as quickly as possible on your machine once you've installed the prerequisites.
The startup script (startup.sh) will:
- Create a kind cluster
- Install a simple curl client, an Ollama service, and the demo service.
Ollama serves large language models behind a RESTful API. It's a good way to get started with LLMs without having to manage the model-serving infrastructure yourself.
Run the startup script:
./startup.sh
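If the startup script fails early, a missing tool is one likely cause. The sketch below reports which command-line tools are not on your PATH. The tool names (kind, kubectl, docker) are assumptions based on the steps in this README, not an authoritative prerequisite list, so adjust them to match your setup.

```shell
# Print each tool from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumed tool names; replace with the actual prerequisites for this demo.
missing=$(missing_tools kind kubectl docker)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing prerequisites:" $missing
else
  echo "All listed tools found"
fi
```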
The demo uses the following two models:
- LLaVA (Large Language and Vision Assistant)
- Llama (Large Language Model Meta AI) 3.2
Pull the two models:
kubectl exec -it deploy/client -- curl http://ollama.ollama:80/api/pull -d '{"name": "llama3.2"}'
kubectl exec -it deploy/client -- curl http://ollama.ollama:80/api/pull -d '{"name": "llava"}'
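To confirm that both pulls finished, you can ask Ollama which models it has stored. The sketch below queries Ollama's /api/tags endpoint through the same client pod. The helper names list_ollama_models and has_model are hypothetical and exist only for illustration, and the match is a plain substring search on the response rather than real JSON parsing.

```shell
# List the models Ollama has stored by calling GET /api/tags via the client pod.
# -s silences curl's progress meter so only the JSON response is printed.
list_ollama_models() {
  kubectl exec deploy/client -- curl -s http://ollama.ollama:80/api/tags
}

# Succeed if the given model name appears anywhere in the listing.
# This is a rough substring check, not a JSON-aware lookup.
has_model() {
  list_ollama_models | grep -q "$1"
}
```

For example, `has_model llama3.2 && has_model llava && echo "models ready"` prints a message only when both names show up in the listing.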
We use Istio to secure, observe and control the traffic among the microservices in the cluster.
./install-istio.sh
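After the install script finishes, a quick look at the cluster can show whether Istio came up. The sketch below assumes the script installs Istio into the default istio-system namespace and enrolls workloads in ambient mode through the istio.io/dataplane-mode namespace label. Neither is confirmed here, so check install-istio.sh if the output looks different. The function name check_istio_ambient is hypothetical.

```shell
check_istio_ambient() {
  # Istio's control plane and ambient components normally run in istio-system
  # (assumed namespace).
  kubectl get pods -n istio-system
  # Add a column with each namespace's istio.io/dataplane-mode label;
  # a value of "ambient" means the namespace is enrolled in ambient mode.
  kubectl get namespaces -L istio.io/dataplane-mode
}
```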
Use port forwarding to reach the demo app from your machine:
kubectl port-forward svc/demo 8001:8001
Then open your browser and navigate to http://localhost:8001
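If the page does not load, it can help to probe the forwarded port from a second terminal while the port-forward is still running. This is a minimal sketch; the helper name http_status is hypothetical, and which status code the demo app returns on its root path is not stated in this README.

```shell
# Print only the HTTP status code for a URL.
# curl prints 000 when it cannot get a response, e.g. if the port-forward is not running.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}
```

Running `http_status http://localhost:8001` prints the status code the forwarded service answers with.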
To clean up the demo, run the following commands:
./cleanup-istio.sh
./shutdown.sh
This demo has been tested on the platform listed below and should work there once the prerequisites are installed. On a different platform, you may need to build the demo app images yourself.
- macOS (Apple M2)
A portion of the demo in this repo was inspired by the github.com/cncf/llm-in-action repo.