Merge pull request #3730 from RachaelSTamakloe/rc-cherry-pick
Cherry-pick PR3709, PR3725, PR3729
RachaelSTamakloe authored Feb 26, 2025
2 parents b02eda4 + a575634 commit 44a0c43
Showing 6 changed files with 816 additions and 1 deletion.
55 changes: 55 additions & 0 deletions examples/machine-learning/a4-highgpu-8g/README.md
@@ -0,0 +1,55 @@
# A4 High Blueprints

This document outlines the deployment steps for provisioning A4 High
`a4-highgpu-8g` VMs that use Slurm
as an orchestrator.

## Deployment Instructions

### Build the Cluster Toolkit gcluster binary

Follow the instructions in the
[Cluster Toolkit environment setup guide](https://cloud.google.com/cluster-toolkit/docs/setup/configure-environment).

### (Optional, but recommended) Create a GCS bucket for storing Terraform state

```bash
#!/bin/bash
TF_STATE_BUCKET_NAME=<your-bucket>
PROJECT_ID=<your-gcp-project>
REGION=<your-preferred-region>

gcloud storage buckets create gs://${TF_STATE_BUCKET_NAME} \
    --project=${PROJECT_ID} \
    --default-storage-class=STANDARD --location=${REGION} \
    --uniform-bucket-level-access
gcloud storage buckets update gs://${TF_STATE_BUCKET_NAME} --versioning
```

### Create/modify the deployment.yaml file with your preferred configuration

For example, set values such as the cluster size and the reservation to use, as
well as the name of the bucket that you just created. Below is an example:

```yaml
---
terraform_backend_defaults:
  type: gcs
  configuration:
    bucket: TF_STATE_BUCKET_NAME

vars:
  deployment_name: a4h-slurm
  project_id: <PROJECT_ID>
  region: <REGION>
  zone: <ZONE>
  a4h_reservation_name: <RESERVATION_NAME>
  a4h_cluster_size: <RESERVATION_SIZE>
```
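If you prefer to generate the deployment file from shell variables rather than
editing it by hand, the following is a minimal sketch. The function name
`render_deployment` and its argument order are hypothetical and not part of
Cluster Toolkit; the keys it emits are the ones from the example above.

```bash
#!/bin/bash
# Hypothetical helper (not part of Cluster Toolkit): prints a deployment
# file to stdout.
# Arguments: bucket, project ID, region, zone, reservation name, cluster size.
render_deployment() {
  cat <<EOF
---
terraform_backend_defaults:
  type: gcs
  configuration:
    bucket: $1

vars:
  deployment_name: a4h-slurm
  project_id: $2
  region: $3
  zone: $4
  a4h_reservation_name: $5
  a4h_cluster_size: $6
EOF
}

# Example usage (all values are placeholders you must supply):
# render_deployment "${TF_STATE_BUCKET_NAME}" "${PROJECT_ID}" "${REGION}" \
#     "<ZONE>" "<RESERVATION_NAME>" "<RESERVATION_SIZE>" > deployment.yaml
```

Redirecting the output to a file keeps the bucket name consistent with the one
created in the previous step, since both come from the same variable.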

### Deploy the cluster

```bash
#!/bin/bash
gcluster deploy -d a4high-slurm-deployment.yaml a4high-slurm-blueprint.yaml
```