Merge pull request #3535 from mr0re1/enough_plugins
Deprecate Slurm GCP plugins.
mr0re1 authored Jan 17, 2025
2 parents a99bd5b + a9e821c commit 81a8dc6
Showing 14 changed files with 30 additions and 425 deletions.
@@ -310,7 +310,7 @@ limitations under the License.
| <a name="input_enable_external_prolog_epilog"></a> [enable\_external\_prolog\_epilog](#input\_enable\_external\_prolog\_epilog) | Automatically enable a script that will execute prolog and epilog scripts<br/>shared by NFS from the controller to compute nodes. Find more details at:<br/>https://github.com/GoogleCloudPlatform/slurm-gcp/blob/master/tools/prologs-epilogs/README.md | `bool` | `null` | no |
| <a name="input_enable_oslogin"></a> [enable\_oslogin](#input\_enable\_oslogin) | Enables Google Cloud os-login for user login and authentication for VMs.<br/>See https://cloud.google.com/compute/docs/oslogin | `bool` | `true` | no |
| <a name="input_enable_shielded_vm"></a> [enable\_shielded\_vm](#input\_enable\_shielded\_vm) | Enable the Shielded VM configuration. Note: the instance image must support option. | `bool` | `false` | no |
-| <a name="input_enable_slurm_gcp_plugins"></a> [enable\_slurm\_gcp\_plugins](#input\_enable\_slurm\_gcp\_plugins) | Enables calling hooks in scripts/slurm\_gcp\_plugins during cluster resume and suspend. | `any` | `false` | no |
+| <a name="input_enable_slurm_gcp_plugins"></a> [enable\_slurm\_gcp\_plugins](#input\_enable\_slurm\_gcp\_plugins) | DEPRECATED: Slurm GCP plugins have been deprecated.<br/>Instead of 'max\_hops' plugin please use the 'placement\_max\_distance' nodeset property.<br/>Instead of 'enable\_vpmu' plugin please use 'advanced\_machine\_features.performance\_monitoring\_unit' nodeset property. | `any` | `null` | no |
| <a name="input_enable_smt"></a> [enable\_smt](#input\_enable\_smt) | DEPRECATED: Use `advanced_machine_features.threads_per_core` instead. | `bool` | `null` | no |
| <a name="input_endpoint_versions"></a> [endpoint\_versions](#input\_endpoint\_versions) | Version of the API to use (The compute service is the only API currently supported) | <pre>object({<br/> compute = string<br/> })</pre> | <pre>{<br/> "compute": "beta"<br/>}</pre> | no |
| <a name="input_epilog_scripts"></a> [epilog\_scripts](#input\_epilog\_scripts) | List of scripts to be used for Epilog. Programs for the slurmd to execute<br/>on every node when a user's job completes.<br/>See https://slurm.schedmd.com/slurm.conf.html#OPT_Epilog. | <pre>list(object({<br/> filename = string<br/> content = optional(string)<br/> source = optional(string)<br/> }))</pre> | `[]` | no |
@@ -77,7 +77,6 @@ No modules.
| <a name="input_enable_debug_logging"></a> [enable\_debug\_logging](#input\_enable\_debug\_logging) | Enables debug logging mode. Not for production use. | `bool` | `false` | no |
| <a name="input_enable_external_prolog_epilog"></a> [enable\_external\_prolog\_epilog](#input\_enable\_external\_prolog\_epilog) | Automatically enable a script that will execute prolog and epilog scripts<br/>shared by NFS from the controller to compute nodes. Find more details at:<br/>https://github.com/GoogleCloudPlatform/slurm-gcp/blob/v5/tools/prologs-epilogs/README.md | `bool` | `false` | no |
| <a name="input_enable_hybrid"></a> [enable\_hybrid](#input\_enable\_hybrid) | Enables use of hybrid controller mode. When true, controller\_hybrid\_config will<br/>be used instead of controller\_instance\_config and will disable login instances. | `bool` | `false` | no |
-| <a name="input_enable_slurm_gcp_plugins"></a> [enable\_slurm\_gcp\_plugins](#input\_enable\_slurm\_gcp\_plugins) | Enables calling hooks in scripts/slurm\_gcp\_plugins during cluster resume and suspend. | `any` | `false` | no |
| <a name="input_endpoint_versions"></a> [endpoint\_versions](#input\_endpoint\_versions) | Version of the API to use (The compute service is the only API currently supported) | <pre>object({<br/> compute = string<br/> })</pre> | <pre>{<br/> "compute": null<br/>}</pre> | no |
| <a name="input_epilog_scripts"></a> [epilog\_scripts](#input\_epilog\_scripts) | List of scripts to be used for Epilog. Programs for the slurmd to execute<br/>on every node when a user's job completes.<br/>See https://slurm.schedmd.com/slurm.conf.html#OPT_Epilog. | <pre>list(object({<br/> filename = string<br/> content = optional(string)<br/> source = optional(string)<br/> }))</pre> | `[]` | no |
| <a name="input_extra_logging_flags"></a> [extra\_logging\_flags](#input\_extra\_logging\_flags) | The only available flag is `trace_api` | `map(bool)` | `{}` | no |
@@ -43,15 +43,14 @@ locals {
tp = "${local.bucket_dir}/" # prefix to trim from the bucket path to get a "file name"

config = {
-    enable_slurm_gcp_plugins = var.enable_slurm_gcp_plugins
-    enable_bigquery_load     = var.enable_bigquery_load
-    cloudsql_secret          = var.cloudsql_secret
-    cluster_id               = random_uuid.cluster_id.result
-    project                  = var.project_id
-    slurm_cluster_name       = var.slurm_cluster_name
-    bucket_path              = local.bucket_path
-    enable_debug_logging     = var.enable_debug_logging
-    extra_logging_flags      = var.extra_logging_flags
+    enable_bigquery_load = var.enable_bigquery_load
+    cloudsql_secret      = var.cloudsql_secret
+    cluster_id           = random_uuid.cluster_id.result
+    project              = var.project_id
+    slurm_cluster_name   = var.slurm_cluster_name
+    bucket_path          = local.bucket_path
+    enable_debug_logging = var.enable_debug_logging
+    extra_logging_flags  = var.extra_logging_flags

# storage
disable_default_mounts = var.disable_default_mounts
@@ -44,8 +44,6 @@
from util import lookup, NSDict
import tpu

-import slurm_gcp_plugins
-
log = logging.getLogger()

PLACEMENT_MAX_CNT = 1500
@@ -202,14 +200,6 @@ def create_instances_request(nodes: List[str], placement_group: Optional[str], e
targetShape = nodeset.zone_target_shape,
)

-    if lookup().cfg.enable_slurm_gcp_plugins:
-        slurm_gcp_plugins.pre_instance_bulk_insert(
-            lkp=lookup(),
-            nodes=nodes,
-            placement_group=placement_group,
-            request_body=body,
-        )
-
req = api_method(
project=lookup().project,
body=body,
@@ -453,10 +443,7 @@ def create_placement_request(pg_name: str, region: str, max_distance: Optional[i
"maxDistance": max_distance
},
}
-    if lookup().cfg.enable_slurm_gcp_plugins:
-        slurm_gcp_plugins.pre_placement_group_insert(
-            lkp=lookup(), pg_name=pg_name, region=region, request_body=config
-        )
+
request = lookup().compute.resourcePolicies().insert(
project=lookup().project, region=region, body=config
)
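
Note on the hunk above: with the `pre_placement_group_insert` hook gone, the only way the placement distance reaches the request body is through the `max_distance` argument. The snippet below is a minimal sketch of that idea, not the code from this commit. `build_placement_config` is a hypothetical helper, and the `name`, `region`, and `collocation` fields, as well as the choice to omit `maxDistance` when no distance is given, are assumptions that the visible diff does not confirm.

```python
from typing import Any, Dict, Optional


def build_placement_config(
    pg_name: str, region: str, max_distance: Optional[int] = None
) -> Dict[str, Any]:
    """Hypothetical helper: build a placement resource-policy body.

    Only the "maxDistance" key appears in the diff above; every other
    field here is an illustrative assumption.
    """
    policy: Dict[str, Any] = {"collocation": "COLLOCATED"}  # assumed field
    if max_distance is not None:
        # The distance is passed in explicitly instead of being patched
        # into the request body by a plugin hook.
        policy["maxDistance"] = max_distance
    return {"name": pg_name, "region": region, "groupPlacementPolicy": policy}


config = build_placement_config("demo-pg", "us-central1", max_distance=2)
print(config["groupPlacementPolicy"])
```

Passing the value as a plain argument keeps the request body fully determined by the function's inputs, which is the apparent motivation for replacing the 'max\_hops' plugin with the 'placement\_max\_distance' nodeset property.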

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.
