Deprecate Slurm GCP plugins. #3535

Merged 1 commit on Jan 17, 2025
@@ -310,7 +310,7 @@ limitations under the License.
 | <a name="input_enable_external_prolog_epilog"></a> [enable\_external\_prolog\_epilog](#input\_enable\_external\_prolog\_epilog) | Automatically enable a script that will execute prolog and epilog scripts<br/>shared by NFS from the controller to compute nodes. Find more details at:<br/>https://github.com/GoogleCloudPlatform/slurm-gcp/blob/master/tools/prologs-epilogs/README.md | `bool` | `null` | no |
 | <a name="input_enable_oslogin"></a> [enable\_oslogin](#input\_enable\_oslogin) | Enables Google Cloud os-login for user login and authentication for VMs.<br/>See https://cloud.google.com/compute/docs/oslogin | `bool` | `true` | no |
 | <a name="input_enable_shielded_vm"></a> [enable\_shielded\_vm](#input\_enable\_shielded\_vm) | Enable the Shielded VM configuration. Note: the instance image must support option. | `bool` | `false` | no |
-| <a name="input_enable_slurm_gcp_plugins"></a> [enable\_slurm\_gcp\_plugins](#input\_enable\_slurm\_gcp\_plugins) | Enables calling hooks in scripts/slurm\_gcp\_plugins during cluster resume and suspend. | `any` | `false` | no |
+| <a name="input_enable_slurm_gcp_plugins"></a> [enable\_slurm\_gcp\_plugins](#input\_enable\_slurm\_gcp\_plugins) | DEPRECATED: Slurm GCP plugins have been deprecated.<br/>Instead of 'max\_hops' plugin please use the 'placement\_max\_distance' nodeset property.<br/>Instead of 'enable\_vpmu' plugin please use 'advanced\_machine\_features.performance\_monitoring\_unit' nodeset property. | `any` | `null` | no |
 | <a name="input_enable_smt"></a> [enable\_smt](#input\_enable\_smt) | DEPRECATED: Use `advanced_machine_features.threads_per_core` instead. | `bool` | `null` | no |
 | <a name="input_endpoint_versions"></a> [endpoint\_versions](#input\_endpoint\_versions) | Version of the API to use (The compute service is the only API currently supported) | <pre>object({<br/> compute = string<br/> })</pre> | <pre>{<br/> "compute": "beta"<br/>}</pre> | no |
 | <a name="input_epilog_scripts"></a> [epilog\_scripts](#input\_epilog\_scripts) | List of scripts to be used for Epilog. Programs for the slurmd to execute<br/>on every node when a user's job completes.<br/>See https://slurm.schedmd.com/slurm.conf.html#OPT_Epilog. | <pre>list(object({<br/> filename = string<br/> content = optional(string)<br/> source = optional(string)<br/> }))</pre> | `[]` | no |
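
The deprecation note above names two replacements: `placement_max_distance` for the `max_hops` plugin and `advanced_machine_features.performance_monitoring_unit` for the `enable_vpmu` plugin. A minimal Python sketch of that mapping follows. The helper `migrate_plugin_settings`, the assumed shape of the legacy plugin value (`{"max_hops": {"max_hops": N}}`, `{"enable_vpmu": {...}}`), and the `"STANDARD"` placeholder are all illustrative assumptions and not part of this PR; check the nodeset module documentation for the accepted values.

```python
from copy import deepcopy
from typing import Any, Dict


def migrate_plugin_settings(plugins: Dict[str, Any], nodeset: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `nodeset` with deprecated plugin settings expressed
    as nodeset properties. Hypothetical helper, for illustration only."""
    migrated = deepcopy(nodeset)

    # ASSUMPTION: the legacy value looked like {"max_hops": {"max_hops": 2}}
    # and the hop count carries over unchanged as the placement max distance.
    max_hops = plugins.get("max_hops", {}).get("max_hops")
    if max_hops is not None:
        migrated["placement_max_distance"] = max_hops

    # ASSUMPTION: the presence of "enable_vpmu" means a PMU was requested.
    # "STANDARD" is a placeholder; pick the PMU type your workload needs.
    if "enable_vpmu" in plugins:
        features = migrated.setdefault("advanced_machine_features", {})
        features.setdefault("performance_monitoring_unit", "STANDARD")

    return migrated
```

The function copies the nodeset instead of mutating it, so the original settings stay available for comparison while migrating a blueprint.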
@@ -77,7 +77,6 @@ No modules.
 | <a name="input_enable_debug_logging"></a> [enable\_debug\_logging](#input\_enable\_debug\_logging) | Enables debug logging mode. Not for production use. | `bool` | `false` | no |
 | <a name="input_enable_external_prolog_epilog"></a> [enable\_external\_prolog\_epilog](#input\_enable\_external\_prolog\_epilog) | Automatically enable a script that will execute prolog and epilog scripts<br/>shared by NFS from the controller to compute nodes. Find more details at:<br/>https://github.com/GoogleCloudPlatform/slurm-gcp/blob/v5/tools/prologs-epilogs/README.md | `bool` | `false` | no |
 | <a name="input_enable_hybrid"></a> [enable\_hybrid](#input\_enable\_hybrid) | Enables use of hybrid controller mode. When true, controller\_hybrid\_config will<br/>be used instead of controller\_instance\_config and will disable login instances. | `bool` | `false` | no |
-| <a name="input_enable_slurm_gcp_plugins"></a> [enable\_slurm\_gcp\_plugins](#input\_enable\_slurm\_gcp\_plugins) | Enables calling hooks in scripts/slurm\_gcp\_plugins during cluster resume and suspend. | `any` | `false` | no |
 | <a name="input_endpoint_versions"></a> [endpoint\_versions](#input\_endpoint\_versions) | Version of the API to use (The compute service is the only API currently supported) | <pre>object({<br/> compute = string<br/> })</pre> | <pre>{<br/> "compute": null<br/>}</pre> | no |
 | <a name="input_epilog_scripts"></a> [epilog\_scripts](#input\_epilog\_scripts) | List of scripts to be used for Epilog. Programs for the slurmd to execute<br/>on every node when a user's job completes.<br/>See https://slurm.schedmd.com/slurm.conf.html#OPT_Epilog. | <pre>list(object({<br/> filename = string<br/> content = optional(string)<br/> source = optional(string)<br/> }))</pre> | `[]` | no |
 | <a name="input_extra_logging_flags"></a> [extra\_logging\_flags](#input\_extra\_logging\_flags) | The only available flag is `trace_api` | `map(bool)` | `{}` | no |
@@ -43,15 +43,14 @@ locals {
   tp = "${local.bucket_dir}/" # prefix to trim from the bucket path to get a "file name"

   config = {
-    enable_slurm_gcp_plugins = var.enable_slurm_gcp_plugins
-    enable_bigquery_load     = var.enable_bigquery_load
-    cloudsql_secret          = var.cloudsql_secret
-    cluster_id               = random_uuid.cluster_id.result
-    project                  = var.project_id
-    slurm_cluster_name       = var.slurm_cluster_name
-    bucket_path              = local.bucket_path
-    enable_debug_logging     = var.enable_debug_logging
-    extra_logging_flags      = var.extra_logging_flags
+    enable_bigquery_load = var.enable_bigquery_load
+    cloudsql_secret      = var.cloudsql_secret
+    cluster_id           = random_uuid.cluster_id.result
+    project              = var.project_id
+    slurm_cluster_name   = var.slurm_cluster_name
+    bucket_path          = local.bucket_path
+    enable_debug_logging = var.enable_debug_logging
+    extra_logging_flags  = var.extra_logging_flags

     # storage
     disable_default_mounts = var.disable_default_mounts
@@ -44,8 +44,6 @@
 from util import lookup, NSDict
 import tpu

-import slurm_gcp_plugins
-
 log = logging.getLogger()

 PLACEMENT_MAX_CNT = 1500
@@ -202,14 +200,6 @@ def create_instances_request(nodes: List[str], placement_group: Optional[str], e
             targetShape = nodeset.zone_target_shape,
         )

-    if lookup().cfg.enable_slurm_gcp_plugins:
-        slurm_gcp_plugins.pre_instance_bulk_insert(
-            lkp=lookup(),
-            nodes=nodes,
-            placement_group=placement_group,
-            request_body=body,
-        )
-
     req = api_method(
         project=lookup().project,
         body=body,
@@ -453,10 +443,7 @@ def create_placement_request(pg_name: str, region: str, max_distance: Optional[i
             "maxDistance": max_distance
         },
     }
-    if lookup().cfg.enable_slurm_gcp_plugins:
-        slurm_gcp_plugins.pre_placement_group_insert(
-            lkp=lookup(), pg_name=pg_name, region=region, request_body=config
-        )
+
     request = lookup().compute.resourcePolicies().insert(
         project=lookup().project, region=region, body=config
     )
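
With the `pre_placement_group_insert` hook removed, the request body no longer passes through a plugin before the insert call, so `max_distance` has to reach the body directly. A minimal standalone sketch of that idea follows. `build_placement_body` is a hypothetical helper, not the function in this PR; the `groupPlacementPolicy` and `collocation` keys are assumed from the Compute Engine resource policy schema, since the hunk above only shows the `maxDistance` entry.

```python
from typing import Any, Dict, Optional


def build_placement_body(pg_name: str, region: str,
                         max_distance: Optional[int] = None) -> Dict[str, Any]:
    """Build a resource policy body for a compact placement group.
    Hypothetical helper; key names other than maxDistance are assumptions."""
    policy: Dict[str, Any] = {"collocation": "COLLOCATED"}

    # Design choice of this sketch: omit the key when no distance is set.
    # The code in the diff passes max_distance through unconditionally.
    if max_distance is not None:
        policy["maxDistance"] = max_distance

    return {
        "name": pg_name,
        "region": region,
        "groupPlacementPolicy": policy,
    }
```

Building the body in one pure function makes the effect of `max_distance` easy to check without calling the Compute API.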
This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.