Add Azure Batch VM size guidance
adamrtalbot committed Dec 2, 2024
1 parent 17081e1 commit 139a8c8
Showing 2 changed files with 109 additions and 67 deletions.
@@ -177,69 +177,12 @@ Create a Batch Forge compute environment:

## Manual

It is possible to set up Seqera Platform to use a pre-existing Azure Batch pool. This allows the use of more advanced Azure Batch features, such as custom images and private networking.
This section is for users with a pre-configured Batch pool. This requires an existing Azure Batch account with an existing pool.

:::caution
Your Seqera compute environment uses resources that you may be charged for in your Azure account. See [Cloud costs](../monitoring/cloud-costs.mdx) for guidelines to manage cloud resources effectively and prevent unexpected costs.
:::

**Create a Nextflow compatible Azure Batch pool**

Use the default settings for any option not mentioned below.

1. **Account**: You must have an existing Azure Batch account. Ideally, you have already confirmed that you can run an Azure Batch task within this account. Any type of account is compatible.
1. **Quota**: Check that you have sufficient quota for the number of pools, jobs, and vCPUs per series. See [Azure Batch service quotas and limits][az-batch-quotas] for more information.
1. On the Azure Batch page of the Azure Portal, select **Pools** and then **+ Add**.
1. **Name**: Enter a Pool ID and Display Name. The ID is the value you will refer to in Seqera Platform and Nextflow.
1. **Identity**: Select **User assigned** to use a managed identity for the pool. Click **Add** under User-assigned managed identity and select the Managed Identity with the correct permissions to the storage account and Batch account.
1. **Operating System**: Any Linux-based image can be used here, but we recommend a Microsoft Azure Batch provided image. Note that there are two generations of Azure Virtual Machine images and certain VM series are only available in one generation. See [Azure Virtual Machine series][az-vm-gen] for more information. For default settings, select the following:
- **Publisher**: `microsoft-azure-batch`
- **Offer**: `ubuntu-server-container-rdma`
- **Sku**: `20.04 LTS`
- **Security type**: `standard`
1. **OS disk storage account type**: Certain VM series only support a specific storage account type. See [Azure managed disk types][az-disk-type] and [Azure Virtual Machine series][az-vm-gen] for more information. In general, a VM series with the suffix *s* will support *Premium LRS* storage account type, e.g. a `standard_e16ds_v5` will support `Premium_LRS` but a `standard_e16d_v5` will not. Premium LRS will offer the best performance.
1. **OS disk size**: The size of the OS disk in GB. This needs to be sufficient to hold every Docker container the VM will run, plus any logging or further files. If you are not using a machine with attached storage, you will need to increase this for task files (see VM type below). If you are using a machine with attached storage, you can leave this at the OS default size.
1. **Container configuration**: Container configuration must be turned on. Do this by switching it from **None** to **Custom**. The type is "Docker compatible", which should be the only available option. This alone is sufficient for the VM to use Docker images, but further options are available. Under **Container image names**, you can add containers for the VM to pull at startup time. Add a list of fully qualified Docker URIs, e.g. `quay.io/seqeralabs/nf-launcher:j17-23.04.2`. Under **Container registries**, you can add any container registries that require additional authentication. Click **Container registries** then **Add**. Here you can add a registry username, password, and registry server. If you attached the Managed Identity earlier, you can select it as the authentication method and avoid using a username and password.
1. **VM size**: This is the size of the VM. See [the section on Azure VM sizes][az-vm-sizes] for more information.
1. **Scale**: Azure Node pools can be fixed in size or autoscale based on a formula. We recommend autoscaling so that your resources scale down to zero when not in use. Click **Auto scale**. Change the **AutoScale evaluation interval** to 5 minutes, which is the minimum period between evaluations of the autoscale formula. For **Formula**, you can use any valid formula; see the [Azure Batch autoscale documentation][az-batch-autoscale] for more information. This is the default autoscaling formula, with a maximum of 1 VM:
```
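// Sampling window used below. Assumed here to match the 5-minute evaluation interval.
interval = TimeInterval_Minute * 5;
// Compute the target nodes based on pending tasks.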
// $PendingTasks == The sum of $ActiveTasks and $RunningTasks
$samples = $PendingTasks.GetSamplePercent(interval);
$tasks = $samples < 70 ? max(0, $PendingTasks.GetSample(1)) : max( $PendingTasks.GetSample(1), avg($PendingTasks.GetSample(interval)));
$targetVMs = $tasks > 0 ? $tasks : max(0, $TargetDedicatedNodes/2);
targetPoolSize = max(0, min($targetVMs, 1));
// Scale up/down as per pending tasks, capped at 1 node.
$TargetDedicatedNodes = targetPoolSize;
$NodeDeallocationOption = taskcompletion;
```
1. **Start task**: This is the task that will run on each VM when it joins the pool. It can be used to install additional software on the VM. When using Batch Forge, it is used to install `azcopy` for staging files onto and off the node. Select **Enabled** and add the following command line to install `azcopy`:

```shell
bash -c "chmod +x azcopy && mkdir -p $AZ_BATCH_NODE_SHARED_DIR/bin/ && cp azcopy $AZ_BATCH_NODE_SHARED_DIR/bin/"
```

Click **Resource files** then select **Http url**. For the URL, add `https://nf-xpack.seqera.io/azcopy/linux_amd64_10.8.0/azcopy` and for File path, enter `azcopy`. Every other setting can be left at its default.

:::note
When not using Fusion, every node **must** have `azcopy` installed.
:::

1. **Task Slots**: Set task slots to the number of vCPUs the machine has, e.g. select `4` for a `Standard_D4_v3` VM size.
1. **Task scheduling policy**: This can be set to `Pack` or `Spread`. `Pack` will attempt to schedule tasks from the same job on the same VM, while `Spread` will attempt to distribute tasks evenly across VMs.
1. **Virtual Network**: If using a virtual network, you can select it here. Be sure to select the correct virtual network and subnet. Be aware the virtual machines are required to:
- Pull containers from the relevant container registry (e.g. quay.io, docker.io)
- Copy data from and to Azure Storage using `azcopy`
- Communicate with the head node (running Nextflow) and Seqera Platform to relay log files and information.
As such, very restrictive networking may prevent pipelines running successfully.
1. **Mount configuration**: Nextflow *only* supports Azure File Shares. Select `Azure Files Share`, then add the following details:
- **Source**: Use the URL in the format `https://${accountName}.file.core.windows.net/${fileShareName}`
- **Relative mount path**: The path to the directory where the file share will be mounted on the VM.
- Add the **Storage account name** and **Storage account key** (managed identity is not supported).
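
Two rules of thumb from the steps above (an *s* suffix suggests Premium LRS support, and task slots should equal the vCPU count) can be sketched in Python. This is only an illustrative heuristic: the function name `describe_vm_size` and the regular expression are assumptions made for this example, not part of Seqera Platform, Nextflow, or any Azure SDK, and the pattern does not cover every Azure naming variant (for example, older names such as `Standard_DS3_v2`). Always confirm against the Azure VM series documentation.

```python
import re

# Heuristic pattern for names such as "Standard_D4_v3" or "standard_e16ds_v5":
# family letters, vCPU count, optional feature letters, optional suffix.
# This pattern is an assumption for illustration, not an official grammar.
_VM_SIZE = re.compile(r"^standard_([a-z]+?)(\d+)([a-z]*)(_.+)?$", re.IGNORECASE)


def describe_vm_size(vm_size: str) -> dict:
    """Derive suggested pool settings from a VM size name (rule of thumb only)."""
    match = _VM_SIZE.match(vm_size.strip())
    if match is None:
        raise ValueError(f"Unrecognised VM size name: {vm_size!r}")
    _family, vcpus, features, _suffix = match.groups()
    return {
        # Task slots: one slot per vCPU, as recommended in the steps above.
        "task_slots": int(vcpus),
        # Premium LRS: suggested by an "s" among the feature letters.
        "premium_lrs": "s" in features.lower(),
    }


if __name__ == "__main__":
    for name in ("Standard_D4_v3", "standard_e16ds_v5", "standard_e16d_v5"):
        print(name, describe_vm_size(name))
```

The helper only inspects the name, so it cannot tell whether a size is actually available in your region or Batch account.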

Leave the node pool to start and create a single Azure VM. Monitor the pool to make sure the VM starts correctly. If it raises an error, check the error and correct any mistakes, which may require you to create a new Azure node pool.
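
To reason about how the autoscale formula in the **Scale** step behaves before deploying it, its arithmetic can be mirrored in plain Python. This is a rough sketch, not Azure Batch code: the function name `target_dedicated_nodes` and its parameters are hypothetical stand-ins for the service-defined variables (`$PendingTasks` samples and `$TargetDedicatedNodes`), and the service's own conversion of the result to a whole number of nodes is not modelled.

```python
def target_dedicated_nodes(sample_percent, latest_pending, window_samples,
                           current_target, max_vms=1):
    """Mirror the arithmetic of the autoscale formula (illustrative only).

    sample_percent -- stand-in for $PendingTasks.GetSamplePercent(interval)
    latest_pending -- stand-in for $PendingTasks.GetSample(1)
    window_samples -- stand-in for $PendingTasks.GetSample(interval)
    current_target -- stand-in for the current $TargetDedicatedNodes
    max_vms        -- the cap, 1 in the formula above
    """
    if sample_percent < 70 or not window_samples:
        # Too few samples: trust only the most recent one.
        tasks = max(0, latest_pending)
    else:
        # Enough samples: take the larger of the latest value and the average.
        average = sum(window_samples) / len(window_samples)
        tasks = max(latest_pending, average)
    # With no pending tasks, halve the current target to scale down gradually.
    target_vms = tasks if tasks > 0 else max(0, current_target / 2)
    return max(0, min(target_vms, max_vms))


if __name__ == "__main__":
    # Three pending tasks but a cap of one VM.
    print(target_dedicated_nodes(50, 3, [], 0))
    # No pending tasks and nothing running: stay at zero.
    print(target_dedicated_nodes(100, 0, [0, 0], 0))
```

Raising `max_vms` in the sketch corresponds to changing the constant `1` in `min($targetVMs, 1)` in the formula.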

**Create a manual Seqera Azure Batch compute environment**

1. In a workspace, select **Compute Environments > New Environment**.
@@ -299,10 +242,6 @@ Leave the node pool to start and create a single Azure VM. Monitor to make sure
[az-learn-jobs]: https://learn.microsoft.com/en-us/azure/batch/jobs-and-tasks
[az-create-rg]: https://portal.azure.com/#create/Microsoft.ResourceGroup
[az-create-storage]: https://portal.azure.com/#create/Microsoft.StorageAccount-ARM
[az-vm-gen]: https://learn.microsoft.com/en-us/azure/virtual-machines/generation-2
[az-disk-type]: https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types
[az-batch-autoscale]: https://learn.microsoft.com/en-us/azure/batch/batch-automatic-scaling
[az-file-shares]: https://www.nextflow.io/docs/latest/azure.html#azure-file-shares

[wave-docs]: https://docs.seqera.io/wave
[nf-fusion-docs]: https://www.nextflow.io/docs/latest/fusion.html