Fine-Tuning Llama-2 Models - A Dashboard Experience

Learn how to fine-tune a Llama-2 model using the Azure Machine Learning (AML) Studio - UI Dashboard.

Prerequisites

  • Learn the what, why, and when to use fine-tuning.
  • An Azure subscription.
  • Access to AML Service.
  • An AML resource created.
  • Prepare Training and Validation datasets:
    • at least 50 high-quality samples (preferably 1,000s) are required;
    • each file must be formatted as a JSON Lines (JSONL) document with UTF-8 encoding (see the sketch after this list).
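
For reference, here is a minimal sketch that writes a tiny UTF-8 encoded JSONL training file in the expected one-object-per-line layout. The prompt and completion field names are illustrative assumptions; you map the actual column names explicitly in Step 4.

```python
import json

# Each training sample is one JSON object per line, with an input field
# ("prompt") and an output field ("completion"). Field names are illustrative.
samples = [
    {"prompt": "Summarize: The meeting covered Q3 revenue and hiring plans.",
     "completion": "Q3 revenue and hiring plans were discussed."},
    {"prompt": "Translate to French: Good morning.",
     "completion": "Bonjour."},
]

# Write the samples as UTF-8 encoded JSON Lines.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for sample in samples:
        f.write(json.dumps(sample, ensure_ascii=False) + "\n")
```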

Step 1: Open the Model catalog wizard

  1. Open Azure Machine Learning Studio at https://ml.azure.com/ and sign in with credentials that have access to your AML resource. During the sign-in workflow, select the appropriate directory, Azure subscription, and AML resource.

  2. In AML Studio, browse to the Model catalog pane.

    Screenshot of AML Model Catalog pane.
  3. In the search box, type llama2.
    Screenshot of AML Model Catalog pane, searching for llama2 in the search box.

Step 2: Start the fine-tuning process

Suppose you want to fine-tune the llama-2-7b model for a text-generation task (the process is similar for chat-completion tasks).

  1. The first step is to press the Fine-tune button to start the fine-tuning process.
    Screenshot of AML Model Catalog, with llama2-7b model description page.
  2. The Fine-tune Llama-2-7b blade lets you specify the task type (choose Text generation in our case), training data, validation data (optional), test data (optional), and an Azure ML compute cluster.
    Screenshot of AML Model Catalog pane, for llama2-7b model, opening the Fine-Tune blade.

Step 3: Create an Azure ML compute cluster

To run the fine-tuning job, you need an AML compute cluster (if you haven't created one before).

  1. The + New button at the bottom of the blade opens the Create compute cluster pane, where you need to specify the Location (e.g. West Europe), Virtual machine tier (Dedicated), Virtual machine type (GPU) and Virtual machine size.
    Screenshot of AML create compute cluster pane, with location on west europe and the gpu type of nvidia ND40 series machine.
    Note that only NVIDIA ND40 and ND96 VMs are supported for fine-tuning at the moment. If you can't find them in the list, try choosing another Location or request additional quota.
  2. Give the compute a name, and specify the minimum (usually 0) and maximum (1 for testing purposes) number of nodes.
    Screenshot of AML create compute cluster pane, with gpu advanced config pane.
    Click Next to start the creation process. This may take a couple of minutes.
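
If you prefer to script this step, a minimal sketch using the Azure ML Python SDK v2 (azure-ai-ml) follows. The workspace identifiers, cluster name, VM SKU, and region are placeholders; adjust them to the ND-series SKU and location for which you actually have quota.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import AmlCompute
from azure.identity import DefaultAzureCredential

# Connect to the AML workspace (placeholder identifiers).
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<aml-workspace>",
)

# Define a dedicated GPU cluster that scales between 0 and 1 nodes.
gpu_cluster = AmlCompute(
    name="llama2-ft-cluster",        # hypothetical cluster name
    size="Standard_ND40rs_v2",       # assumed ND40-series GPU SKU
    tier="Dedicated",
    min_instances=0,
    max_instances=1,
    location="westeurope",
)

# Create (or update) the cluster and wait for the operation to finish.
ml_client.begin_create_or_update(gpu_cluster).result()
```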

Step 4: Choose your Training data

The next step is to select your training data, either from a previously uploaded dataset or by uploading a new one.

    Screenshot of AML fine tuning - choose training / validation / test data.

You also need to specify the 'prompt' (i.e. input) and the 'completion' (i.e. output) columns to guide the fine-tuning process.

    Screenshot of AML fine tuning - choose training / validation / test data, and map the prompt and completion columns.
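
Before uploading, a quick sanity check along the following lines can help. It assumes pandas, a local copy of the training file, and the prompt/completion field names from the earlier sketch; adjust to your actual column names.

```python
import pandas as pd

# Load the JSON Lines training file into a DataFrame.
df = pd.read_json("train.jsonl", lines=True)

# Confirm the columns you plan to map as 'prompt' and 'completion' exist
# and contain non-empty text.
print(df.columns.tolist())
print(df[["prompt", "completion"]].head())
assert df["prompt"].str.strip().astype(bool).all(), "empty prompts found"
assert df["completion"].str.strip().astype(bool).all(), "empty completions found"
```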

Step 5 (Optional): Choose your Validation data

You can select your validation data by following a procedure similar to the one for the training data, or leave the default setting (an automatic split of the training data will be used for validation).

Step 6 (Optional): Choose your Test data

You can select your test data by following a procedure similar to the one for the training data, or leave the default setting (an automatic split of the training data will be used for testing).

Step 7: Submit your fine-tuning job

Now you are ready: click the Finish button at the bottom of the Fine-tune Llama-2-7b blade to start the actual fine-tuning process. Depending on the size of your training data, this can take anywhere from minutes to hours.

    Screenshot of AML fine tuning - fine tuning jobs running.

After the fine-tuning job finishes, its Status becomes Completed.

    Screenshot of AML fine tuning - fine tuning jobs completed.
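
If you prefer to monitor the job from code rather than the Studio UI, a minimal sketch with the Azure ML Python SDK v2 follows. The workspace identifiers and job name are placeholders; copy the actual job name from the Jobs pane.

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<aml-workspace>",
)

# Look up the fine-tuning job and print its current status (e.g. Running, Completed).
job = ml_client.jobs.get(name="<fine-tuning-job-name>")
print(job.status)

# Optionally stream the job logs until it finishes.
ml_client.jobs.stream(name="<fine-tuning-job-name>")
```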

Step 8: Deploy the fine-tuned model

Before deploying the model, you first need to register it.

Go to the Assets > Models pane, select the newly fine-tuned model, and click + Register.

    Screenshot of AML fine tuning - register the fine-tuned model.

After that, click the + Deploy button to open the Deployment blade, where you need to specify the Virtual machine (preferably an NVIDIA NC or ND VM series), Instance count, Endpoint name, and Deployment name.

    Screenshot of AML fine tuning - deploy the fine-tuned model.

Click the Deploy button at the bottom to start the actual deployment process.

This may take a while; wait until both Provisioning states (for the endpoint and the deployment) show Succeeded.

    Screenshot of AML fine tuning - deploy the fine-tuned model - succeeded.
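
If you prefer to script the deployment, a minimal SDK v2 sketch is shown below. The endpoint name, deployment name, registered model reference, and instance type are placeholders; use the values from your own workspace.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<aml-workspace>",
)

# Create a managed online endpoint (hypothetical name).
endpoint = ManagedOnlineEndpoint(name="llama2-7b-ft-endpoint")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

# Deploy the model registered in the previous step onto a GPU instance.
deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="llama2-7b-ft-endpoint",
    model="azureml:<registered-model-name>:<version>",  # the model registered above
    instance_type="Standard_NC24ads_A100_v4",            # assumed NC-series GPU SKU
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()
```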

Step 9: Test and use a deployed model

You can directly test the deployed model via the handy test playground.

    Screenshot of testing a deployed fine-tuned model via the handy test playground.

You can also consume the API using a popular programming language such as Python.

    Screenshot of testing a deployed fine-tuned model via API calls.
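
As a starting point, here is a minimal Python sketch of calling the deployed endpoint over REST. The scoring URI, API key, and payload shape are assumptions; copy the real values and the exact request format from the endpoint's Consume tab in AML Studio.

```python
import json
import urllib.request

# Placeholder scoring URI and key; take the real ones from the Consume tab.
scoring_uri = "https://<endpoint-name>.<region>.inference.ml.azure.com/score"
api_key = "<endpoint-key>"

# Assumed request payload for a text-generation endpoint; verify the exact
# schema against the sample request shown in AML Studio.
payload = {
    "input_data": {
        "input_string": ["Write a short product description for a hiking backpack."],
        "parameters": {"max_new_tokens": 128, "temperature": 0.7},
    }
}

request = urllib.request.Request(
    scoring_uri,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    },
)

# Send the request and print the model's response.
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read()))
```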

Step 10 (Optional): Clean up your deployment resources

When you're done with your custom model, you can delete the deployed endpoint, model, and the compute cluster.

You can also delete the training (and validation and test) files you uploaded to the service, if needed.
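
If you want to script the cleanup, a minimal SDK v2 sketch follows; the resource names are the placeholders used in the earlier sketches, so substitute the names you actually created.

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<aml-workspace>",
)

# Deleting the endpoint also removes its deployments.
ml_client.online_endpoints.begin_delete(name="llama2-7b-ft-endpoint").result()

# Delete the compute cluster created in Step 3.
ml_client.compute.begin_delete(name="llama2-ft-cluster").result()

# Archive the registered fine-tuned model if it is no longer needed.
ml_client.models.archive(name="<registered-model-name>", version="<version>")
```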