
Adding Job, New TF2.7 conda and ads v2.5.6 #52

Merged
merged 16 commits into from
Jan 25, 2022
5 changes: 5 additions & 0 deletions labs/MLSummit21/README.md
Original file line number Diff line number Diff line change
@@ -38,6 +38,10 @@ In [Lab 3](./lab-3-python-model.md) you build, train, and evaluate a simple scik

In [Lab 4](./lab-4-model-catalog.md) we walk you through the metadata that is available in the model catalog as well as some of the key functionalities.

## (Optional) Lab 4.5: Executing a Training Job

In [Lab 4.5](./lab-45-training-job.md) we walk you through the process of executing a [Data Science Job](https://docs.oracle.com/en-us/iaas/data-science/using/jobs-about.htm) from a notebook session. It uses the same training script as Lab 3.

## Lab 5: Deploying Your Model

In [Lab 5](./lab-5-model-deploy.md) you deploy your model as an HTTP endpoint using the Model Deployment feature of OCI Data Science. Two different approaches are shown: through the ADS library and directly in the OCI console.
@@ -53,3 +57,4 @@ In [Lab 7](./lab-7-wrap.md) we wrap up the workshop.


Enjoy the workshop :) !

Binary file modified labs/MLSummit21/images/confirm-kernel.png
Binary file modified labs/MLSummit21/images/copy-install-tf-env-command.png
Binary file modified labs/MLSummit21/images/select-tf-env.png
2 changes: 2 additions & 0 deletions labs/MLSummit21/lab-0-tenancy-setup.md
@@ -15,6 +15,8 @@ In this first lab, you will:

Sign up [here](https://www.oracle.com/cloud/free/).

:exclamation: :exclamation: :exclamation: **If you already have a tenancy make sure that you have not exhausted the Free Trial credits**. If you have exhausted the credits or your tenancy is older than 30 days, you will only have access to "Always Free" Services. **OCI Data Science is not yet among the "Always Free" offerings.** You will have to convert your tenancy to a paid tenancy or use a different tenancy.

## **STEP 2:** Run the Data Science Stack Template

We have created a Terraform script that can be executed through the Resource Manager Stack resource. This Terraform script creates the basic user groups, policies, dynamic groups, and networking (VCN and subnets) required to create projects and notebook sessions. The Stack also allows you to optionally launch a notebook session after the setup is completed. We recommend that you create the notebook session.
22 changes: 7 additions & 15 deletions labs/MLSummit21/lab-1-notebook-setup.md
@@ -61,33 +61,25 @@ In this lab you are creating a notebook session. **This step is optional if the
1. You will notice that the notebook session emits four metrics (CPU Utilization, Memory Utilization, Network Receive and Transmit Bytes) and is integrated with OCI Monitoring. In a separate lab you will learn how to trigger alarms when those metrics reach certain pre-defined thresholds.
![](./images/notebook-monitoring.png)

## **STEP 2**: Copy The Content of this Repository to Your Notebook Session

1. Download a zip archive of this repository to your laptop/local machine. Make sure that you select the **master** branch

![](./images/github-zip-repo.png)
## **STEP 2**: Clone this Repository to Your Notebook Session

1. Open your notebook session. Click on "Open".

![](./images/ns-open.png)

1. Drag and drop the zip archive in the JupyterLab file browser.

![](./images/drag-and-drop-zip-file.png)

1. Open a Terminal window.

![](./images/open-terminal.png)

1. Execute the following command in the terminal window:

```
unzip oci-data-science*.zip
git clone https://github.com/oracle/oci-data-science-ai-samples.git lab
```

1. You should see the `lab` folder in the JupyterLab file browser window on the left. The content of this lab is under:
```
/home/datascience/lab/labs/MLSummit21/
```
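As a quick sanity check, you can reconstruct the path above from a notebook cell (a sketch that assumes the default notebook-session home directory, `/home/datascience`, and the `lab` folder name used in the clone command):

```python
from pathlib import PurePosixPath

# Where the clone command places this lab's content
# (assumes the notebook session's default home directory).
home = PurePosixPath("/home/datascience")
lab_dir = home / "lab" / "labs" / "MLSummit21"
print(lab_dir)  # /home/datascience/lab/labs/MLSummit21
```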
This command will unzip the file.

1. Open the newly created folder and navigate to this lab folder.

Alternatively for Step 2, you can use `git clone` command in the terminal window of JupyterLab to clone the content of this repo. Make sure you create private/public ssh key pairs for this and register the public key in your github user settings.

**Congratulations! You are now ready to proceed to the next lab.**
4 changes: 2 additions & 2 deletions labs/MLSummit21/lab-2-install-conda.md
@@ -54,7 +54,7 @@ Before you can use a conda environment in your notebook session, you need to ins
1. In the *Launcher* tab, click **Environment Explorer**
![](./images/notebook_launcher.png)

1. In the Environment Explorer tab, select the **Data Science Conda Environment** filter button, select **CPU** architecture filter, then scroll down until you find the **TensorFlow 2.6 for CPU Python 3.7** conda. (If you see no results, use the refresh button on the right side of the filter bar of the Environment Explorer.)
1. In the Environment Explorer tab, select the **Data Science Conda Environment** filter button, select **CPU** architecture filter, then scroll down until you find the **TensorFlow 2.7 for CPU on Python 3.7** conda. (If you see no results, use the refresh button on the right side of the filter bar of the Environment Explorer.)
![](./images/select-tf-env.png)

1. Click on the caret on the right side, copy the install command
@@ -65,7 +65,7 @@ Before you can use a conda environment in your notebook session, you need to ins
1. **Paste the command** into the terminal window and hit **Return** to execute it.
The command that you previously copied is:
```
odsc conda install -s tensorflow26_p37_cpu_v1
odsc conda install -s tensorflow27_p37_cpu_v1
```

1. You will receive a prompt related to what version number you want. Press `Enter` to select the default.
5 changes: 2 additions & 3 deletions labs/MLSummit21/lab-3-python-model.md
@@ -31,14 +31,13 @@ In this lab you will:

A notebook has been prepared containing all the necessary Python code to explore the data, train the model, evaluate the model, and store it in the model catalog. This notebook has already been configured with a conda environment.

1. In the file browser, navigate to the directory **~/oci-data-science-ai-samples-master/labs/MLSummit21/Notebooks/**. This directory was created in Lab 1 when you unzip this repository in your notebook session.
1. In the file browser, navigate to the directory **/home/datascience/lab/labs/MLSummit21/Notebooks/**. This directory was created in Lab 1 when you cloned this repository into your notebook session.

1. Open the notebook **1-model-training.ipynb** (double-click on it). A new tab opens in the workspace on the right.

Notice in the upper right corner of the notebook tab, it displays the name of the conda environment being used by this notebook. Confirm that the name you see the slugname of the TensorFlow conda environment (`tensorflow26_p37_cpu_v1`)
Notice that the upper right corner of the notebook tab displays the name of the conda environment used by this notebook. Confirm that the name you see is the slug name of the TensorFlow conda environment (`tensorflow27_p37_cpu_v1`)

![](./images/confirm-kernel.png)

1. Now you will work in the notebook. Scroll through each cell and read the explanations. When you encounter a `code` cell, execute it (using **shift + enter**) and view the results. For executable cells, the "[ ]" changes to "[\*]" while executing, then to a number, such as "[1]", when complete. (If you run short on time, you can use the *Run* menu to run the remaining cells and then review the results.)

**You can proceed to the next lab.**
52 changes: 52 additions & 0 deletions labs/MLSummit21/lab-45-training-job.md
@@ -0,0 +1,52 @@
# Lab 4.5 - Executing a Training Job

## Introduction

[Data Science Jobs](https://docs.oracle.com/en-us/iaas/data-science/using/jobs-about.htm) let you run custom tasks for any use case you have, such as data preparation, model training, hyperparameter tuning, batch inference, and so on.

Using jobs, you can:

* Run machine learning (ML) or data science tasks outside of your notebook sessions in JupyterLab.
* Operationalize discrete data science and machine learning tasks as reusable runnable operations.
* Automate your typical MLOps or CI/CD pipeline.
* Execute batches or workloads triggered by events or actions.
* Run batch, mini-batch, or distributed batch inference.

After the steps are completed, you can use jobs to automate data exploration, model training, deployment, and testing. A single change in data preparation or model training, or an experiment with different hyperparameters, can be run as a job and tested independently.

Jobs consist of two parts, a job and a job run:

### Job
A job is a template that describes the task. It contains elements like the job artifact, which is immutable and can't be modified after it's uploaded to the job. The job also contains information about the Compute shape the job runs on, logging options, block storage, and other options. You can add environment variables or CLI arguments to a job so that they apply to all of your future job runs. You can override these variables and arguments in individual job runs.

You can edit the Compute shape in the job between job runs. For example, if you decide that you want to execute a job run on a more powerful shape, you can edit the job's Compute shape and then start a new job run.

### Job Run
A job run is the actual job processor. In each job run, you can override some of the job configuration, and most importantly the environment variables and CLI arguments. You can have the same job with several sequentially or simultaneously started job runs with different parameters. For example, you could experiment with how the same model training process performs by providing different hyperparameters.
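The job/job-run split described above can be illustrated with a small plain-Python sketch (a conceptual illustration only, not the OCI Data Science API; all class and variable names here are made up):

```python
class Job:
    """Immutable template: artifact, Compute shape, and default environment."""
    def __init__(self, artifact, shape, env=None):
        self.artifact = artifact
        self.shape = shape
        self.env = dict(env or {})

    def run(self, env_overrides=None):
        # Each run starts from the job's defaults and applies its overrides.
        merged = {**self.env, **(env_overrides or {})}
        return JobRun(self, merged)


class JobRun:
    """One execution of a job, with its own effective environment."""
    def __init__(self, job, env):
        self.job = job
        self.env = env


job = Job("train.py", "VM.Standard2.1", env={"LEARNING_RATE": "0.01"})
run_a = job.run()                          # uses the job's defaults
run_b = job.run({"LEARNING_RATE": "0.1"})  # hyperparameter experiment
print(run_a.env["LEARNING_RATE"], run_b.env["LEARNING_RATE"])  # 0.01 0.1
```

This mirrors the hyperparameter-experiment scenario above: the same job template backs both runs, and only the overridden environment differs.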

Estimated Lab Time: 10 minutes

## Objectives
In this lab, you will:
* Use ADS to define a Data Science Job
* Execute and monitor the progress of your Job Run.

## Prerequisites

* Successful completion of Labs 0, 1, 2, and 3.

## STEP 1: Execute the notebook `1.5-(optional)-model-training-job.ipynb`

A notebook has been prepared containing all the necessary Python code to train and save the same machine learning model as in Lab 3, but this time we will run the training script as a Data Science Job.
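The notebook defines the job through the ADS jobs API. A pseudocode-style sketch of what that definition looks like (class and method names are taken from the ADS documentation for the 2.5.x series; the OCIDs and shape are placeholders, and this only runs inside an OCI tenancy):

```python
from ads.jobs import Job, DataScienceJob, ScriptRuntime

job = (
    Job(name="attrition-training-job")  # hypothetical name
    .with_infrastructure(
        DataScienceJob()
        .with_compartment_id("ocid1.compartment.oc1..<placeholder>")
        .with_project_id("ocid1.datascienceproject.oc1..<placeholder>")
        .with_shape_name("VM.Standard2.1")  # placeholder shape
    )
    .with_runtime(
        ScriptRuntime()
        .with_source("train.py")  # the same training script as Lab 3
        .with_service_conda("tensorflow27_p37_cpu_v1")
    )
)

job.create()     # upload the immutable job template
run = job.run()  # start a job run
run.watch()      # stream the run's logs into the notebook
```

The actual OCIDs, script name, and shape used in the lab come from the notebook itself.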

1. In the file browser, navigate to the directory **/home/datascience/lab/labs/MLSummit21/Notebooks/**. This directory was created in Lab 1 when you cloned this repository into your notebook session.

1. Open the notebook **1.5-(optional)-model-training-job.ipynb** (double-click on it). A new tab opens in the workspace on the right.

Notice that the upper right corner of the notebook tab displays the name of the conda environment used by this notebook. Confirm that the name you see is the slug name of the TensorFlow conda environment (`tensorflow27_p37_cpu_v1`)

![](./images/confirm-kernel.png)

1. Now you will work in the notebook. Scroll through each cell and read the explanations. When you encounter a `code` cell, execute it (using **shift + enter**) and view the results. For executable cells, the "[ ]" changes to "[\*]" while executing, then to a number, such as "[1]", when complete. (If you run short on time, you can use the *Run* menu to run the remaining cells and then review the results.)

**Congratulations! You are now ready to proceed to the next lab.**
2 changes: 1 addition & 1 deletion labs/MLSummit21/lab-5-model-deploy.md
@@ -13,7 +13,7 @@ In this lab you will:

## **STEP 1:** Open and Run the Notebook `2-model-deployment.ipynb`

1. In the **~/oci-data-science-ai-samples-master/labs/MLSummit21/notebooks** directory of your notebook session, open the notebook `2-model-deployment.ipynb`
1. In the **/home/datascience/lab/labs/MLSummit21/notebooks** directory of your notebook session, open the notebook `2-model-deployment.ipynb`

1. Follow the instructions in the notebook

65 changes: 45 additions & 20 deletions labs/MLSummit21/notebooks/1-model-training.ipynb
@@ -37,7 +37,16 @@
"source": [
"Let's do all of the imports necessary to get this notebook working up here.\n",
"\n",
"**NOTE: Double-check that this notebook is running in the `tensorflow26_p37_cpu_v1` conda kernel**"
"**<font color='red'>NOTE: This notebook was run in the TensorFlow 2.7 for CPU (slug: `tensorflow27_p37_cpu_v1`) conda environment with ADS version 2.5.6. Upgrade your version of ADS (see cell below) and restart your kernel.</font>**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#!pip install oracle-ads==2.5.6"
]
},
{
@@ -112,6 +121,18 @@
"print(ads.__version__)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The code cell below will work if you are in the **Ashburn** region. \n",
"\n",
"The file is also available publicly at this URL: \n",
"https://objectstorage.us-ashburn-1.oraclecloud.com/n/bigdatadatasciencelarge/b/hosted-ds-datasets/o/synthetic%2Forcl_attrition.csv\n",
"\n",
"You can download it and drop it into the file browser of JupyterLab."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -122,7 +143,7 @@
"namespace = \"bigdatadatasciencelarge\"\n",
"employees = DatasetFactory.open(\n",
" \"oci://{}@{}/synthetic/orcl_attrition.csv\".format(bucket_name, namespace), \n",
" target=\"Attrition\").set_positive_class('Yes')"
" target=\"Attrition\", storage_options={'config':{},'region':'us-ashburn-1'}).set_positive_class('Yes')"
]
},
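The `storage_options` argument added in this cell follows the fsspec/ocifs convention: `config` holds OCI authentication settings (empty here) and `region` pins the Object Storage endpoint; the exact auth behavior depends on the environment. A self-contained sketch of the values being built, using the same names as the cell above (plain Python, no OCI access needed):

```python
bucket_name = "hosted-ds-datasets"
namespace = "bigdatadatasciencelarge"

# Object Storage URI in the oci://<bucket>@<namespace>/<object> form
uri = "oci://{}@{}/synthetic/orcl_attrition.csv".format(bucket_name, namespace)

# Empty config dict plus an explicit region, as passed to DatasetFactory.open
storage_options = {"config": {}, "region": "us-ashburn-1"}

print(uri)  # oci://hosted-ds-datasets@bigdatadatasciencelarge/synthetic/orcl_attrition.csv
```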
{
@@ -146,7 +167,7 @@
"metadata": {},
"outputs": [],
"source": [
"employees.show_in_notebook()"
"#employees.show_in_notebook()"
]
},
{
@@ -155,7 +176,7 @@
"metadata": {},
"outputs": [],
"source": [
"employees.show_corr()"
"#employees.show_corr()"
]
},
{
@@ -359,20 +380,23 @@
"from ads.common.model_artifact import ModelArtifact\n",
"from ads.common.model_export_util import prepare_generic_model\n",
"import joblib \n",
"import os\n",
"\n",
"# Path to artifact directory for my sklearn model: \n",
"sklearn_path = \"./model-artifact/\"\n",
"model_artifact_location = os.path.expanduser('./model-artifact/')\n",
"os.makedirs(model_artifact_location, exist_ok=True)\n",
"\n",
"# Creating a joblib pickle object of my random forest model: \n",
"joblib.dump(sk_model, os.path.join(model_artifact_location, \"model.joblib\"))\n",
"\n",
"# Creating the artifact template files in the directory: \n",
"sklearn_artifact = prepare_generic_model(sklearn_path, \n",
" inference_conda_env=\"oci://service-conda-packs@id19sfcrra6z/service_pack/cpu/TensorFlow 2.6 for CPU Python 3.7/1.0/tensorflow26_p37_cpu_v1\",\n",
"sklearn_artifact = prepare_generic_model(model_artifact_location, \n",
" inference_conda_env=\"oci://service-conda-packs@id19sfcrra6z/service_pack/cpu/TensorFlow 2.7 for CPU on Python 3.7/1.0/tensorflow27_p37_cpu_v1\",\n",
" force_overwrite=True,\n",
" model='model.joblib',\n",
" use_case_type='BINARY_CLASSIFICATION',\n",
" X_sample=train.X,\n",
" y_sample=train.y)\n",
"\n",
"# Creating a joblib pickle object of my random forest model: \n",
"joblib.dump(sk_model, os.path.join(sklearn_path, \"model.joblib\"))"
" y_sample=train.y)"
]
},
{
Expand All @@ -392,8 +416,8 @@
"source": [
"#setting paths for artifact files that need to be modified: \n",
"\n",
"encoder_path = os.path.join(sklearn_path, \"dataframelabelencoder.py\")\n",
"score_path = os.path.join(sklearn_path, \"score.py\")\n",
"encoder_path = os.path.join(model_artifact_location, \"dataframelabelencoder.py\")\n",
"score_path = os.path.join(model_artifact_location, \"score.py\")\n",
"!cp dataframelabelencoder.py {encoder_path}"
]
},
@@ -466,8 +490,8 @@
" assert model is not None, \"Model is not loaded\"\n",
" X = pd.read_json(io.StringIO(data)) if isinstance(data, str) else pd.DataFrame.from_dict(data)\n",
" preds = model.predict(X).tolist()\n",
" #logger_pred.info(preds)\n",
" #logger_feat.info(X) \n",
"# logger_pred.info(preds)\n",
"# logger_feat.info(X) \n",
" return { 'prediction': preds }"
]
},
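The branching at the top of `predict()` above — accept either a JSON string or an already-parsed dict of columns — is a common input-normalization pattern for model servers. A self-contained sketch of just that step, using only the stdlib instead of pandas so it runs anywhere (`normalize_payload` is a hypothetical helper name):

```python
import io
import json

def normalize_payload(data):
    """Mirror predict()'s input handling: JSON string or dict of columns."""
    if isinstance(data, str):
        return json.load(io.StringIO(data))  # JSON text -> dict
    return dict(data)                        # already-parsed payload

# Both call styles yield the same column dict:
as_text = normalize_payload('{"Age": [35], "JobRole": ["Analyst"]}')
as_dict = normalize_payload({"Age": [35], "JobRole": ["Analyst"]})
print(as_text == as_dict)  # True
```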
@@ -498,7 +522,7 @@
"import sys \n",
"\n",
"# add the path of score.py: \n",
"sys.path.insert(0, sklearn_path)\n",
"sys.path.insert(0, model_artifact_location)\n",
"\n",
"from score import load_model, predict\n",
"\n",
@@ -528,7 +552,7 @@
"mc_model = sklearn_artifact.save(project_id=os.environ['PROJECT_OCID'], \n",
" compartment_id=os.environ['NB_SESSION_COMPARTMENT_OCID'], \n",
" training_id=os.environ['NB_SESSION_OCID'],\n",
" display_name=\"sklearn-employee-attrition\",\n",
" display_name=\"attrition-model\",\n",
" ignore_introspection=False,\n",
" description=\"simple sklearn model to predict employee attrition\", \n",
" training_script_path=\"1-model-training.ipynb\", \n",
" ignore_pending_changes=True)"
@@ -554,9 +579,9 @@
"metadata": {
"celltoolbar": "Raw Cell Format",
"kernelspec": {
"display_name": "Python [conda env:tensorflow26_p37_cpu_v1]",
"display_name": "Python [conda env:tensorflow27_p37_cpu_v1]",
"language": "python",
"name": "conda-env-tensorflow26_p37_cpu_v1-py"
"name": "conda-env-tensorflow27_p37_cpu_v1-py"
},
"language_info": {
"codemirror_mode": {
@@ -568,7 +593,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.10"
"version": "3.7.12"
},
"pycharm": {
"stem_cell": {