
Commit

Merge pull request aws#2 from mchoi8739/e2e-fraud-detection
polish up the notebooks
aarsanjani authored Feb 10, 2021
2 parents 9689894 + 38635c7 commit 1143283
Showing 23 changed files with 1,242 additions and 449 deletions.
Original file line number Diff line number Diff line change
@@ -90,7 +90,7 @@
"metadata": {},
"outputs": [],
"source": [
"!python -m pip install shap"
"%conda install -c conda-forge shap"
]
},
{
@@ -124,7 +124,7 @@
"metadata": {},
"outputs": [],
"source": [
"automl_job_name = '<your_automl_job_name_here>'\n",
"automl_job_name = 'your-autopilot-job-that-exists'\n",
"automl_job = AutoML.attach(automl_job_name, sagemaker_session=session)\n",
"\n",
"# Endpoint name\n",
@@ -460,4 +460,4 @@
},
"nbformat": 4,
"nbformat_minor": 4
}
}
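The changed cell above swaps a placeholder for a concrete Autopilot job name and attaches to it. For context, a rough sketch of attaching to an existing Autopilot job and deploying its best candidate with the SageMaker Python SDK; the job name, endpoint name, and instance type are illustrative assumptions, and the import is deferred so the sketch stands alone:

```python
# Sketch only: attach to an existing SageMaker Autopilot (AutoML) job and
# deploy its best candidate. All names below are placeholders.
def attach_and_deploy(automl_job_name, endpoint_name):
    import sagemaker
    from sagemaker.automl.automl import AutoML

    session = sagemaker.session.Session()
    # Reconnect to a job that was created earlier (e.g. in Studio).
    automl_job = AutoML.attach(automl_job_name, sagemaker_session=session)
    # Deploy the best candidate found by the Autopilot job.
    predictor = automl_job.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.xlarge",
        endpoint_name=endpoint_name,
    )
    return predictor
```

Calling `attach_and_deploy("your-autopilot-job-that-exists", "fraud-endpoint")` would then create a real endpoint, so it is only worth running against a job that actually exists in your account.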
1 change: 1 addition & 0 deletions aws_marketplace/README.md
@@ -26,6 +26,7 @@ These examples show you how to use model-packages and algorithms from AWS Market
- [Using Algorithms](using_algorithms)
- [Using Algorithm From AWS Marketplace](using_algorithms/amazon_demo_product) provides a detailed walkthrough on how to use an algorithm with the enhanced SageMaker Train/Transform/Hosting/Tuning APIs by choosing a canonical product listed on AWS Marketplace.
- [Using AutoML algorithm](using_algorithms/automl) provides a detailed walkthrough on how to use the AutoML algorithm from AWS Marketplace.
- [Using Implicit BPR Algorithm](using_algorithms/implicit_bpr) provides a detailed walkthrough on how to build a recommender system for implicit feedback datasets, and how to train, evaluate, and host your model for batch and real-time inference.

- [Using Model Packages](using_model_packages)
- [Using Model Packages From AWS Marketplace](using_model_packages/generic_sample_notebook) is a generic notebook which provides sample code snippets you can modify and use for performing inference on Model Packages from AWS Marketplace, using Amazon SageMaker.
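Where the list above refers to using algorithms from AWS Marketplace, the general SageMaker Python SDK pattern looks roughly like the following sketch; the algorithm ARN, IAM role, S3 URIs, channel name, and instance type are placeholder assumptions, not values from this repository:

```python
# Sketch only: train with an AWS Marketplace algorithm identified by its ARN.
# Every argument shown here is a placeholder assumption.
def train_marketplace_algorithm(algorithm_arn, role, train_s3_uri, output_s3_uri):
    from sagemaker.algorithm import AlgorithmEstimator

    estimator = AlgorithmEstimator(
        algorithm_arn=algorithm_arn,   # ARN of the subscribed Marketplace algorithm
        role=role,                     # IAM role SageMaker assumes for training
        instance_count=1,
        instance_type="ml.m5.xlarge",
        output_path=output_s3_uri,
    )
    # Channel names are algorithm-specific; "training" is a common default.
    estimator.fit({"training": train_s3_uri})
    return estimator
```

The same estimator can then be used with the Transform/Hosting/Tuning APIs mentioned above, which is the point of the enhanced-API walkthrough.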
9 changes: 9 additions & 0 deletions aws_marketplace/index.rst
@@ -60,6 +60,15 @@ AutoML
using_algorithms/automl/AutoML_-_Train_multiple_models_in_parallel


Implicit BPR
------------

.. toctree::
:maxdepth: 0

using_algorithms/implicit_bpr/recommender_system_with_implicit_bpr


Use AWS Marketplace model packages
==================================

@@ -0,0 +1 @@
This folder is used for downloading and storing the original dataset, the training and test data, and the batch request payloads that will be used for further analysis and for training our model.
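As a hypothetical illustration of the layout this folder describes, the subdirectories could be created up front; the directory names below are assumptions, not taken from the repository:

```python
from pathlib import Path

# Hypothetical layout: raw download, train/test splits, and batch payloads.
BASE = Path("data")

for sub in ["raw", "train", "test", "batch_payloads"]:
    (BASE / sub).mkdir(parents=True, exist_ok=True)

print(sorted(p.name for p in BASE.iterdir()))
# → ['batch_payloads', 'raw', 'test', 'train']
```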

Large diffs are not rendered by default.

172 changes: 93 additions & 79 deletions end_to_end/0-AutoClaimFraudDetection.ipynb

Large diffs are not rendered by default.

111 changes: 39 additions & 72 deletions end_to_end/1-data-prep-e2e.ipynb
@@ -4,62 +4,54 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## SageMaker End to End Solutions : Fraud Detection for Automobile Claims\n",
"\n",
"# Part 1 : Data Prep to Feature Store "
"# Part 1 : Data Preparation, Process, and Store Features"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The purpose of this notebook is to perform the Data Prep phase of the ML-lifecycle. The main Data Wrangling, ingestion and multiple transformations have been done in the SageMaker Studio DataWrangler GUI . [See Video here](#link-to-video)\n",
"So in this notebook we will take the .flow files that define the transformations to the raw data and apply them using a SageMaker Processing job that will apply those transformations to the raw data deposited in the S3 bucket as .csv files."
"<a id='all-up-overview'></a>\n",
"\n",
"## [Overview](./0-AutoClaimFraudDetection.ipynb)\n",
"* [Notebook 0: Overview, Architecture and Data Exploration](./0-AutoClaimFraudDetection.ipynb)\n",
"* **[Notebook 1: Data Preparation, Process, and Store Features](./1-data-prep-e2e.ipynb)**\n",
" * **[Architecture](#arch)**\n",
" * **[Getting started](#aud-getting-started)**\n",
" * **[DataSets](#aud-datasets)**\n",
" * **[SageMaker Feature Store](#aud-feature-store)**\n",
" * **[Create train and test datasets](#aud-dataset)**\n",
"* [Notebook 2: Train, Check Bias, Tune, Record Lineage, and Register a Model](./2-lineage-train-assess-bias-tune-registry-e2e.ipynb)\n",
"* [Notebook 3: Mitigate Bias, Train New Model, Store in Registry](./3-mitigate-bias-train-model2-registry-e2e.ipynb)\n",
"* [Notebook 4: Deploy Model, Run Predictions](./4-deploy-run-inference-e2e.ipynb)\n",
"* [Notebook 5: Create and Run an End-to-End Pipeline to Deploy the Model](./5-pipeline-e2e.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id='all-up-overview'></a>\n",
"The purpose of this notebook is to perform the Data Prep phase of the ML life cycle. The main Data Wrangling, data ingestion, and multiple transformations will be done through the SageMaker Studio Data Wrangler GUI ([See Video here](#link-to-video)).\n",
"\n",
"## [Overview](./0-AutoClaimFraudDetection.ipynb)\n",
"* ### [Notebook 0 : Overview, Architecture and Data Exploration](./0-AutoClaimFraudDetection.ipynb)\n",
"* ### [Notebook 1: Data Prep, Process, Store Features](./1-data-prep-e2e.ipynb)\n",
" * #### [Architecture](#arch)\n",
" * #### [Getting started](#aud-getting-started)\n",
" * #### [DataSets](#aud-datasets)\n",
" * #### [SageMaker Feature Store](#aud-feature-store)\n",
" * #### [Create train and test datasets](#aud-dataset)\n",
"* ### [Notebook 2: Train, Check Bias, Tune, Record Lineage, Register Model](./2-lineage-train-assess-bias-tune-registry-e2e.ipynb)\n",
" * #### Train a model using XGBoost\n",
" * #### Model lineage with artifacts and associations\n",
" * #### Evaluate the model for bias with Clarify\n",
" * #### Deposit Model and Lineage in SageMaker Model Registry\n",
"* ### [Notebook 3: Mitigate Bias, Train New Model, Store in Registry](./3-mitigate-bias-train-model2-registry-e2e.ipynb)\n",
" * #### Train a version 2.0 model\n",
"* ### [Notebook 4: Deploy Model, Run Predictions](./4-deploy-run-inference-e2e.ipynb)\n",
" * #### Deploy an approved model and make prediction\n",
"* ### [Notebook 5 : Create and Run an end to end Pipeline to Deploy the Model]((./5-pipeline-e2e.ipynb))\n",
" * #### SageMaker Pipeline\n",
" * #### Cleanup"
"In this notebook, we will take the `.flow` files that define the transformations to the raw data, and run a SageMaker Processing job that applies those transformations to the raw data deposited in the S3 bucket as `.csv` files."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id='arch'> </a>\n",
"### Architecture for Data Prep, Process and Store Features\n",
"## Architecture for Data Prep, Process and Store Features\n",
"[overview](#all-up-overview)\n",
"___\n",
"![Data Prep and Store](./images/e2e-1-pipeline-v3b.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Loading stored variables\n",
"If you ran this notebook before, you may want to re-use the resources you aready created with AWS. Run the cell below to load any prevously created variables. You should see a print-out of the existing variables. If you don't see anything printed then it's probably the first time you are running the notebook! "
"### Install required and/or update third-party libraries"
]
},
{
@@ -68,15 +60,16 @@
"metadata": {},
"outputs": [],
"source": [
"%store -r\n",
"%store"
"!python -m pip install -Uq pip\n",
"!python -m pip install -q awswrangler==2.2.0 imbalanced-learn==0.7.0 sagemaker==2.23.1 boto3==1.16.48"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Install required and/or update third-party libraries"
"### Loading stored variables\n",
"If you ran this notebook before, you may want to re-use the resources you already created with AWS. Run the cell below to load any previously created variables. You should see a print-out of the existing variables. If you don't see anything printed, then it's probably the first time you are running the notebook!"
]
},
{
@@ -85,17 +78,15 @@
"metadata": {},
"outputs": [],
"source": [
"!python -m pip install -Uq pip\n",
"!python -m pip install -q awswrangler==2.2.0 imbalanced-learn==0.7.0 sagemaker==2.23.1 boto3==1.16.48\n"
"%store -r\n",
"%store"
]
},
{
"cell_type": "code",
"execution_count": null,
"cell_type": "markdown",
"metadata": {},
"outputs": [],
"source": [
"!python -m pip install -q --upgrade sagemaker boto3"
"**<font color='red'>Important</font>: You must have run the previous notebooks in sequence to retrieve variables using the StoreMagic command.**"
]
},
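For readers unfamiliar with StoreMagic: `%store var` persists a variable to disk, and `%store -r` reloads it in a later session, which is how these notebooks pass values between each other. A rough stdlib analogy of that behavior; the file name and function names here are assumptions for illustration only:

```python
import json
from pathlib import Path

STORE = Path("stored_vars.json")  # hypothetical backing file

def store(**variables):
    # Persist variables, merging with anything stored earlier (like `%store`).
    existing = json.loads(STORE.read_text()) if STORE.exists() else {}
    existing.update(variables)
    STORE.write_text(json.dumps(existing))

def restore():
    # Reload previously stored variables (like `%store -r`).
    return json.loads(STORE.read_text()) if STORE.exists() else {}

store(automl_job_name="example-job")
print(restore()["automl_job_name"])  # → example-job
```

Unlike this sketch, StoreMagic keeps its database per IPython profile, which is why a variable stored in one notebook is visible from the next one in the sequence.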
{
@@ -195,8 +186,7 @@
"\n",
"sagemaker_session = sagemaker.session.Session(\n",
" boto_session=boto_session,\n",
" sagemaker_client=sagemaker_boto_client)\n",
"\n"
" sagemaker_client=sagemaker_boto_client)"
]
},
{
@@ -348,9 +338,9 @@
"metadata": {},
"source": [
"<a id='aud-datasets'></a>\n",
"#### DataSets and Feature Types\n",
"\n",
"[overview](#all-up-overview)"
"## DataSets and Feature Types\n",
"[overview](#all-up-overview)\n",
"___"
]
},
{
@@ -676,15 +666,6 @@
"print('\\nData available.')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<pre>\n",
"\n",
"</pre>"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -787,15 +768,6 @@
"%store test_data_uri"
]
},
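The cell above stores the train and test dataset URIs for the later notebooks. For context, a minimal sketch of the kind of train/test split that produces such datasets; the split ratio and seed are assumptions, and a real run would use the processed claims data rather than a toy range:

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    # Shuffle a copy deterministically, then slice into train/test partitions.
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]

train, test = train_test_split(range(100))
print(len(train), len(test))  # → 80 20
```

Fixing the seed makes the split reproducible across notebook runs, which matters here because later notebooks reload the same train/test URIs via StoreMagic.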
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<pre>\n",
"\n",
"</pre>"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -815,18 +787,13 @@
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"cell_type": "markdown",
"metadata": {},
"outputs": [],
"source": []
"source": [
"___\n",
"\n",
"### Next Notebook: [Train, Check Bias, Tune, Record Lineage, Register Model](./2-lineage-train-assess-bias-tune-registry-e2e.ipynb)"
]
},
{
"cell_type": "code",