Updated: Minor messaging in several notebooks #78

Merged · 1 commit · Nov 27, 2017
@@ -26,7 +26,7 @@
"---\n",
"## Background\n",
"\n",
"Amazon SageMaker includes functionality to support a hosted notebook environment, distributed, managed training, and real-time, autoscaling hosting. We think it works best when all three of these services are used together, but they can also be used independently. Some use cases may only require hosting. Maybe the model was trained prior to Amazon SageMaker existing, in a different service.\n",
"Amazon SageMaker includes functionality to support a hosted notebook environment, distributed, managed training, and real-time hosting. We think it works best when all three of these services are used together, but they can also be used independently. Some use cases may only require hosting. Maybe the model was trained prior to Amazon SageMaker existing, in a different service.\n",
"\n",
"This notebook shows how to use a pre-existing model with an Amazon SageMaker Algorithm container to quickly create a hosted endpoint for that model.\n",
"\n",
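For context, a minimal sketch of the hosting flow this notebook describes, using the low-level boto3 SageMaker client. The model name, role ARN, container image URI, and S3 path are placeholders, not values from the notebook:

```python
import boto3

sm = boto3.client('sagemaker')

# Register the pre-existing artifact as a SageMaker model.
sm.create_model(
    ModelName='pretrained-model',
    ExecutionRoleArn='arn:aws:iam::123456789012:role/SageMakerRole',  # placeholder
    PrimaryContainer={
        'Image': '<algorithm-container-image-uri>',   # placeholder container URI
        'ModelDataUrl': 's3://my-bucket/model/model.tar.gz',
    },
)

# Describe the hosting fleet: instance type, initial instance count, model name.
sm.create_endpoint_config(
    EndpointConfigName='pretrained-model-config',
    ProductionVariants=[{
        'VariantName': 'AllTraffic',
        'ModelName': 'pretrained-model',
        'InstanceType': 'ml.m4.xlarge',
        'InitialInstanceCount': 1,
    }],
)

# Stand up the real-time endpoint itself.
sm.create_endpoint(
    EndpointName='pretrained-model-endpoint',
    EndpointConfigName='pretrained-model-config',
)
```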
@@ -26,7 +26,7 @@
"---\n",
"## Background\n",
"\n",
"Amazon SageMaker includes functionality to support a hosted notebook environment, distributed, serverless training, and real-time hosting. We think it works best when all three of these services are used together, but they can also be used independently. Some use cases may only require hosting. Maybe the model was trained prior to Amazon SageMaker existing, in a different service.\n",
"Amazon SageMaker includes functionality to support a hosted notebook environment, distributed, managed training, and real-time hosting. We think it works best when all three of these services are used together, but they can also be used independently. Some use cases may only require hosting. Maybe the model was trained prior to Amazon SageMaker existing, in a different service.\n",
"\n",
"This notebook shows how to use a pre-existing scikit-learn model with the Amazon SageMaker XGBoost Algorithm container to quickly create a hosted endpoint for that model.\n",
"\n",
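One plausible shape for the packaging step behind this notebook, assuming the model is an xgboost Booster trained locally and that the container unpickles a saved Booster (an assumption; the container's docs are authoritative). File names and the bucket are placeholders:

```python
import pickle
import tarfile

import boto3
import xgboost as xgb

# Train (or load) a model locally; the training data path is a placeholder.
dtrain = xgb.DMatrix('train.libsvm')
booster = xgb.train({'objective': 'binary:logistic'}, dtrain, num_boost_round=10)

# Serialize the Booster and tar it up as model.tar.gz.
with open('xgboost-model', 'wb') as f:
    pickle.dump(booster, f)
with tarfile.open('model.tar.gz', 'w:gz') as tar:
    tar.add('xgboost-model')

# Upload the artifact to S3 so create_model's ModelDataUrl can point at it.
boto3.client('s3').upload_file('model.tar.gz', 'my-bucket', 'model/model.tar.gz')
```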
@@ -370,7 +370,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's kick off our training job in SageMaker's distributed, serverless training, using the parameters we just created. Because training is serverless, we don't have to wait for our job to finish to continue, but for this case, let's setup a while loop so we can monitor the status of our training."
"Now let's kick off our training job in SageMaker's distributed, managed training, using the parameters we just created. Because training is managed, we don't have to wait for our job to finish to continue, but for this case, let's setup a while loop so we can monitor the status of our training."
]
},
{
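A rough sketch of the kick-off-and-monitor pattern this cell describes; the job name is a placeholder, and the training job itself is assumed to have been created in an earlier cell:

```python
import time

import boto3

sm = boto3.client('sagemaker')
job_name = 'xgboost-training-job'  # placeholder; job assumed created earlier

# The job runs on managed infrastructure, so we could keep working in the
# notebook; here we instead poll until it leaves the 'InProgress' state.
status = sm.describe_training_job(TrainingJobName=job_name)['TrainingJobStatus']
while status == 'InProgress':
    time.sleep(60)
    status = sm.describe_training_job(TrainingJobName=job_name)['TrainingJobStatus']
    print(status)
print('Training job ended with status:', status)
```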
@@ -441,7 +441,7 @@
"source": [
"Once we've setup a model, we can configure what our hosting endpoints should be. Here we specify:\n",
"1. EC2 instance type to use for hosting\n",
"1. Lower and upper bounds for number of instances\n",
"1. Initial number of instances\n",
"1. Our hosting model name"
]
},
@@ -17,10 +17,10 @@
"1. [Prepration](#Preparation)\n",
"1. [Data](#Data)\n",
" 1. [Exploration and Transformation](#Exploration) \n",
"1. [Training Xgboost model using Sagemaker](#Training)\n",
"1. [Training Xgboost model using SageMaker](#Training)\n",
"1. [Hosting the model](#Hosting)\n",
"1. [Evaluating the model on test samples](#Evaluation)\n",
"1. [Training a second Logistic Regression model using Sagemaker](#Linear-Model)\n",
"1. [Training a second Logistic Regression model using SageMaker](#Linear-Model)\n",
"1. [Hosting the Second model](#Hosting:Linear-Learner)\n",
"1. [Evaluating the model on test samples](#Prediction:Linear-Learner)\n",
"1. [Combining the model results](#Ensemble)\n",
@@ -35,12 +35,12 @@
"\n",
"This notebook presents an illustrative example to predict if a person makes over 50K a year based on information about their education, work-experience, geneder etc.\n",
"\n",
"* Preparing your _Sagemaker_ notebook\n",
"* Loading a dataset from S3 using Sagemaker\n",
"* Investigating and transforming the data so that it can be fed to _Sagemaker_ algorithms\n",
"* Estimating a model using Sagemaker's XGBoost (eXtreme Gradient Boosting) algorithm\n",
"* Hosting the model on Sagemaker to make on-going predictions\n",
"* Estimating a second model using Sagemaker's -linear learner method\n",
"* Preparing your _SageMaker_ notebook\n",
"* Loading a dataset from S3 using SageMaker\n",
"* Investigating and transforming the data so that it can be fed to _SageMaker_ algorithms\n",
"* Estimating a model using SageMaker's XGBoost (eXtreme Gradient Boosting) algorithm\n",
"* Hosting the model on SageMaker to make on-going predictions\n",
"* Estimating a second model using SageMaker's Linear Learner method\n",
"* Combining the predictions from both the models and evluating the combined prediction\n",
"* Generating final predictions on the test data set\n",
"\n",
@@ -85,15 +85,6 @@
"Now let's bring in the Python libraries that we'll use throughout the analysis"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!conda install -y -c conda-forge scikit-learn"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -245,7 +236,7 @@
"\n",
"## Training\n",
"\n",
"As our first training algorithm we pick `xgboost` algorithm. `xgboost` is an extremely popular, open-source package for gradient boosted trees. It is computationally powerful, fully featured, and has been successfully used in many machine learning competitions. Let's start with a simple `xgboost` model, trained using `Sagemaker's` serverless, distributed training framework.\n",
"As our first training algorithm we pick `xgboost` algorithm. `xgboost` is an extremely popular, open-source package for gradient boosted trees. It is computationally powerful, fully featured, and has been successfully used in many machine learning competitions. Let's start with a simple `xgboost` model, trained using `SageMaker's` managed, distributed training framework.\n",
"\n",
"First we'll need to specify training parameters. This includes:\n",
"1. The role to use\n",
@@ -266,7 +257,7 @@
"For csv input, right now we assume the input is separated by delimiter(automatically detect the separator by Python’s builtin sniffer tool), without a header line and also label is in the first column.\n",
"Scoring Output Format: csv.\n",
"\n",
"* Since our data is in CSV format, we will convert our dataset to the way Sagemaker's XGboost supports.\n",
"* Since our data is in CSV format, we will convert our dataset to the way SageMaker's XGboost supports.\n",
"* We will keep the target field in first column and remaining features in the next few columns\n",
"* We will remove the header line\n",
"* We will also split the data into a separate training and validation sets\n",
@@ -411,7 +402,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's kick off our training job in SageMaker's distributed, serverless training, using the parameters we just created. Because training is serverless, we don't have to wait for our job to finish to continue, but for this case, let's setup a while loop so we can monitor the status of our training."
"Now let's kick off our training job in SageMaker's distributed, managed training, using the parameters we just created. Because training is managed, we don't have to wait for our job to finish to continue, but for this case, let's setup a while loop so we can monitor the status of our training."
]
},
{
@@ -496,7 +487,7 @@
"source": [
"Once we've setup a model, we can configure what our hosting endpoints should be. Here we specify:\n",
"1. EC2 instance type to use for hosting\n",
"1. Lower and upper bounds for number of instances\n",
"1. Initial number of instances\n",
"1. Our hosting model name"
]
},
@@ -690,7 +681,7 @@
"source": [
"---\n",
"## Linear-Model\n",
"### Train a second model using Sagemaker's Linear Learner"
"### Train a second model using SageMaker's Linear Learner"
]
},
{
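A hedged sketch of what the training-parameter setup for this section might look like in the notebook's low-level boto3 style; the image URI, role ARN, S3 paths, and hyperparameter values are all placeholders rather than the notebook's actual values:

```python
# Parameters for boto3's create_training_job; hyperparameter values must be strings.
linear_training_params = {
    'TrainingJobName': 'linear-learner-job',                      # placeholder
    'RoleArn': 'arn:aws:iam::123456789012:role/SageMakerRole',    # placeholder
    'AlgorithmSpecification': {
        'TrainingImage': '<linear-learner-container-image-uri>',  # placeholder
        'TrainingInputMode': 'File',
    },
    'ResourceConfig': {
        'InstanceCount': 1,
        'InstanceType': 'ml.m4.xlarge',
        'VolumeSizeInGB': 10,
    },
    'InputDataConfig': [{
        'ChannelName': 'train',
        'DataSource': {'S3DataSource': {
            'S3DataType': 'S3Prefix',
            'S3Uri': 's3://my-bucket/sagemaker/linear/train/',
            'S3DataDistributionType': 'ShardedByS3Key',
        }},
        'ContentType': 'application/x-recordio-protobuf',
    }],
    'OutputDataConfig': {'S3OutputPath': 's3://my-bucket/sagemaker/linear/output/'},
    'HyperParameters': {
        'feature_dim': '88',                   # number of input features (illustrative)
        'predictor_type': 'binary_classifier',
        'mini_batch_size': '100',
    },
    'StoppingCondition': {'MaxRuntimeInSeconds': 3600},
}
```

These parameters would then be passed to `create_training_job` exactly as in the XGBoost case above.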
@@ -699,7 +690,7 @@
"metadata": {},
"outputs": [],
"source": [
"prefix = 'sagemaker/linear' ##subfolder inside the data bucket to be used for linear learner\n",
"prefix = 'sagemaker/linear' ##subfolder inside the data bucket to be used for Linear Learner\n",
"\n",
"data_train = pd.read_csv(\"formatted_train.csv\", sep=',', header=None) \n",
"data_test = pd.read_csv(\"formatted_test.csv\", sep=',', header=None) \n",
@@ -871,7 +862,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's kick off our training job in SageMaker's distributed, serverless training, using the parameters we just created. Because training is serverless, we don't have to wait for our job to finish to continue, but for this case, let's setup a while loop so we can monitor the status of our training."
"Now let's kick off our training job in SageMaker's distributed, managed training, using the parameters we just created. Because training is managed, we don't have to wait for our job to finish to continue, but for this case, let's setup a while loop so we can monitor the status of our training."
]
},
{
@@ -936,7 +927,7 @@
"source": [
"Once we've setup a model, we can configure what our hosting endpoints should be. Here we specify:\n",
"1. EC2 instance type to use for hosting\n",
"1. Lower and upper bounds for number of instances\n",
"1. Initial number of instances\n",
"1. Our hosting model name"
]
},
@@ -1001,7 +992,7 @@
"metadata": {},
"source": [
"### Prediction:Linear-Learner\n",
"#### Predict using Sagemaker's linear learner and evaluate the performance\n",
"#### Predict using SageMaker's Linear Learner and evaluate the performance\n",
"\n",
"Now that we have our hosted endpoint, we can generate statistical predictions from it. Let's predict on our test dataset to understand how accurate our model is on unseen samples using AUC metric."
]
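A hedged sketch of this scoring step; the endpoint name and file layout are placeholders, large payloads would need batching, and the JSON shape parsed at the end is our assumption about the container's response:

```python
import json

import boto3
import numpy as np
from sklearn.metrics import roc_auc_score

runtime = boto3.client('sagemaker-runtime')

# Labels in the first column, features in the rest (matching the earlier prep).
test = np.loadtxt('formatted_test.csv', delimiter=',')
labels, features = test[:, 0], test[:, 1:]

# Send the features as one CSV payload; real code would batch large datasets.
payload = '\n'.join(','.join(str(v) for v in row) for row in features)
response = runtime.invoke_endpoint(
    EndpointName='linear-learner-endpoint',  # placeholder
    ContentType='text/csv',
    Body=payload,
)

# Assumed response shape: {"predictions": [{"score": ..., "predicted_label": ...}]}
result = json.loads(response['Body'].read().decode())
scores = [p['score'] for p in result['predictions']]

print('AUC:', roc_auc_score(labels, scores))
```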
@@ -1202,9 +1193,9 @@
"## Extensions\n",
"\n",
"This example analyzed a relatively small dataset, but utilized SageMaker features such as,\n",
"* serverless single-machine training of XGboost model \n",
"* serverless training of Linear Learner\n",
"* highly available, autoscaling model hosting, \n",
"* managed single-machine training of XGboost model \n",
"* managed training of Linear Learner\n",
"* highly available, real-time model hosting, \n",
"* doing a batch prediction using the hosted model\n",
"* Doing an ensemble of Xgboost and Linear Learner\n",
"\n",
8 changes: 4 additions & 4 deletions under_development/ensemble_modeling/README.md
@@ -3,10 +3,10 @@
 This example notebook shows how to use multiple models from SageMaker for prediction and combine them into an ensemble prediction.
 
 It demonstrates the following:
-* Basic setup for using Sagemaker.
+* Basic setup for using SageMaker.
 * Converting datasets to the protobuf format used by the Amazon SageMaker algorithms and uploading them to a user-provided S3 bucket.
-* Training Sagemaker's Xgboost algorithm on the data set.
-* Training Sagemaker's linear-learner on the data set.
+* Training SageMaker's XGBoost algorithm on the data set.
+* Training SageMaker's Linear Learner on the data set.
 * Hosting the trained models.
 * Scoring using the trained models.
-* Combining predictions from the trained models in an ensemble.
\ No newline at end of file
+* Combining predictions from the trained models in an ensemble.
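For the protobuf bullet in the README above, a minimal sketch using the SageMaker Python SDK's recordIO-protobuf helper; the arrays and S3 locations are placeholders:

```python
import io

import boto3
import numpy as np
import sagemaker.amazon.common as smac

# Placeholder data: 100 rows of 10 float32 features with binary labels.
features = np.random.rand(100, 10).astype('float32')
labels = np.random.randint(0, 2, 100).astype('float32')

# Encode the arrays as recordIO-protobuf into an in-memory buffer.
buf = io.BytesIO()
smac.write_numpy_to_dense_tensor(buf, features, labels)
buf.seek(0)

# Upload the encoded file where a training channel can read it.
boto3.resource('s3').Bucket('my-bucket').Object(
    'sagemaker/linear/train/data').upload_fileobj(buf)
```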