Merge branch 'zwei' into deprecate-ecr-prefix

laurenyu committed Jul 31, 2020
2 parents cf61dba + 8b7be01 commit 987171a
Showing 60 changed files with 940 additions and 553 deletions.
74 changes: 74 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,79 @@
# Changelog

## v1.72.0 (2020-07-29)

### Features

* Neo: Add Granular Target Description support for compilation

### Documentation Changes

* Add xgboost doc on bring your own model
* fix typos on processing docs

## v1.71.1 (2020-07-27)

### Bug Fixes and Other Changes

* remove redundant information from the user_agent string.

### Testing and Release Infrastructure

* use unique model name in TFS integ tests
* use pytest-cov instead of coverage

## v1.71.0 (2020-07-23)

### Features

* Add mpi support for mxnet estimator api

### Bug Fixes and Other Changes

* use 'sagemaker' logger instead of root logger
* account for "py36" and "py37" in image tag parsing

## v1.70.2 (2020-07-22)

### Bug Fixes and Other Changes

* convert network_config in processing_config to dict

### Documentation Changes

* Add ECR URI Estimator example

## v1.70.1 (2020-07-21)

### Bug Fixes and Other Changes

* Nullable fields in processing_config

## v1.70.0 (2020-07-20)

### Features

* Add model monitor support for us-gov-west-1
* support TFS 2.2

### Bug Fixes and Other Changes

* reshape Artifacts into data frame in ExperimentsAnalytics

### Documentation Changes

* fix MXNet version info for requirements.txt support

## v1.69.0 (2020-07-09)

### Features

* Add ModelClientConfig Fields for Batch Transform

### Documentation Changes

* add KFP Processing component

## v2.0.0.rc1 (2020-07-08)

### Breaking Changes
4 changes: 2 additions & 2 deletions buildspec-unittests.yml
@@ -7,11 +7,11 @@ phases:
  - TOX_PARALLEL_NO_SPINNER=1
  - PY_COLORS=0
  - start_time=`date +%s`
- - tox -e flake8,pylint,twine,black-check
+ - tox -e flake8,pylint,twine,black-check --parallel all
  - ./ci-scripts/displaytime.sh 'flake8,pylint,twine,black-check' $start_time

  - start_time=`date +%s`
- - tox -e sphinx,doc8
+ - tox -e sphinx,doc8 --parallel all
  - ./ci-scripts/displaytime.sh 'sphinx,doc8' $start_time

  # run unit tests
4 changes: 2 additions & 2 deletions doc/amazon_sagemaker_processing.rst
@@ -10,14 +10,14 @@ Amazon SageMaker Processing allows you to run steps for data pre- or post-proces
Background
==========

-Amazon SageMaker lets developers and data scientists train and deploy machine learning models. With Amazon SageMaker Processing, you can run processing jobs on for data processing steps in your machine learning pipeline, which accept data from Amazon S3 as input, and put data into Amazon S3 as output.
+Amazon SageMaker lets developers and data scientists train and deploy machine learning models. With Amazon SageMaker Processing, you can run processing jobs for data processing steps in your machine learning pipeline. Processing jobs accept data from Amazon S3 as input and store data into Amazon S3 as output.

.. image:: ./amazon_sagemaker_processing_image1.png
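
For illustration, a minimal sketch of launching a processing job with the SDK (the processor class, role name, script name, and S3 paths here are assumptions):

.. code:: python

    from sagemaker.processing import ProcessingInput, ProcessingOutput
    from sagemaker.sklearn.processing import SKLearnProcessor

    sklearn_processor = SKLearnProcessor(framework_version='0.20.0',
                                         role='SageMakerRole',
                                         instance_type='ml.m5.xlarge',
                                         instance_count=1)

    # Inputs are downloaded from S3 into the container; outputs are uploaded back to S3.
    sklearn_processor.run(code='preprocessing.py',
                          inputs=[ProcessingInput(source='s3://my-bucket/input-data',
                                                  destination='/opt/ml/processing/input')],
                          outputs=[ProcessingOutput(source='/opt/ml/processing/output',
                                                    destination='s3://my-bucket/output-data')])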

Setup
=====

-The fastest way to run get started with Amazon SageMaker Processing is by running a Jupyter notebook. You can follow the `Getting Started with Amazon SageMaker`_ guide to start running notebooks on Amazon SageMaker.
+The fastest way to get started with Amazon SageMaker Processing is by running a Jupyter notebook. You can follow the `Getting Started with Amazon SageMaker`_ guide to start running notebooks on Amazon SageMaker.

.. _Getting Started with Amazon SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/gs.html

7 changes: 4 additions & 3 deletions doc/frameworks/mxnet/using_mxnet.rst
@@ -159,13 +159,14 @@ If there are other packages you want to use with your script, you can include a
Both ``requirements.txt`` and your training script should be put in the same folder.
You must specify this folder in ``source_dir`` argument when creating an MXNet estimator.

-The function of installing packages using ``requirements.txt`` is supported for all MXNet versions during training.
+The function of installing packages using ``requirements.txt`` is supported for MXNet versions 1.3.0 and higher during training.

When serving an MXNet model, support for this function varies with MXNet versions.
For MXNet 1.6.0 or newer, ``requirements.txt`` must be under folder ``code``.
The SageMaker MXNet Estimator automatically saves ``code`` in ``model.tar.gz`` after training (assuming you set up your script and ``requirements.txt`` correctly as stipulated in the previous paragraph).
In the case of bringing your own trained model for deployment, you must save ``requirements.txt`` under folder ``code`` in ``model.tar.gz`` yourself or specify it through ``dependencies``.
-For MXNet 1.4.1, ``requirements.txt`` is not supported for inference.
-For MXNet 0.12.1-1.3.0, ``requirements.txt`` must be in ``source_dir``.
+For MXNet 0.12.1-1.2.1, 1.4.0-1.4.1, ``requirements.txt`` is not supported for inference.
+For MXNet 1.3.0, ``requirements.txt`` must be in ``source_dir``.

A ``requirements.txt`` file is a text file that contains a list of items that are installed by using ``pip install``.
You can also specify the version of an item to install.
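
For example, a minimal sketch of wiring this together (the script name, ``source_dir`` layout, and S3 path are assumptions):

.. code:: python

    from sagemaker.mxnet import MXNet

    # The src directory contains both train.py and requirements.txt.
    mxnet_estimator = MXNet(entry_point='train.py',
                            source_dir='src',
                            role='SageMakerRole',
                            framework_version='1.6.0',
                            py_version='py3',
                            train_instance_count=1,
                            train_instance_type='ml.p2.xlarge')
    mxnet_estimator.fit('s3://my-bucket/my-training-data')
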
22 changes: 22 additions & 0 deletions doc/frameworks/tensorflow/using_tf.rst
@@ -178,6 +178,28 @@ To use Python 3.7, please specify both of the args:
Where the S3 url is a path to your training data within Amazon S3.
The constructor keyword arguments define how SageMaker runs your training script.

Specify a Docker image using an Estimator
-----------------------------------------

There are use cases, such as extending an existing pre-built Amazon SageMaker image, that require specifying a Docker image when creating an Estimator by directly providing the ECR URI instead of the Python and framework versions. For a full list of available container URIs, see `Available Deep Learning Containers Images <https://github.com/aws/deep-learning-containers/blob/master/available_images.md>`__. For more information on using Docker containers, see `Use Your Own Algorithms or Models with Amazon SageMaker <https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html>`__.

When specifying the image, you must use the ``image_name=''`` arg to replace the following arg:

- ``py_version=''``

You should still specify the ``framework_version=''`` arg because the SageMaker Python SDK accounts for differences in the images based on the version.

The following example uses the ``image_name=''`` arg to specify the container image, whose URI encodes the framework, framework version, and Python version.

.. code:: python

    from sagemaker.tensorflow import TensorFlow

    tf_estimator = TensorFlow(entry_point='tf-train.py',
                              role='SageMakerRole',
                              train_instance_count=1,
                              train_instance_type='ml.p2.xlarge',
                              image_name='763104351884.dkr.ecr.<region>.amazonaws.com/<framework>-<job type>:<framework version>-<cpu/gpu>-<python version>-ubuntu18.04',
                              script_mode=True)

For more information about the ``sagemaker.tensorflow.TensorFlow`` estimator, see `SageMaker TensorFlow Classes`_.

Call the fit Method
51 changes: 50 additions & 1 deletion doc/frameworks/xgboost/using_xgboost.rst
@@ -390,6 +390,56 @@ The function should return a byte array of data serialized to ``content_type``.
The default implementation expects ``prediction`` to be a NumPy array and can serialize the result to JSON, CSV, or NPY.
It accepts response content types of "application/json", "text/csv", and "application/x-npy".

Bring Your Own Model
--------------------

You can deploy an XGBoost model that you trained outside of SageMaker by using the Amazon SageMaker XGBoost container.
Typically, you save an XGBoost model by pickling the ``Booster`` object or calling ``booster.save_model``.
The XGBoost `built-in algorithm mode <https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html#xgboost-modes>`_
supports both a pickled ``Booster`` object and a model produced by ``booster.save_model``.
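
As an illustration, a minimal sketch of producing and packaging such a model outside of SageMaker (the toy training data and the ``xgboost-model`` file name are assumptions):

.. code:: python

    import tarfile

    import numpy as np
    import xgboost as xgb

    # Train a toy model as a stand-in for your real training code.
    dtrain = xgb.DMatrix(np.random.rand(20, 3), label=np.random.randint(0, 2, size=20))
    booster = xgb.train({'objective': 'binary:logistic'}, dtrain, num_boost_round=2)

    # Save the model and pack it into the model.tar.gz layout that SageMaker expects.
    booster.save_model('xgboost-model')
    with tarfile.open('model.tar.gz', 'w:gz') as tar:
        tar.add('xgboost-model')
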
You can also deploy an XGBoost model by using XGBoost as a framework, which gives you more flexibility.
To deploy an XGBoost model by using XGBoost as a framework, you need to:

- Write an inference script.
- Create the XGBoostModel object.

Write an Inference Script
^^^^^^^^^^^^^^^^^^^^^^^^^

You must create an inference script that implements (at least) the ``model_fn`` function, which loads the saved model.

Optionally, you can also implement ``input_fn`` and ``output_fn`` to process input and output,
and ``predict_fn`` to customize how the model server gets predictions from the loaded model.
For information about how to write an inference script, see `SageMaker XGBoost Model Server <#sagemaker-xgboost-model-server>`_.
Pass the filename of the inference script as the ``entry_point`` parameter when you create the ``XGBoostModel`` object.
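
For example, a minimal ``inference.py`` might implement only ``model_fn``; the ``xgboost-model`` file name is an assumption about how the model archive was packed:

.. code:: python

    import os

    import xgboost as xgb

    def model_fn(model_dir):
        # Load and return the model saved by booster.save_model.
        booster = xgb.Booster()
        booster.load_model(os.path.join(model_dir, 'xgboost-model'))
        return booster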

Create an XGBoostModel Object
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To create a model object, call the ``sagemaker.xgboost.model.XGBoostModel`` constructor,
and then call its ``deploy()`` method to deploy your model for inference.

.. code:: python

    from sagemaker.xgboost.model import XGBoostModel

    xgboost_model = XGBoostModel(
        model_data="s3://my-bucket/my-path/model.tar.gz",
        role="my-role",
        entry_point="inference.py",
        framework_version="1.0-1"
    )

    predictor = xgboost_model.deploy(
        instance_type='ml.c4.xlarge',
        initial_instance_count=1
    )

    # If the payload is a string in LIBSVM format, we need to change the serializer.
    predictor.serializer = str
    predictor.predict("<label> <index1>:<value1> <index2>:<value2>")

To get predictions from your deployed model, you can call the ``predict()`` method.

Host Multiple Models with Multi-Model Endpoints
-----------------------------------------------

@@ -401,7 +451,6 @@ in the AWS documentation.
For a sample notebook that uses Amazon SageMaker to deploy multiple XGBoost models to an endpoint, see the
`Multi-Model Endpoint XGBoost Sample Notebook <https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/multi_model_xgboost_home_value/xgboost_multi_model_endpoint_home_value.ipynb>`_.


*************************
SageMaker XGBoost Classes
*************************
13 changes: 12 additions & 1 deletion doc/v2.rst
@@ -94,6 +94,18 @@ Please instantiate the objects instead.
The ``update_endpoint`` argument in ``deploy()`` methods for estimators and models has been deprecated.
Please use :func:`sagemaker.predictor.Predictor.update_endpoint` instead.

``content_type`` and ``accept`` in the Predictor Constructor
------------------------------------------------------------

The ``content_type`` and ``accept`` parameters have been removed from the
following classes and methods:

- ``sagemaker.predictor.Predictor``
- ``sagemaker.estimator.Estimator.create_model``
- ``sagemaker.algorithms.AlgorithmEstimator.create_model``
- ``sagemaker.tensorflow.model.TensorFlowPredictor``

Please specify content types in a serializer or deserializer class instead.
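
For example, a sketch of the replacement pattern (the endpoint name and the choice of serializer and deserializer are assumptions):

.. code:: python

    from sagemaker.deserializers import JSONDeserializer
    from sagemaker.predictor import Predictor
    from sagemaker.serializers import CSVSerializer

    # Content types now come from the serializer and deserializer objects.
    predictor = Predictor("my-endpoint",
                          serializer=CSVSerializer(),
                          deserializer=JSONDeserializer())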

``sagemaker.content_types``
---------------------------

@@ -115,7 +127,6 @@ write MIME types as a string,
| ``CONTENT_TYPE_NPY`` | ``"application/x-npy"`` |
+-------------------------------+--------------------------------+


Require ``framework_version`` and ``py_version`` for Frameworks
===============================================================

@@ -89,6 +89,10 @@ Pipelines workflow. For more information, see \ `SageMaker
hyperparameter optimization Kubeflow Pipeline
component <https://github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker/hyperparameter_tuning>`__.

**Processing**

The Processing component enables you to submit processing jobs to Amazon SageMaker directly from a Kubeflow Pipelines workflow. For more information, see \ `SageMaker Processing Kubeflow Pipeline component <https://github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker/process>`__.

Inference components
^^^^^^^^^^^^^^^^^^^^

2 changes: 1 addition & 1 deletion setup.py
@@ -33,7 +33,7 @@ def read_version():

# Declare minimal set for installation
required_packages = [
-    "boto3>=1.13.24",
+    "boto3>=1.14.12",
    "google-pasta",
    "numpy>=1.9.0",
    "protobuf>=3.1",
34 changes: 14 additions & 20 deletions src/sagemaker/algorithm.py
@@ -16,7 +16,9 @@
import sagemaker
import sagemaker.parameter
from sagemaker import vpc_utils
+from sagemaker.deserializers import BytesDeserializer
from sagemaker.estimator import EstimatorBase
+from sagemaker.serializers import IdentitySerializer
from sagemaker.transformer import Transformer
from sagemaker.predictor import Predictor

@@ -251,37 +253,29 @@ def create_model(
        self,
        role=None,
        predictor_cls=None,
-        serializer=None,
-        deserializer=None,
-        content_type=None,
-        accept=None,
+        serializer=IdentitySerializer(),
+        deserializer=BytesDeserializer(),
        vpc_config_override=vpc_utils.VPC_CONFIG_DEFAULT,
        **kwargs
    ):
"""Create a model to deploy.

-        The serializer, deserializer, content_type, and accept arguments are
-        only used to define a default Predictor. They are ignored if an
-        explicit predictor class is passed in. Other arguments are passed
-        through to the Model class.
+        The serializer and deserializer are only used to define a default
+        Predictor. They are ignored if an explicit predictor class is passed in.
+        Other arguments are passed through to the Model class.

        Args:
            role (str): The ``ExecutionRoleArn`` IAM Role ARN for the ``Model``,
                which is also used during transform jobs. If not specified, the
                role from the Estimator will be used.
            predictor_cls (Predictor): The predictor class to use when
                deploying the model.
-            serializer (callable): Should accept a single argument, the input
-                data, and return a sequence of bytes. May provide a content_type
-                attribute that defines the endpoint request content type
-            deserializer (callable): Should accept two arguments, the result
-                data and the response content type, and return a sequence of
-                bytes. May provide a content_type attribute that defines the
-                endpoint response Accept content type.
-            content_type (str): The invocation ContentType, overriding any
-                content_type from the serializer
-            accept (str): The invocation Accept, overriding any accept from the
-                deserializer.
+            serializer (:class:`~sagemaker.serializers.BaseSerializer`): A
+                serializer object, used to encode data for an inference endpoint
+                (default: :class:`~sagemaker.serializers.IdentitySerializer`).
+            deserializer (:class:`~sagemaker.deserializers.BaseDeserializer`): A
+                deserializer object, used to decode data from an inference
+                endpoint (default: :class:`~sagemaker.deserializers.BytesDeserializer`).
            vpc_config_override (dict[str, list[str]]): Optional override for VpcConfig set on
                the model. Default: use subnets and security groups from this Estimator.
                * 'Subnets' (list[str]): List of subnet ids.
@@ -300,7 +294,7 @@ def create_model(
        if predictor_cls is None:

            def predict_wrapper(endpoint, session):
-                return Predictor(endpoint, session, serializer, deserializer, content_type, accept)
+                return Predictor(endpoint, session, serializer, deserializer)

            predictor_cls = predict_wrapper
