Merge branch 'zwei' into deprecate-ecr-prefix

laurenyu committed Jul 31, 2020
2 parents cf61dba + 8b7be01 commit 987171a
Showing 60 changed files with 940 additions and 553 deletions.
74 changes: 74 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,79 @@
# Changelog

## v1.72.0 (2020-07-29)

### Features

* Neo: Add Granular Target Description support for compilation

### Documentation Changes

* Add xgboost doc on bring your own model
* fix typos on processing docs

## v1.71.1 (2020-07-27)

### Bug Fixes and Other Changes

* remove redundant information from the user_agent string.

### Testing and Release Infrastructure

* use unique model name in TFS integ tests
* use pytest-cov instead of coverage

## v1.71.0 (2020-07-23)

### Features

* Add mpi support for mxnet estimator api

### Bug Fixes and Other Changes

* use 'sagemaker' logger instead of root logger
* account for "py36" and "py37" in image tag parsing

## v1.70.2 (2020-07-22)

### Bug Fixes and Other Changes

* convert network_config in processing_config to dict

### Documentation Changes

* Add ECR URI Estimator example

## v1.70.1 (2020-07-21)

### Bug Fixes and Other Changes

* Nullable fields in processing_config

## v1.70.0 (2020-07-20)

### Features

* Add model monitor support for us-gov-west-1
* support TFS 2.2

### Bug Fixes and Other Changes

* reshape Artifacts into data frame in ExperimentsAnalytics

### Documentation Changes

* fix MXNet version info for requirements.txt support

## v1.69.0 (2020-07-09)

### Features

* Add ModelClientConfig Fields for Batch Transform

### Documentation Changes

* add KFP Processing component

## v2.0.0.rc1 (2020-07-08)

### Breaking Changes
4 changes: 2 additions & 2 deletions buildspec-unittests.yml
@@ -7,11 +7,11 @@ phases:
  - TOX_PARALLEL_NO_SPINNER=1
  - PY_COLORS=0
  - start_time=`date +%s`
- - tox -e flake8,pylint,twine,black-check
+ - tox -e flake8,pylint,twine,black-check --parallel all
  - ./ci-scripts/displaytime.sh 'flake8,pylint,twine,black-check' $start_time

  - start_time=`date +%s`
- - tox -e sphinx,doc8
+ - tox -e sphinx,doc8 --parallel all
  - ./ci-scripts/displaytime.sh 'sphinx,doc8' $start_time

  # run unit tests
4 changes: 2 additions & 2 deletions doc/amazon_sagemaker_processing.rst
@@ -10,14 +10,14 @@ Amazon SageMaker Processing allows you to run steps for data pre- or post-proces
Background
==========

-Amazon SageMaker lets developers and data scientists train and deploy machine learning models. With Amazon SageMaker Processing, you can run processing jobs on for data processing steps in your machine learning pipeline, which accept data from Amazon S3 as input, and put data into Amazon S3 as output.
+Amazon SageMaker lets developers and data scientists train and deploy machine learning models. With Amazon SageMaker Processing, you can run processing jobs for data processing steps in your machine learning pipeline. Processing jobs accept data from Amazon S3 as input and store data into Amazon S3 as output.

.. image:: ./amazon_sagemaker_processing_image1.png
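
For illustration, a minimal sketch of launching a processing job with the SDK (the processor class, role name, script name, and S3 paths here are assumptions):

.. code:: python

    from sagemaker.processing import ProcessingInput, ProcessingOutput
    from sagemaker.sklearn.processing import SKLearnProcessor

    sklearn_processor = SKLearnProcessor(framework_version='0.20.0',
                                         role='SageMakerRole',
                                         instance_type='ml.m5.xlarge',
                                         instance_count=1)

    # Inputs are downloaded from S3 into the container; outputs are uploaded back to S3.
    sklearn_processor.run(code='preprocessing.py',
                          inputs=[ProcessingInput(source='s3://my-bucket/input-data',
                                                  destination='/opt/ml/processing/input')],
                          outputs=[ProcessingOutput(source='/opt/ml/processing/output',
                                                    destination='s3://my-bucket/output-data')])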

Setup
=====

-The fastest way to run get started with Amazon SageMaker Processing is by running a Jupyter notebook. You can follow the `Getting Started with Amazon SageMaker`_ guide to start running notebooks on Amazon SageMaker.
+The fastest way to get started with Amazon SageMaker Processing is by running a Jupyter notebook. You can follow the `Getting Started with Amazon SageMaker`_ guide to start running notebooks on Amazon SageMaker.

.. _Getting Started with Amazon SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/gs.html

7 changes: 4 additions & 3 deletions doc/frameworks/mxnet/using_mxnet.rst
@@ -159,13 +159,14 @@ If there are other packages you want to use with your script, you can include a
Both ``requirements.txt`` and your training script should be put in the same folder.
You must specify this folder in ``source_dir`` argument when creating an MXNet estimator.

-The function of installing packages using ``requirements.txt`` is supported for all MXNet versions during training.
+The function of installing packages using ``requirements.txt`` is supported for MXNet versions 1.3.0 and higher during training.

When serving an MXNet model, support for this function varies with MXNet versions.
For MXNet 1.6.0 or newer, ``requirements.txt`` must be under folder ``code``.
The SageMaker MXNet Estimator automatically saves ``code`` in ``model.tar.gz`` after training (assuming you set up your script and ``requirements.txt`` correctly as stipulated in the previous paragraph).
In the case of bringing your own trained model for deployment, you must save ``requirements.txt`` under folder ``code`` in ``model.tar.gz`` yourself or specify it through ``dependencies``.
-For MXNet 1.4.1, ``requirements.txt`` is not supported for inference.
-For MXNet 0.12.1-1.3.0, ``requirements.txt`` must be in ``source_dir``.
+For MXNet 0.12.1-1.2.1, 1.4.0-1.4.1, ``requirements.txt`` is not supported for inference.
+For MXNet 1.3.0, ``requirements.txt`` must be in ``source_dir``.

A ``requirements.txt`` file is a text file that contains a list of items that are installed by using ``pip install``.
You can also specify the version of an item to install.
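
For example, a minimal sketch of wiring this together (the script name, ``source_dir`` layout, and S3 path are assumptions):

.. code:: python

    from sagemaker.mxnet import MXNet

    # The src directory contains both train.py and requirements.txt.
    mxnet_estimator = MXNet(entry_point='train.py',
                            source_dir='src',
                            role='SageMakerRole',
                            framework_version='1.6.0',
                            py_version='py3',
                            train_instance_count=1,
                            train_instance_type='ml.p2.xlarge')
    mxnet_estimator.fit('s3://my-bucket/my-training-data')
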
22 changes: 22 additions & 0 deletions doc/frameworks/tensorflow/using_tf.rst
@@ -178,6 +178,28 @@ To use Python 3.7, please specify both of the args:
Where the S3 url is a path to your training data within Amazon S3.
The constructor keyword arguments define how SageMaker runs your training script.

Specify a Docker image using an Estimator
-----------------------------------------

There are use cases, such as extending an existing pre-built Amazon SageMaker image, that require specifying a Docker image when creating an Estimator by directly providing the ECR URI instead of the Python and framework versions. For a full list of available container URIs, see `Available Deep Learning Containers Images <https://github.com/aws/deep-learning-containers/blob/master/available_images.md>`__. For more information on using Docker containers, see `Use Your Own Algorithms or Models with Amazon SageMaker <https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html>`__.

When specifying the image, you must use the ``image_name=''`` arg to replace the following arg:

- ``py_version=''``

You should still specify the ``framework_version=''`` arg because the SageMaker Python SDK accounts for differences in the images based on the version.

The following example uses the ``image_name=''`` arg to specify the container image, whose URI encodes the framework, framework version, and Python version.

.. code:: python

    from sagemaker.tensorflow import TensorFlow

    tf_estimator = TensorFlow(entry_point='tf-train.py',
                              role='SageMakerRole',
                              train_instance_count=1,
                              train_instance_type='ml.p2.xlarge',
                              image_name='763104351884.dkr.ecr.<region>.amazonaws.com/<framework>-<job type>:<framework version>-<cpu/gpu>-<python version>-ubuntu18.04',
                              script_mode=True)

For more information about the ``sagemaker.tensorflow.TensorFlow`` estimator, see `SageMaker TensorFlow Classes`_.

Call the fit Method
51 changes: 50 additions & 1 deletion doc/frameworks/xgboost/using_xgboost.rst
@@ -390,6 +390,56 @@ The function should return a byte array of data serialized to ``content_type``.
The default implementation expects ``prediction`` to be a NumPy array and can serialize the result to JSON, CSV, or NPY.
It accepts response content types of "application/json", "text/csv", and "application/x-npy".

Bring Your Own Model
--------------------

You can deploy an XGBoost model that you trained outside of SageMaker by using the Amazon SageMaker XGBoost container.
Typically, you save an XGBoost model by pickling the ``Booster`` object or calling ``booster.save_model``.
The XGBoost `built-in algorithm mode <https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html#xgboost-modes>`_
supports both a pickled ``Booster`` object and a model produced by ``booster.save_model``.
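
As an illustration, a minimal sketch of producing and packaging such a model outside of SageMaker (the toy training data and the ``xgboost-model`` file name are assumptions):

.. code:: python

    import tarfile

    import numpy as np
    import xgboost as xgb

    # Train a toy model as a stand-in for your real training code.
    dtrain = xgb.DMatrix(np.random.rand(20, 3), label=np.random.randint(0, 2, size=20))
    booster = xgb.train({'objective': 'binary:logistic'}, dtrain, num_boost_round=2)

    # Save the model and pack it into the model.tar.gz layout that SageMaker expects.
    booster.save_model('xgboost-model')
    with tarfile.open('model.tar.gz', 'w:gz') as tar:
        tar.add('xgboost-model')
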
You can also deploy an XGBoost model by using XGBoost as a framework, which gives you more flexibility.
To deploy an XGBoost model by using XGBoost as a framework, you need to:

- Write an inference script.
- Create the XGBoostModel object.

Write an Inference Script
^^^^^^^^^^^^^^^^^^^^^^^^^

You must create an inference script that implements (at least) the ``model_fn`` function, which loads the saved model.

Optionally, you can also implement ``input_fn`` and ``output_fn`` to process input and output,
and ``predict_fn`` to customize how the model server gets predictions from the loaded model.
For information about how to write an inference script, see `SageMaker XGBoost Model Server <#sagemaker-xgboost-model-server>`_.
Pass the filename of the inference script as the ``entry_point`` parameter when you create the ``XGBoostModel`` object.
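
For example, a minimal ``inference.py`` might implement only ``model_fn``; the ``xgboost-model`` file name is an assumption about how the model archive was packed:

.. code:: python

    import os

    import xgboost as xgb

    def model_fn(model_dir):
        # Load and return the model saved by booster.save_model.
        booster = xgb.Booster()
        booster.load_model(os.path.join(model_dir, 'xgboost-model'))
        return booster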

Create an XGBoostModel Object
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To create a model object, call the ``sagemaker.xgboost.model.XGBoostModel`` constructor,
and then call its ``deploy()`` method to deploy your model for inference.

.. code:: python

    from sagemaker.xgboost.model import XGBoostModel

    xgboost_model = XGBoostModel(
        model_data="s3://my-bucket/my-path/model.tar.gz",
        role="my-role",
        entry_point="inference.py",
        framework_version="1.0-1"
    )

    predictor = xgboost_model.deploy(
        instance_type='ml.c4.xlarge',
        initial_instance_count=1
    )

    # If the payload is a string in LIBSVM format, we need to change the serializer.
    predictor.serializer = str
    predictor.predict("<label> <index1>:<value1> <index2>:<value2>")

To get predictions from your deployed model, you can call the ``predict()`` method.

Host Multiple Models with Multi-Model Endpoints
-----------------------------------------------

@@ -401,7 +451,6 @@ in the AWS documentation.
For a sample notebook that uses Amazon SageMaker to deploy multiple XGBoost models to an endpoint, see the
`Multi-Model Endpoint XGBoost Sample Notebook <https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/multi_model_xgboost_home_value/xgboost_multi_model_endpoint_home_value.ipynb>`_.


*************************
SageMaker XGBoost Classes
*************************
13 changes: 12 additions & 1 deletion doc/v2.rst
@@ -94,6 +94,18 @@ Please instantiate the objects instead.
The ``update_endpoint`` argument in ``deploy()`` methods for estimators and models has been deprecated.
Please use :func:`sagemaker.predictor.Predictor.update_endpoint` instead.

``content_type`` and ``accept`` in the Predictor Constructor
------------------------------------------------------------

The ``content_type`` and ``accept`` parameters have been removed from the
following classes and methods:

- ``sagemaker.predictor.Predictor``
- ``sagemaker.estimator.Estimator.create_model``
- ``sagemaker.algorithms.AlgorithmEstimator.create_model``
- ``sagemaker.tensorflow.model.TensorFlowPredictor``

Please specify content types in a serializer or deserializer class instead.
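
For example, a sketch of the replacement pattern (the endpoint name and the choice of serializer and deserializer are assumptions):

.. code:: python

    from sagemaker.deserializers import JSONDeserializer
    from sagemaker.predictor import Predictor
    from sagemaker.serializers import CSVSerializer

    # Content types now come from the serializer and deserializer objects.
    predictor = Predictor("my-endpoint",
                          serializer=CSVSerializer(),
                          deserializer=JSONDeserializer())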

``sagemaker.content_types``
---------------------------

@@ -115,7 +127,6 @@ write MIME types as a string,
| ``CONTENT_TYPE_NPY`` | ``"application/x-npy"`` |
+-------------------------------+--------------------------------+


Require ``framework_version`` and ``py_version`` for Frameworks
===============================================================

@@ -89,6 +89,10 @@ Pipelines workflow. For more information, see \ `SageMaker
hyperparameter optimization Kubeflow Pipeline
component <https://github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker/hyperparameter_tuning>`__.

**Processing**

The Processing component enables you to submit processing jobs to Amazon SageMaker directly from a Kubeflow Pipelines workflow. For more information, see \ `SageMaker Processing Kubeflow Pipeline component <https://github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker/process>`__.

Inference components
^^^^^^^^^^^^^^^^^^^^

2 changes: 1 addition & 1 deletion setup.py
@@ -33,7 +33,7 @@ def read_version():

# Declare minimal set for installation
required_packages = [
-    "boto3>=1.13.24",
+    "boto3>=1.14.12",
    "google-pasta",
    "numpy>=1.9.0",
    "protobuf>=3.1",
34 changes: 14 additions & 20 deletions src/sagemaker/algorithm.py
@@ -16,7 +16,9 @@
import sagemaker
import sagemaker.parameter
from sagemaker import vpc_utils
+from sagemaker.deserializers import BytesDeserializer
from sagemaker.estimator import EstimatorBase
+from sagemaker.serializers import IdentitySerializer
from sagemaker.transformer import Transformer
from sagemaker.predictor import Predictor

@@ -251,37 +253,29 @@ def create_model(
        self,
        role=None,
        predictor_cls=None,
-        serializer=None,
-        deserializer=None,
-        content_type=None,
-        accept=None,
+        serializer=IdentitySerializer(),
+        deserializer=BytesDeserializer(),
        vpc_config_override=vpc_utils.VPC_CONFIG_DEFAULT,
        **kwargs
    ):
"""Create a model to deploy.

-        The serializer, deserializer, content_type, and accept arguments are
-        only used to define a default Predictor. They are ignored if an
-        explicit predictor class is passed in. Other arguments are passed
-        through to the Model class.
+        The serializer and deserializer are only used to define a default
+        Predictor. They are ignored if an explicit predictor class is passed in.
+        Other arguments are passed through to the Model class.

        Args:
            role (str): The ``ExecutionRoleArn`` IAM Role ARN for the ``Model``,
                which is also used during transform jobs. If not specified, the
                role from the Estimator will be used.
            predictor_cls (Predictor): The predictor class to use when
                deploying the model.
-            serializer (callable): Should accept a single argument, the input
-                data, and return a sequence of bytes. May provide a content_type
-                attribute that defines the endpoint request content type
-            deserializer (callable): Should accept two arguments, the result
-                data and the response content type, and return a sequence of
-                bytes. May provide a content_type attribute that defines the
-                endpoint response Accept content type.
-            content_type (str): The invocation ContentType, overriding any
-                content_type from the serializer
-            accept (str): The invocation Accept, overriding any accept from the
-                deserializer.
+            serializer (:class:`~sagemaker.serializers.BaseSerializer`): A
+                serializer object, used to encode data for an inference endpoint
+                (default: :class:`~sagemaker.serializers.IdentitySerializer`).
+            deserializer (:class:`~sagemaker.deserializers.BaseDeserializer`): A
+                deserializer object, used to decode data from an inference
+                endpoint (default: :class:`~sagemaker.deserializers.BytesDeserializer`).
            vpc_config_override (dict[str, list[str]]): Optional override for VpcConfig set on
                the model. Default: use subnets and security groups from this Estimator.
                * 'Subnets' (list[str]): List of subnet ids.
@@ -300,7 +294,7 @@ def create_model(
        if predictor_cls is None:

            def predict_wrapper(endpoint, session):
-                return Predictor(endpoint, session, serializer, deserializer, content_type, accept)
+                return Predictor(endpoint, session, serializer, deserializer)

            predictor_cls = predict_wrapper
