Merge pull request #235 from microsoft/master

merge master
SparkSnail · Mar 17, 2020 · 75028bd
2 parents 1d74ae5 + 2e42d1d
commit 75028bd

Showing 94 changed files with 14,581 additions and 970 deletions.
diff --git a/Makefile b/Makefile
@@ -70,6 +70,8 @@ build:
 	cp -rf src/nni_manager/config src/nni_manager/dist/
 	#$(_INFO) Building WebUI $(_END)
 	cd src/webui && $(NNI_YARN) && $(NNI_YARN) build
+	#$(_INFO) Building NAS UI $(_END)
+	cd src/nasui && $(NNI_YARN) && $(NNI_YARN) build
 
 # All-in-one target for non-expert users
 # Installs NNI as well as its dependencies, and update bashrc to set PATH

diff --git a/README.md b/README.md
@@ -22,13 +22,13 @@ The tool manages automated machine learning (AutoML) experiments, **dispatches a
 
 * Those who want to **try different AutoML algorithms** in their training code/model.
 * Those who want to run AutoML trial jobs **in different environments** to speed up search.
-* Researchers and data scientists who want to easily **implement and experiement new AutoML algorithms**, may it be: hyperparameter tuning algorithm, neural architect search algorithm or model compression algorithm.
+* Researchers and data scientists who want to easily **implement and experiment new AutoML algorithms**, may it be: hyperparameter tuning algorithm, neural architect search algorithm or model compression algorithm.
 * ML Platform owners who want to **support AutoML in their platform**.
 
 ### **NNI v1.4 has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**
 
 ## **NNI capabilities in a glance**
-NNI provides CommandLine Tool as well as an user friendly WebUI to manage training experiements. With the extensible API, you can customize your own AutoML algorithms and training services. To make it easy for new users, NNI also provides a set of build-in stat-of-the-art AutoML algorithms and out of box support for popular training platforms. 
+NNI provides CommandLine Tool as well as an user friendly WebUI to manage training experiments. With the extensible API, you can customize your own AutoML algorithms and training services. To make it easy for new users, NNI also provides a set of build-in stat-of-the-art AutoML algorithms and out of box support for popular training platforms. 
 
 Within the following table, we summarized the current NNI capabilities, we are gradually adding new capabilities and we'd love to have your contribution.
 

diff --git a/azure-pipelines.yml b/azure-pipelines.yml
@@ -77,7 +77,7 @@ jobs:
 
 - job: 'basic_test_pr_macOS'
   pool:
-    vmImage: 'macOS 10.13'
+    vmImage: 'macOS-10.15'
   strategy:
     matrix:
       Python36:
@@ -94,8 +94,8 @@ jobs:
       python3 -m pip install torch==1.2.0 --user
       python3 -m pip install torchvision==0.4.0 --user
       python3 -m pip install tensorflow==1.13.1 --user
-      ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" < /dev/null 2> /dev/null
       brew install swig@3
+      rm /usr/local/bin/swig
       ln -s /usr/local/opt/swig\@3/bin/swig /usr/local/bin/swig
       nnictl package install --name=SMAC
     displayName: 'Install dependencies'

diff --git a/deployment/deployment-pipelines.yml b/deployment/deployment-pipelines.yml
@@ -88,7 +88,7 @@ jobs:
         export IMG_NAME=$(dev_docker_img)
         export IMG_TAG=`git describe --tags --abbrev=0`.`date -u +%y%m%d%H%M`
         echo 'updating docker file for testpyi...'
-        sed -ie 's/RUN python3 -m pip --no-cache-dir install nni/RUN python3 -m pip install --user --no-cache-dir --index-url https:\/\/test.pypi.org\/simple --extra-index-url https:\/\/pypi.org\/simple nni/' Dockerfile
+        sed -ie 's/RUN python3 -m pip --no-cache-dir install nni/RUN python3 -m pip install --no-cache-dir --index-url https:\/\/test.pypi.org\/simple --extra-index-url https:\/\/pypi.org\/simple nni/' Dockerfile
       else
         docker login -u $(docker_hub_user) -p $(docker_hub_pwd)
         export IMG_NAME=msranni/nni

diff --git a/docs/en_US/AdvancedFeature/MultiPhase.md b/docs/en_US/AdvancedFeature/MultiPhase.md
@@ -1,3 +1,5 @@
+# Multi-phase
+
 ## What is multi-phase experiment
 
 Typically each trial job gets a single configuration (e.g., hyperparameters) from tuner, tries this configuration and reports result, then exits. But sometimes a trial job may wants to request multiple configurations from tuner. We find this is a very compelling feature. For example:

diff --git a/docs/en_US/Compressor/Framework.md b/docs/en_US/Compressor/Framework.md
@@ -0,0 +1,104 @@
+# Design Doc
+
+## Overview
+The model compression framework has two main components: `pruner` and `module wrapper`.
+
+### pruner
+A `pruner` is responsible for:
+1. providing a `calc_mask` method that calculates masks for weights and biases.
+2. replacing modules with `module wrapper`s based on the config.
+3. modifying the optimizer so that the `calc_mask` method is called every time the `step` method is called.
+
+### module wrapper
+A `module wrapper` is a module containing:
+1. the original module
+2. some buffers used by `calc_mask`
+3. a new forward method that applies masks before running the original forward method.
+
+The reasons to use a `module wrapper`:
+1. some buffers are needed by `calc_mask` to calculate masks, and these buffers should be registered in the `module wrapper` so that the original modules are not contaminated.
+2. a new `forward` method is needed to apply masks to the weights before calling the real `forward` method.
+
+## How it works
+A basic pruner usage:
+```python
+configure_list = [{
+    'sparsity': 0.7,
+    'op_types': ['BatchNorm2d'],
+}]
+
+optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=1e-4)
+pruner = SlimPruner(model, configure_list, optimizer)
+model = pruner.compress()
+```
+
+A pruner receives the model, config and optimizer as arguments. In the `__init__` method, the `step` method of the optimizer is replaced with a new `step` method that calls `calc_mask`. Also, every module is checked against the config to decide whether it needs to be pruned. If a module needs to be pruned, it is replaced by a `module wrapper`. Afterward, the new model and new optimizer are returned, which can be trained as before. The `compress` method calculates the default masks.
+
+## Implement a new pruning algorithm
+Implementing a new pruning algorithm requires implementing a new `pruner` class, which should subclass `Pruner` and override the `calc_mask` method. `calc_mask` is called by the `optimizer.step` method.
+The `Pruner` base class provides the basic functionality listed above, for example, replacing modules and patching the optimizer.
+
+A basic pruner looks like this:
+```python
+class NewPruner(Pruner):
+    def __init__(self, model, config_list, optimizer):
+        super().__init__(model, config_list, optimizer)
+        # do some initialization
+
+    def calc_mask(self, wrapper, **kwargs):
+        # do something to calculate weight_mask
+        wrapper.weight_mask = weight_mask
+```
+### Set wrapper attribute
+Sometimes `calc_mask` must save some state data. Users can call the `set_wrappers_attribute` API to register attributes, just like how buffers are registered in PyTorch modules. These buffers are registered on the `module wrapper`, and users can access them through the `module wrapper`.
+
+```python
+class NewPruner(Pruner):
+    def __init__(self, model, config_list, optimizer):
+        super().__init__(model, config_list, optimizer)
+        self.set_wrappers_attribute("if_calculated", False)
+
+    def calc_mask(self, wrapper):
+        # do something to calculate weight_mask
+        if wrapper.if_calculated:
+            pass
+        else:
+            wrapper.if_calculated = True
+            # update masks
+```
+
+### Collect data during forward
+Sometimes users want to collect some data during the modules' forward method, for example, the mean value of the activation. To do so, users can add a customized collector to the module.
+
+```python
+class ActivationRankFilterPruner(Pruner):
+    def __init__(self, model, config_list, optimizer, activation='relu', statistics_batch_num=1):
+        super().__init__(model, config_list, optimizer)
+        self.set_wrappers_attribute("if_calculated", False)
+        self.set_wrappers_attribute("collected_activation", [])
+        self.statistics_batch_num = statistics_batch_num
+
+        def collector(module_, input_, output):
+            if len(module_.collected_activation) < self.statistics_batch_num:
+                module_.collected_activation.append(self.activation(output.detach().cpu()))
+        self.add_activation_collector(collector)
+        assert activation in ['relu', 'relu6']
+        if activation == 'relu':
+            self.activation = torch.nn.functional.relu
+        elif activation == 'relu6':
+            self.activation = torch.nn.functional.relu6
+        else:
+            self.activation = None
+```
+The collector function will be called each time the forward method runs.
+
+Users can also remove this collector like this:
+```python
+collector_id = self.add_activation_collector(collector)
+# ...
+self.remove_activation_collector(collector_id)
+```
+
+### Multi-GPU support
+During multi-GPU training, buffers and parameters are copied to each GPU every time the `forward` method runs. If buffers and parameters are updated in the `forward` method, an in-place update is needed to ensure the update takes effect.
+Since `calc_mask` is called in the `optimizer.step` method, which happens after the `forward` method and only on one GPU, it supports multi-GPU training naturally.
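
The mechanism described in Framework.md above (replacing `optimizer.step` with a version that also triggers `calc_mask`, and applying masks inside a wrapper's forward method) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not NNI's implementation: `ToyOptimizer`, `ToyWrapper`, `ToyPruner` and the magnitude-based pruning rule are hypothetical stand-ins, and no PyTorch is involved.

```python
class ToyOptimizer:
    """Hypothetical stand-in for an optimizer; it only counts step() calls."""

    def __init__(self):
        self.steps = 0

    def step(self):
        self.steps += 1


class ToyWrapper:
    """Hypothetical wrapper: keeps the weights and a mask applied in forward()."""

    def __init__(self, name, weight):
        self.name = name
        self.weight = list(weight)
        self.weight_mask = [1.0] * len(self.weight)  # default mask keeps everything

    def forward(self, x):
        # Apply the mask first, then run the "real" computation (a dot product).
        masked = [w * m for w, m in zip(self.weight, self.weight_mask)]
        return sum(w * xi for w, xi in zip(masked, x))


class ToyPruner:
    """Hypothetical pruner: patches optimizer.step so calc_mask runs after it."""

    def __init__(self, wrappers, optimizer, sparsity):
        self.wrappers = wrappers
        self.sparsity = sparsity
        original_step = optimizer.step  # keep the bound original method

        def patched_step():
            original_step()
            for wrapper in self.wrappers:
                self.calc_mask(wrapper)

        optimizer.step = patched_step  # instance attribute shadows the method

    def calc_mask(self, wrapper):
        # Assumed rule for illustration: zero out the smallest-magnitude weights.
        n_prune = int(len(wrapper.weight) * self.sparsity)
        order = sorted(range(len(wrapper.weight)),
                       key=lambda i: abs(wrapper.weight[i]))
        pruned = set(order[:n_prune])
        wrapper.weight_mask = [0.0 if i in pruned else 1.0
                               for i in range(len(wrapper.weight))]


wrapper = ToyWrapper("fc", [0.1, -2.0, 0.05, 3.0])
optimizer = ToyOptimizer()
pruner = ToyPruner([wrapper], optimizer, sparsity=0.5)

before = wrapper.forward([1.0, 1.0, 1.0, 1.0])  # all weights still active
optimizer.step()                                # also recomputes the mask
after = wrapper.forward([1.0, 1.0, 1.0, 1.0])   # two smallest weights masked
print(before, after, wrapper.weight_mask)
```

Because the mask is only recomputed inside the patched `step`, the forward pass never writes to shared state, which mirrors the multi-GPU argument made in the document.
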
diff --git a/docs/en_US/Compressor/Overview.md b/docs/en_US/Compressor/Overview.md
@@ -3,7 +3,7 @@ As larger neural networks with more layers and nodes are considered, reducing th
 
 We are glad to introduce model compression toolkit on top of NNI, it's still in the experiment phase which might evolve based on usage feedback. We'd like to invite you to use, feedback and even contribute.
 
-NNI provides an easy-to-use toolkit to help user design and use compression algorithms. It currently supports PyTorch with unified interface. For users to compress their models, they only need to add several lines in their code. There are some popular model compression algorithms built-in in NNI. Users could further use NNI's auto tuning power to find the best compressed model, which is detailed in [Auto Model Compression](./AutoCompression.md). On the other hand, users could easily customize their new compression algorithms using NNI's interface, refer to the tutorial [here](#customize-new-compression-algorithms).
+NNI provides an easy-to-use toolkit to help user design and use compression algorithms. It currently supports PyTorch with unified interface. For users to compress their models, they only need to add several lines in their code. There are some popular model compression algorithms built-in in NNI. Users could further use NNI's auto tuning power to find the best compressed model, which is detailed in [Auto Model Compression](./AutoCompression.md). On the other hand, users could easily customize their new compression algorithms using NNI's interface, refer to the tutorial [here](#customize-new-compression-algorithms). Details about how the model compression framework works can be found [here](./Framework.md).
 
 For a survey of model compression, you can refer to this paper: [Recent Advances in Efficient Computation of Deep Convolutional Neural Networks](https://arxiv.org/pdf/1802.00939.pdf).
 

diff --git a/docs/en_US/Compressor/QuickStart.md b/docs/en_US/Compressor/QuickStart.md
@@ -1,6 +1,6 @@
 # Quick Start to Compress a Model
 
-NNI provides very simple APIs for compressing a model. The compression includes pruning algorithms and quantization algorithms. The usage of them are the same, thus, here we use slim pruner as an example to show the usage. The complete code of this example can be found [here](https://github.com/microsoft/nni/blob/master/examples/model_compress/slim_torch_cifar10.py).
+NNI provides very simple APIs for compressing a model. The compression includes pruning algorithms and quantization algorithms. The usage of them are the same, thus, here we use slim pruner as an example to show the usage.
 
 ## Write configuration
 
@@ -34,6 +34,8 @@ After training, you get accuracy of the pruned model. You can export model weigh
 pruner.export_model(model_path='pruned_vgg19_cifar10.pth', mask_path='mask_vgg19_cifar10.pth')
 ```
 
+The complete code of model compression examples can be found [here](https://github.com/microsoft/nni/blob/master/examples/model_compress/model_prune_torch.py).
+
 ## Speed up the model
 
 Masks do not provide real speedup of your model. The model should be speeded up based on the exported masks, thus, we provide an API to speed up your model as shown below. After invoking `apply_compression_results` on your model, your model becomes a smaller one with shorter inference latency.

diff --git a/docs/en_US/NAS/Advanced.md b/docs/en_US/NAS/Advanced.md
@@ -31,7 +31,7 @@ To demonstrate what mutators are for, we need to know how one-shot NAS normally
 
 Finally, mutators provide a method called `mutator.export()` that export a dict with architectures to the model. Note that currently this dict this a mapping from keys of mutables to tensors of selection. So in order to dump to json, users need to convert the tensors explicitly into python list.
 
-Meanwhile, NNI provides some useful tools so that users can implement trainers more easily. See [Trainers](./NasReference.md#trainers) for details.
+Meanwhile, NNI provides some useful tools so that users can implement trainers more easily. See [Trainers](./NasReference.md) for details.
 
 ## Implement New Mutators
 

diff --git a/docs/en_US/NAS/NasGuide.md b/docs/en_US/NAS/NasGuide.md
@@ -71,7 +71,7 @@ Input choice can be thought of as a callable module that receives a list of tens
 
 `LayerChoice` and `InputChoice` are both **mutables**. Mutable means "changeable". As opposed to traditional deep learning layers/modules which have fixed operation type once defined, models with mutables are essentially a series of possible models.
 
-Users can specify a **key** for each mutable. By default NNI will assign one for you that is globally unique, but in case users want to share choices (for example, there are two `LayerChoice` with the same candidate operations, and you want them to have the same choice, i.e., if first one chooses the i-th op, the second one also chooses the i-th op), they can give them the same key. The key marks the identity for this choice, and will be used in dumped checkpoint. So if you want to increase the readability of your exported architecture, manually assigning keys to each mutable would be a good idea. For advanced usage on mutables, see [Mutables](./NasReference.md#mutables).
+Users can specify a **key** for each mutable. By default NNI will assign one for you that is globally unique, but in case users want to share choices (for example, there are two `LayerChoice` with the same candidate operations, and you want them to have the same choice, i.e., if first one chooses the i-th op, the second one also chooses the i-th op), they can give them the same key. The key marks the identity for this choice, and will be used in dumped checkpoint. So if you want to increase the readability of your exported architecture, manually assigning keys to each mutable would be a good idea. For advanced usage on mutables, see [Mutables](./NasReference.md).
 
 ## Use a Search Algorithm
 
@@ -163,7 +163,7 @@ The JSON is simply a mapping from mutable keys to one-hot or multi-hot represent
 }
 ```
 
-After applying, the model is then fixed and ready for a final training. The model works as a single model, although it might contain more parameters than expected. This comes with pros and cons. The good side is, you can directly load the checkpoint dumped from supernet during search phase and start retrain from there. However, this is also a model with redundant parameters, which may cause problems when trying to count the number of parameters in model. For deeper reasons and possible workaround, see [Trainers](./NasReference.md#retrain).
+After applying, the model is then fixed and ready for a final training. The model works as a single model, although it might contain more parameters than expected. This comes with pros and cons. The good side is, you can directly load the checkpoint dumped from supernet during search phase and start retrain from there. However, this is also a model with redundant parameters, which may cause problems when trying to count the number of parameters in model. For deeper reasons and possible workaround, see [Trainers](./NasReference.md).
 
 Also refer to [DARTS](./DARTS.md) for example code of retraining.
 

diff --git a/docs/en_US/NAS/Overview.md b/docs/en_US/NAS/Overview.md
@@ -34,10 +34,10 @@ Here are some common dependencies to run the examples. PyTorch needs to be above
 
 |Name|Brief Introduction of Algorithm|
 |---|---|
-| [SPOS](SPOS.md) | [Single Path One-Shot Neural Architecture Search with Uniform Sampling](https://arxiv.org/abs/1904.00420) constructs a simplified supernet trained with an uniform path sampling method, and applies an evolutionary algorithm to efficiently search for the best-performing architectures. |
+| [SPOS's 2nd stage](SPOS.md) | [Single Path One-Shot Neural Architecture Search with Uniform Sampling](https://arxiv.org/abs/1904.00420) constructs a simplified supernet trained with an uniform path sampling method, and applies an evolutionary algorithm to efficiently search for the best-performing architectures. _Note: SPOS is a two-stage algorithm, whose first stage is one-shot and second stage is distributed, leveraging result of first stage as a checkpoint._|
 
-```eval_rst
-.. Note:: SPOS is a two-stage algorithm, whose first stage is one-shot and second stage is distributed, leveraging result of first stage as a checkpoint.
+```eval_rst	
+.. Note:: SPOS is a two-stage algorithm, whose first stage is one-shot and second stage is distributed, leveraging result of first stage as a checkpoint.	
 ```
 
 ## Use NNI API
@@ -58,4 +58,4 @@ The programming interface of designing and searching a model is often demanded i
 [5]: https://arxiv.org/abs/1703.01041
 
 * To [report a bug](https://github.com/microsoft/nni/issues/new?template=bug-report.md) for this feature in GitHub;
-* To [file a feature or improvement request](https://github.com/microsoft/nni/issues/new?template=enhancement.md) for this feature in GitHub.
+* To [file a feature or improvement request](https://github.com/microsoft/nni/issues/new?template=enhancement.md) for this feature in GitHub.
diff --git a/docs/en_US/TrainingService/DLTSMode.md b/docs/en_US/TrainingService/DLTSMode.md
@@ -0,0 +1,49 @@
+**Run an Experiment on DLTS**
+===
+NNI supports running an experiment on [DLTS](https://github.com/microsoft/DLWorkspace.git), called dlts mode. Before starting to use NNI dlts mode, you should have an account to access the DLTS dashboard.
+
+## Setup Environment
+
+Step 1. Choose a cluster from the DLTS dashboard and ask the administrator for the cluster dashboard URL.
+
+![Choose Cluster](../../img/dlts-step1.png)
+
+Step 2. Prepare an NNI config YAML like the following:
+
+```yaml
+# Set this field to "dlts"
+trainingServicePlatform: dlts
+authorName: your_name
+experimentName: auto_mnist
+trialConcurrency: 2
+maxExecDuration: 3h
+maxTrialNum: 100
+searchSpacePath: search_space.json
+useAnnotation: false
+tuner:
+  builtinTunerName: TPE
+  classArgs:
+    optimize_mode: maximize
+trial:
+  command: python3 mnist.py
+  codeDir: .
+  gpuNum: 1
+  image: msranni/nni
+# Configuration to access DLTS
+dltsConfig:
+  dashboard: # Ask administrator for the cluster dashboard URL
+```
+
+Remember to fill in the cluster dashboard URL on the last line.
+
+Step 3. Open your working directory on the cluster and paste the NNI config and the related code into a directory.
+
+![Copy Config](../../img/dlts-step3.png)
+
+Step 4. Submit an NNI manager job to the specified cluster.
+
+![Submit Job](../../img/dlts-step4.png)
+
+Step 5. Go to the Endpoints tab of the newly created job and click the Port 40000 link to check the trials' information.
+
+![View NNI WebUI](../../img/dlts-step5.png)
diff --git a/docs/en_US/hpo_advanced.rst b/docs/en_US/hpo_advanced.rst
@@ -2,6 +2,8 @@ Advanced Features
 =================
 
 ..  toctree::
+    :maxdepth: 2
+
     Enable Multi-phase <AdvancedFeature/MultiPhase>
     Write a New Tuner <Tuner/CustomizeTuner>
     Write a New Assessor <Assessor/CustomizeAssessor>

diff --git a/docs/en_US/model_compression.rst b/docs/en_US/model_compression.rst
@@ -21,3 +21,4 @@ For details, please refer to the following tutorials:
     Quantizers <quantizers>
     Model Speedup <Compressor/ModelSpeedup>
     Automatic Model Compression <Compressor/AutoCompression>
+    Implementation <Compressor/Framework>
diff --git a/docs/en_US/training_services.rst b/docs/en_US/training_services.rst
@@ -9,3 +9,4 @@ Introduction to NNI Training Services
     OpenPAI Yarn Mode<./TrainingService/PaiYarnMode>
     Kubeflow<./TrainingService/KubeflowMode>
     FrameworkController<./TrainingService/FrameworkControllerMode>
+    DLTS<./TrainingService/DLTSMode>
diff --git a/docs/img/dlts-step1.png b/docs/img/dlts-step1.png
diff --git a/docs/img/dlts-step3.png b/docs/img/dlts-step3.png
diff --git a/docs/img/dlts-step4.png b/docs/img/dlts-step4.png
diff --git a/docs/img/dlts-step5.png b/docs/img/dlts-step5.png