Merge pull request #83 from microsoft/master

pull code
chicm-ms · Apr 20, 2020 · b384ad2
2 parents f9136c4 + bcb53a7
commit b384ad2

Showing 38 changed files with 467 additions and 108 deletions.
diff --git a/README.md b/README.md
@@ -25,7 +25,7 @@ The tool manages automated machine learning (AutoML) experiments, **dispatches a
 * Researchers and data scientists who want to easily **implement and experiment new AutoML algorithms**, may it be: hyperparameter tuning algorithm, neural architect search algorithm or model compression algorithm.
 * ML Platform owners who want to **support AutoML in their platform**.
 
-### **NNI v1.4 has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**
+### **NNI v1.5 has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**
 
 ## **NNI capabilities in a glance**
 
@@ -238,7 +238,7 @@ The following example is built on TensorFlow 1.x. Make sure **TensorFlow 1.x is
 * Download the examples via clone the source code.
 
   ```bash
-  git clone -b v1.4 https://github.com/Microsoft/nni.git
+  git clone -b v1.5 https://github.com/Microsoft/nni.git
   ```
 
 * Run the MNIST example.

diff --git a/README_zh_CN.md b/README_zh_CN.md
@@ -19,7 +19,7 @@ NNI 管理自动机器学习 (AutoML) 的 Experiment，**调度运行**由调优
 * 想要更容易**实现或试验新的自动机器学习算法**的研究员或数据科学家，包括：超参调优算法，神经网络搜索算法以及模型压缩算法。
 * 在机器学习平台中**支持自动机器学习**。
 
-### **NNI v1.4 已发布！ &nbsp;[<img width="48" src="docs/img/release_icon.png" />](#nni-released-reminder)**
+### **NNI v1.5 已发布！ &nbsp;[<img width="48" src="docs/img/release_icon.png" />](#nni-released-reminder)**
 
 ## **NNI 功能一览**
 
@@ -102,6 +102,7 @@ NNI 提供命令行工具以及友好的 WebUI 来管理训练的 Experiment。
             <li><a href="docs/zh_CN/Tuner/BuiltinTuner.md#Evolution">Naïve Evolution（朴素进化）</a></li>
             <li><a href="docs/zh_CN/Tuner/BuiltinTuner.md#Anneal">Anneal（退火算法）</a></li>
             <li><a href="docs/zh_CN/Tuner/BuiltinTuner.md#Hyperband">Hyperband</a></li>
+            <li><a href="docs/zh_CN/Tuner/BuiltinTuner.md#PBTTuner">PBT</a></li>
           </ul>
           <b>贝叶斯优化</b>
             <ul>
@@ -125,7 +126,8 @@ NNI 提供命令行工具以及友好的 WebUI 来管理训练的 Experiment。
               <li><a href="docs/zh_CN/NAS/CDARTS.md">CDARTS</a></li>
               <li><a href="docs/zh_CN/NAS/SPOS.md">SPOS</a></li>
               <li><a href="docs/zh_CN/NAS/Proxylessnas.md">ProxylessNAS</a></li>
-              <li><a href="docs/zh_CN/Tuner/BuiltinTuner.md#NetworkMorphism">Network Morphism</a> </li>
+              <li><a href="docs/zh_CN/Tuner/BuiltinTuner.md#NetworkMorphism">Network Morphism</a></li>
+              <li><a href="docs/zh_CN/NAS/TextNAS.md">TextNAS</a></li>
             </ul>
           </ul>
           <a href="docs/zh_CN/Compressor/Overview.md">模型压缩</a>
@@ -230,7 +232,7 @@ Linux 和 macOS 下 NNI 系统需求[参考这里](https://nni.readthedocs.io/zh
 * 通过克隆源代码下载示例。
 
    ```bash
-   git clone -b v1.4 https://github.com/Microsoft/nni.git
+   git clone -b v1.5 https://github.com/Microsoft/nni.git
    ```
 
 * 运行 MNIST 示例。

diff --git a/deployment/docker/Dockerfile b/deployment/docker/Dockerfile
@@ -51,7 +51,7 @@ RUN python3 -m pip --no-cache-dir install Keras==2.1.6
 #
 # PyTorch
 #
-RUN python3 -m pip --no-cache-dir install torch==1.2.0
+RUN python3 -m pip --no-cache-dir install torch==1.4.0
 RUN python3 -m pip install torchvision==0.5.0
 
 #

diff --git a/docs/en_US/FeatureEngineering/Overview.md b/docs/en_US/FeatureEngineering/Overview.md
@@ -148,7 +148,7 @@ from sklearn.feature_selection.base import SelectorMixin
 
 from nni.feature_engineering.feature_selector import FeatureSelector
 
-class CustomizedSelector(FeatureSelector, BaseEstimator):
+class CustomizedSelector(FeatureSelector, BaseEstimator, SelectorMixin):
     def __init__(self, ...):
         ...
 
@@ -161,7 +161,7 @@ class CustomizedSelector(FeatureSelector, BaseEstimator):
         if not key.endswith('_')}
         return params
 
-        def set_params(self, **params):
+    def set_params(self, **params):
         """
         Set the parameters of this estimator.
         """

diff --git a/docs/en_US/Release.md b/docs/en_US/Release.md
@@ -1,5 +1,47 @@
 # ChangeLog
 
+## Release 1.5 - 4/13/2020
+
+### New Features and Documentation
+
+#### Hyper-Parameter Optimizing
+
+* New tuner: [Population Based Training (PBT)](https://github.com/microsoft/nni/blob/master/docs/en_US/Tuner/PBTTuner.md)
+* Trials can now report infinity and NaN as result
+
+#### Neural Architecture Search
+
+* New NAS algorithm: [TextNAS](https://github.com/microsoft/nni/blob/master/docs/en_US/NAS/TextNAS.md)
+* ENAS and DARTS now support [visualization](https://github.com/microsoft/nni/blob/master/docs/en_US/NAS/Visualization.md) through web UI.
+
+#### Model Compression
+
+* New Pruner: [GradientRankFilterPruner](https://github.com/microsoft/nni/blob/master/docs/en_US/Compressor/Pruner.md#gradientrankfilterpruner)
+* Compressors will validate configuration by default
+* Refactor: Adding optimizer as an input argument of pruner, for easy support of DataParallel and more efficient iterative pruning. This is a broken change for the usage of iterative pruning algorithms.
+* Model compression examples are refactored and improved
+* Added documentation for [implementing compressing algorithm](https://github.com/microsoft/nni/blob/master/docs/en_US/Compressor/Framework.md)
+
+#### Training Service
+
+* Kubeflow now supports pytorchjob crd v1 (thanks external contributor @jiapinai)
+* Experimental [DLTS](https://github.com/microsoft/nni/blob/master/docs/en_US/TrainingService/DLTSMode.md) support
+
+#### Overall Documentation Improvement
+
+* Documentation is significantly improved on grammar, spelling, and wording (thanks external contributor @AHartNtkn)
+
+### Fixed Bugs
+
+* ENAS cannot have more than one LSTM layers (thanks external contributor @marsggbo)
+* NNI manager's timers will never unsubscribe (thanks external contributor @guilhermehn)
+* NNI manager may exhaust head memory (thanks external contributor @Sundrops)
+* Batch tuner does not support customized trials (#2075)
+* Experiment cannot be killed if it failed on start (#2080)
+* Non-number type metrics break web UI (#2278)
+* A bug in lottery ticket pruner
+* Other minor glitches
+
 ## Release 1.4 - 2/19/2020
 
 ### Major Features

diff --git a/docs/en_US/TrialExample/GbdtExample.md b/docs/en_US/TrialExample/GbdtExample.md
@@ -147,8 +147,6 @@ if __name__ == '__main__':
 +   RECEIVED_PARAMS = nni.get_next_parameter()
     PARAMS = get_default_parameters()
 +   PARAMS.update(RECEIVED_PARAMS)
-    PARAMS = get_default_parameters()
-    PARAMS.update(RECEIVED_PARAMS)
 
     # train
     run(lgb_train, lgb_eval, PARAMS, X_test, y_test)
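Note on the hunk above: it drops a duplicated pair of lines, leaving the pattern "start from defaults, then overlay the values received from the tuner". A minimal sketch of that pattern follows. The default values and the literal `received_params` dict are assumptions made for illustration; in the real example the dict comes from `nni.get_next_parameter()`.

```python
def get_default_parameters():
    # Hypothetical defaults; the actual example defines its own LightGBM settings.
    return {"num_leaves": 31, "learning_rate": 0.05, "bagging_fraction": 0.8}


# Stand-in for nni.get_next_parameter(), which returns a dict chosen by the tuner.
received_params = {"num_leaves": 48, "learning_rate": 0.01}

params = get_default_parameters()
params.update(received_params)  # tuner-supplied values override the defaults

print(params)
```

Keys the tuner does not supply (here `bagging_fraction`) keep their default value, so the search space only has to list the hyperparameters actually being tuned.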
@@ -193,4 +191,4 @@ Run this experiment with command as follow:
 
 ```bash
 nnictl create --config ./config.yml
-```
+```
diff --git a/docs/en_US/Tutorial/InstallationLinux.md b/docs/en_US/Tutorial/InstallationLinux.md
@@ -19,7 +19,7 @@ Installation on Linux and macOS follow the same instructions, given below.
   Prerequisites: `python 64-bit >=3.5`, `git`, `wget`
 
   ```bash
-  git clone -b v1.4 https://github.com/Microsoft/nni.git
+  git clone -b v1.5 https://github.com/Microsoft/nni.git
   cd nni
   ./install.sh
   ```
@@ -35,7 +35,7 @@ The following example is built on TensorFlow 1.x. Make sure **TensorFlow 1.x is
 * Download the examples via cloning the source code.
 
   ```bash
-  git clone -b v1.4 https://github.com/Microsoft/nni.git
+  git clone -b v1.5 https://github.com/Microsoft/nni.git
   ```
 
 * Run the MNIST example.

diff --git a/docs/en_US/Tutorial/InstallationWin.md b/docs/en_US/Tutorial/InstallationWin.md
@@ -19,7 +19,7 @@ Anaconda or Miniconda is highly recommended to manage multiple Python environmen
   Prerequisites: `python 64-bit >=3.5`, `git`, `PowerShell`.
 
   ```bash
-  git clone -b v1.4 https://github.com/Microsoft/nni.git
+  git clone -b v1.5 https://github.com/Microsoft/nni.git
   cd nni
   powershell -ExecutionPolicy Bypass -file install.ps1
   ```
@@ -31,7 +31,7 @@ The following example is built on TensorFlow 1.x. Make sure **TensorFlow 1.x is
 * Download the examples via clone the source code.
 
   ```bash
-  git clone -b v1.4 https://github.com/Microsoft/nni.git
+  git clone -b v1.5 https://github.com/Microsoft/nni.git
   ```
 
 * Run the MNIST example.
@@ -136,4 +136,4 @@ Note:
 * [How to run an experiment on multiple machines?](../TrainingService/RemoteMachineMode.md)
 * [How to run an experiment on OpenPAI?](../TrainingService/PaiMode.md)
 * [How to run an experiment on Kubernetes through Kubeflow?](../TrainingService/KubeflowMode.md)
-* [How to run an experiment on Kubernetes through FrameworkController?](../TrainingService/FrameworkControllerMode.md)
+* [How to run an experiment on Kubernetes through FrameworkController?](../TrainingService/FrameworkControllerMode.md)
diff --git a/docs/en_US/conf.py b/docs/en_US/conf.py
@@ -28,7 +28,7 @@
 # The short X.Y version
 version = ''
 # The full version, including alpha/beta/rc tags
-release = 'v1.4'
+release = 'v1.5'
 
 # -- General configuration ---------------------------------------------------
 

diff --git a/docs/en_US/training_services.rst b/docs/en_US/training_services.rst
@@ -9,4 +9,4 @@ Introduction to NNI Training Services
     OpenPAI Yarn Mode<./TrainingService/PaiYarnMode>
     Kubeflow<./TrainingService/KubeflowMode>
     FrameworkController<./TrainingService/FrameworkControllerMode>
-    OpenPAI<./TrainingService/DLTSMode>
+    DLTS<./TrainingService/DLTSMode>
diff --git a/docs/img/nni_webui_joblist.jpg b/docs/img/nni_webui_joblist.jpg
diff --git a/docs/zh_CN/FeatureEngineering/Overview.md b/docs/zh_CN/FeatureEngineering/Overview.md
@@ -144,7 +144,7 @@ from sklearn.feature_selection.base import SelectorMixin
 
 from nni.feature_engineering.feature_selector import FeatureSelector
 
-class CustomizedSelector(FeatureSelector, BaseEstimator):
+class CustomizedSelector(FeatureSelector, BaseEstimator, SelectorMixin):
     def __init__(self, ...):
         ...
 
@@ -157,9 +157,9 @@ class CustomizedSelector(FeatureSelector, BaseEstimator):
         if not key.endswith('_')}
         return params
 
-        def set_params(self, **params):
+    def set_params(self, **params):
         """
-        设置参数
+        为此 estimator 设置参数
         """
         for param in params:
         if hasattr(self, param):
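Note on the `set_params` indentation fix in both Feature Engineering overviews: a `def` nested inside `get_params` after its `return` is never even executed, so the class would have no `set_params` method of its own. The sketch below uses a hypothetical stand-alone class, `SketchSelector`, with no NNI or scikit-learn dependency, to show the intended class-level layout. The attribute names are assumptions.

```python
class SketchSelector:
    """Hypothetical stand-in for CustomizedSelector; not NNI's actual class."""

    def __init__(self, n_features=10, threshold=0.5):
        self.n_features = n_features
        self.threshold = threshold
        self.selected_ = None  # fitted state, excluded from the parameters

    def get_params(self):
        # Keep constructor-style attributes; skip fitted ones ending in '_'.
        return {key: value for key, value in self.__dict__.items()
                if not key.endswith('_')}

    def set_params(self, **params):
        # Defined at class level, so it is reachable as a method.
        for name, value in params.items():
            if hasattr(self, name):
                setattr(self, name, value)
        return self


selector = SketchSelector()
selector.set_params(threshold=0.9, unknown=1)  # 'unknown' is silently ignored
print(selector.get_params())
```

Returning `self` from `set_params` mirrors the scikit-learn estimator convention and allows call chaining; the documented example may or may not do the same.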

diff --git a/docs/zh_CN/NAS/NasGuide.md b/docs/zh_CN/NAS/NasGuide.md
@@ -118,7 +118,9 @@ trainer.export(file="model_dir/final_architecture.json")  # 将最终架构导
 
 用户可直接通过 `python3 train.py` 开始训练，不需要使用 `nnictl`。 训练完成后，可通过 `trainer.export()` 导出找到的最好的模型。
 
-通常，Trainer 会提供一些可以自定义的参数。 如，损失函数，指标函数，优化器以及数据集。 这些功能可满足大部分需求，NNI 会尽力让内置 Trainer 能够处理更多的模型、任务和数据集。 但无法保证全面的支持。 例如，一些 Trainer 假设必须是分类任务；一些 Trainer 对 "Epoch" 的定义有所不同（例如，ENAS 的 epoch 表示一部分子步骤加上一些 Controller 的步骤）；大多数 Trainer 不支持分布式训练，不会将模型通过 `DataParallel` 或 `DistributedDataParallel` 进行包装。 如果通过试用，想要在定制的应用中使用 Trainer，可能需要[自定义 Trainer](#extend-the-ability-of-one-shot-trainers)。
+通常，Trainer 会提供一些可以自定义的参数。 如，损失函数，指标函数，优化器以及数据集。 这些功能可满足大部分需求，NNI 会尽力让内置 Trainer 能够处理更多的模型、任务和数据集。 但无法保证全面的支持。 例如，一些 Trainer 假设必须是分类任务；一些 Trainer 对 "Epoch" 的定义有所不同（例如，ENAS 的 epoch 表示一部分子步骤加上一些 Controller 的步骤）；大多数 Trainer 不支持分布式训练，不会将模型通过 `DataParallel` 或 `DistributedDataParallel` 进行包装。 如果通过试用，想要在定制的应用中使用 Trainer，可能需要[自定义 Trainer](./Advanced.md#extend-the-ability-of-one-shot-trainers)。
+
+此外，可以使用 NAS 可视化来显示 One-Shot NAS。 [了解详情](./Visualization.md)。
 
 ### 分布式 NAS
 
@@ -146,14 +148,14 @@ nni.report_final_result(acc)  # 报告所选架构的性能
 
 ### 使用导出的架构重新训练
 
-搜索阶段后，就该训练找到的架构了。 与很多开源 NAS 算法不同，它们为重新训练专门写了新的模型。 我们发现搜索模型和重新训练模型的过程非常相似，因而可直接将一样的模型代码用到最终模型上。 例如：
+搜索阶段后，就该训练找到的架构了。 与很多开源 NAS 算法不同，它们为重新训练专门写了新的模型。 我们发现搜索模型和重新训练模型的过程非常相似，因而可直接将一样的模型代码用到最终模型上。 例如
 
 ```python
 model = Net()
 apply_fixed_architecture(model, "model_dir/final_architecture.json")
 ```
 
-JSON 文件是从 Mutable key 到 Choice 的表示。 例如：
+JSON 文件是从 Mutable key 到 Choice 的表示。 例如
 
 ```json
 {

diff --git a/docs/zh_CN/NAS/Overview.md b/docs/zh_CN/NAS/Overview.md
@@ -19,17 +19,19 @@ NNI 目前支持下面列出的 NAS 算法，并且正在添加更多算法。
 | [P-DARTS](PDARTS.md)            | [Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation](https://arxiv.org/abs/1904.12760) 基于DARTS。 它引入了一种有效的算法，可在搜索过程中逐渐增加搜索的深度。 |
 | [SPOS](SPOS.md)                 | 论文 [Single Path One-Shot Neural Architecture Search with Uniform Sampling](https://arxiv.org/abs/1904.00420) 构造了一个采用统一的路径采样方法来训练简化的超网络，并使用进化算法来提高搜索神经网络结构的效率。                   |
 | [CDARTS](CDARTS.md)             | [Cyclic Differentiable Architecture Search](https://arxiv.org/abs/****) 在搜索和评估的网络见构建了循环反馈的机制。 通过引入的循环的可微分架构搜索框架将两个网络集成为一个架构。                                                    |
-| [ProxylessNAS](Proxylessnas.md) | [ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware](https://arxiv.org/abs/1812.00332).                                                                |
+| [ProxylessNAS](Proxylessnas.md) | [ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware](https://arxiv.org/abs/1812.00332). 它删除了代理，直接从大规模目标任务和目标硬件平台进行学习。                                  |
+| [TextNAS](TextNAS.md)           | [TextNAS: A Neural Architecture Search Space tailored for Text Representation](https://arxiv.org/pdf/1912.10729.pdf)。 这是专门用于文本表示的神经网络架构搜索算法。                                    |
 
 One-shot 算法**不需要 nnictl，可单独运行**。 只实现了 PyTorch 版本。 将来的版本会支持 Tensorflow 2.x。
 
 这是运行示例的一些常见依赖项。 PyTorch 需要高于 1.2 才能使用 `BoolTensor`.
 
-* NNI 1.2+
 * tensorboard
 * PyTorch 1.2+
 * git
 
+一次性 NAS 可以通过可视化工具来查看。 点击[这里](./Visualization.md)，了解详情。
+
 ## 支持的分布式 NAS 算法
 
 | 名称                    | 算法简介                                                                                                                                                          |
@@ -49,6 +51,10 @@ One-shot 算法**不需要 nnictl，可单独运行**。 只实现了 PyTorch
 
 [这里](./NasGuide.md)是在 NNI 上开始使用 NAS 的用户指南。
 
+## NAS 可视化
+
+为了帮助用户跟踪指定搜索空间下搜索模型的过程和状态，开发了此可视化工具。 它将搜索空间可视化为超网络，并显示子网络、层和操作的重要性，同时还能显示重要性是如何在搜索过程中变化的。 参考 [NAS 可视化](./Visualization.md)文档了解详情。
+
 ## 参考和反馈
 
 * 在 GitHub 中[提交此功能的 Bug](https://github.com/microsoft/nni/issues/new?template=bug-report.md)；

diff --git a/docs/zh_CN/NAS/TextNAS.md b/docs/zh_CN/NAS/TextNAS.md
@@ -0,0 +1,80 @@
+# TextNAS
+
+## 介绍
+
+这是论文 [TextNAS: A Neural Architecture Search Space tailored for Text Representation](https://arxiv.org/pdf/1912.10729.pdf) 中 TextNAS 算法的实现。 TextNAS 是用于文本表示的神经网络架构搜索算法，具体来说，TextNAS 基于由适配各种自然语言任务的操作符所组成的新的搜索空间，TextNAS 还支持单个网络中的多路径集成，来平衡网络的宽度和深度。
+
+TextNAS 的搜索空间包含：
+
+    * 过滤器尺寸为 1, 3, 5, 7 的一维卷积操作
+    * 循环操作符（双向 GRU）
+    * 自注意操作符
+    * 池化操作符（最大值、平均值）
+
+遵循 ENAS 算法，TextNAS 也用了参数共享来加速搜索速度，并采用了强化学习的 Controller 来进行架构采样和生成。 参考 TextNAS 论文了解更多细节。
+
+## 准备
+
+准备词向量和 SST 数据集，并按如下结构放到 data 目录中：
+
+```
+textnas
+├── data
+│   ├── sst
+│   │   └── trees
+│   │       ├── dev.txt
+│   │       ├── test.txt
+│   │       └── train.txt
+│   └── glove.840B.300d.txt
+├── dataloader.py
+├── model.py
+├── ops.py
+├── README.md
+├── search.py
+└── utils.py
+```
+
+以下链接有助于查找和下载相应的数据集：
+
+* [GloVe: Global Vectors for Word Representation](https://nlp.stanford.edu/projects/glove/)
+  * [glove.840B.300d.txt](http://nlp.stanford.edu/data/glove.840B.300d.zip)
+* [Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank](https://nlp.stanford.edu/sentiment/)
+  * [trainDevTestTrees_PTB.zip](https://nlp.stanford.edu/sentiment/trainDevTestTrees_PTB.zip)
+
+## 示例
+
+### 搜索空间
+
+[示例代码](https://github.com/microsoft/nni/tree/master/examples/nas/textnas)
+
+```bash
+＃如果未克隆 NNI 代码。 如果代码已被克隆，请忽略此行并直接进入代码目录。
+git clone https://github.com/Microsoft/nni.git
+
+# 搜索最佳网络结构
+cd examples/nas/textnas
+
+# 查看搜索的更多选项
+python3 search.py -h
+```
+
+在每个搜索 Epoch 后，会直接测试 10 个采样的结构。 10 个 Epoch 后的性能预计为 40% - 42%。
+
+默认情况下，20 个采样结构会被导出到 `checkpoints` 目录中，以便进行下一步处理。
+
+### 重新训练
+
+```bash
+＃如果未克隆 NNI 代码。 如果代码已被克隆，请忽略此行并直接进入代码目录。
+git clone https://github.com/Microsoft/nni.git
+
+# 搜索最佳网络结构
+cd examples/nas/textnas
+
+# 默认在 sst-2 上训练
+sh run_retrain.sh
+```
+
+## 参考
+
+TextNAS 直接使用了 EnasTrainer，参考 [ENAS](./ENAS.md) 了解 Trainer 的 API。