
Merge pull request #2081 from microsoft/v1.4
merge V1.4 back to master
QuanluZhang authored Feb 19, 2020
2 parents aaaa275 + 8ff039c commit 24fa461
Showing 40 changed files with 328 additions and 187 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -25,7 +25,7 @@ The tool manages automated machine learning (AutoML) experiments, **dispatches a
* Researchers and data scientists who want to easily **implement and experiment with new AutoML algorithms**, be it a hyperparameter tuning algorithm, a neural architecture search algorithm, or a model compression algorithm.
* ML Platform owners who want to **support AutoML in their platform**.

### **NNI v1.3 has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**
### **NNI v1.4 has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**

## **NNI capabilities in a glance**
NNI provides a command line tool as well as a user-friendly WebUI to manage training experiments. With the extensible API, you can customize your own AutoML algorithms and training services. To make it easy for new users, NNI also provides a set of built-in state-of-the-art AutoML algorithms and out-of-the-box support for popular training platforms.
@@ -233,7 +233,7 @@ The following example is built on TensorFlow 1.x. Make sure **TensorFlow 1.x is
* Download the examples by cloning the source code.

```bash
git clone -b v1.3 https://github.com/Microsoft/nni.git
git clone -b v1.4 https://github.com/Microsoft/nni.git
```

* Run the MNIST example.
2 changes: 1 addition & 1 deletion docs/en_US/NAS/Proxylessnas.md
@@ -60,4 +60,4 @@ ProxylessNasMutator also implements the forward logic of the mutables (i.e., Lay

## Reproduce Results

Ongoing...
To reproduce the result, we first ran the search. We found that although the search runs for many epochs, the chosen architecture converges within the first several epochs. This is probably caused by the hyper-parameters or the implementation; we are investigating it. The test accuracy of the found architecture is top-1: 72.31, top-5: 90.26.
40 changes: 40 additions & 0 deletions docs/en_US/Release.md
@@ -1,5 +1,45 @@
# ChangeLog

## Release 1.4 - 2/19/2020

### Major Features

#### Neural Architecture Search
* Support [C-DARTS](https://github.com/microsoft/nni/blob/v1.4/docs/en_US/NAS/CDARTS.md) algorithm and add [the example](https://github.com/microsoft/nni/tree/v1.4/examples/nas/cdarts) using it
* Support a preliminary version of [ProxylessNAS](https://github.com/microsoft/nni/blob/v1.4/docs/en_US/NAS/Proxylessnas.md) and the corresponding [example](https://github.com/microsoft/nni/tree/v1.4/examples/nas/proxylessnas)
* Add unit tests for the NAS framework

#### Model Compression
* Support `DataParallel` for compressing models, and provide [an example](https://github.com/microsoft/nni/blob/v1.4/examples/model_compress/multi_gpu.py) of using DataParallel (see the sketch after this list)
* Support [model speedup](https://github.com/microsoft/nni/blob/v1.4/docs/en_US/Compressor/ModelSpeedup.md) for compressed models (alpha version)
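
A minimal sketch of the multi-GPU compression flow these items describe, assuming the `AGP_Pruner` API that appears elsewhere in this commit (`AGP_Pruner(model, config_list)` / `pruner.compress()`). The import path and config keys follow the v1.x docs and should be treated as assumptions, as should wrapping in `DataParallel` after `compress()`; the linked `multi_gpu.py` example is authoritative.

```python
import torch
import torch.nn as nn
# Import path assumed for NNI v1.4; the AGP_Pruner name itself appears in this commit.
from nni.compression.torch import AGP_Pruner

# A small stand-in model; any torch.nn.Module would do.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3), nn.ReLU(), nn.Flatten(), nn.Linear(8 * 26 * 26, 10))

# AGP config keys follow the v1.x pruner docs; treat the exact keys as assumptions.
config_list = [{
    'initial_sparsity': 0.0, 'final_sparsity': 0.8,
    'start_epoch': 0, 'end_epoch': 10, 'frequency': 1,
    'op_types': ['default'],
}]

pruner = AGP_Pruner(model, config_list)
model = pruner.compress()        # wrap layers with pruning masks
model = nn.DataParallel(model)   # replicate the masked model across GPUs
model = model.to('cuda')         # then train / fine-tune as usual
```

Wrapping after `compress()` presumably ensures the masking hooks installed by the pruner are replicated to every GPU along with the model.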

#### Training Service
* Support complete PAI configurations by allowing users to specify a PAI config file path
* Add example config YAML files for the new PAI mode (i.e., paiK8S)
* Support deleting experiments using an SSH key in remote mode (thanks to external contributor @tyusr)

#### WebUI
* WebUI refactor: adopt fabric framework

#### Others
* Support running an [NNI experiment in the foreground](https://github.com/microsoft/nni/blob/v1.4/docs/en_US/Tutorial/Nnictl.md#manage-an-experiment), i.e., the `--foreground` argument in `nnictl create/resume/view` (see the sketch after this list)
* Support canceling trials in the UNKNOWN state
* Support large search spaces of up to 50 MB (thanks to external contributor @Sundrops)
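
A short usage sketch for the foreground mode mentioned above; only the `--foreground` flag itself comes from this changelog, and `config.yml` / `<experiment_id>` are placeholders.

```bash
# Keep nnictl attached instead of daemonizing, so the NNI manager log
# streams to the current terminal (stop the experiment with Ctrl-C).
nnictl create --config config.yml --foreground

# The changelog states the flag is also accepted by resume and view.
nnictl resume <experiment_id> --foreground
```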

### Documentation
* Improve [the index structure](https://nni.readthedocs.io/en/latest/) of NNI readthedocs
* Improve [documentation for NAS](https://github.com/microsoft/nni/blob/v1.4/docs/en_US/NAS/NasGuide.md)
* Improve documentation for [the new PAI mode](https://github.com/microsoft/nni/blob/v1.4/docs/en_US/TrainingService/PaiMode.md)
* Add QuickStart guidance for [NAS](https://github.com/microsoft/nni/blob/v1.4/docs/en_US/NAS/QuickStart.md) and [model compression](https://github.com/microsoft/nni/blob/v1.4/docs/en_US/Compressor/QuickStart.md)
* Improve documentation for [the supported EfficientNet](https://github.com/microsoft/nni/blob/v1.4/docs/en_US/TrialExample/EfficientNet.md)

### Bug Fixes
* Correctly support NaN in metric data in a JSON-compliant way
* Fix the out-of-range bug of the `randint` type in the search space
* Fix the wrong tensor device when exporting an ONNX model in model compression
* Fix incorrect handling of nnimanagerIP in the new PAI mode (i.e., paiK8S)


## Release 1.3 - 12/30/2019

### Major Features
4 changes: 2 additions & 2 deletions docs/en_US/Tutorial/InstallationLinux.md
@@ -19,7 +19,7 @@ Installation on Linux and macOS follows the same instructions below.
Prerequisites: `python 64-bit >=3.5`, `git`, `wget`

```bash
git clone -b v1.3 https://github.com/Microsoft/nni.git
git clone -b v1.4 https://github.com/Microsoft/nni.git
cd nni
./install.sh
```
@@ -35,7 +35,7 @@ The following example is built on TensorFlow 1.x. Make sure **TensorFlow 1.x is
* Download the examples by cloning the source code.

```bash
git clone -b v1.3 https://github.com/Microsoft/nni.git
git clone -b v1.4 https://github.com/Microsoft/nni.git
```

* Run the MNIST example.
4 changes: 2 additions & 2 deletions docs/en_US/Tutorial/InstallationWin.md
@@ -19,7 +19,7 @@ Anaconda or Miniconda is highly recommended to manage multiple Python environments.
Prerequisites: `python 64-bit >=3.5`, `git`, `PowerShell`.

```bash
git clone -b v1.3 https://github.com/Microsoft/nni.git
git clone -b v1.4 https://github.com/Microsoft/nni.git
cd nni
powershell -ExecutionPolicy Bypass -file install.ps1
```
@@ -31,7 +31,7 @@ The following example is built on TensorFlow 1.x. Make sure **TensorFlow 1.x is
* Download the examples by cloning the source code.

```bash
git clone -b v1.3 https://github.com/Microsoft/nni.git
git clone -b v1.4 https://github.com/Microsoft/nni.git
```

* Run the MNIST example.
2 changes: 1 addition & 1 deletion docs/en_US/conf.py
@@ -28,7 +28,7 @@
# The short X.Y version
version = ''
# The full version, including alpha/beta/rc tags
release = 'v1.3'
release = 'v1.4'

# -- General configuration ---------------------------------------------------

8 changes: 4 additions & 4 deletions examples/model_compress/main_torch_pruner.py
@@ -55,7 +55,7 @@ def test(model, device, test_loader):

def main():
torch.manual_seed(0)
device = torch.device('cpu')
device = torch.device('cuda')

trans = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
train_loader = torch.utils.data.DataLoader(
@@ -66,7 +66,7 @@ def main():
batch_size=1000, shuffle=True)

model = Mnist()
model.to(device)
model = model.to(device)

'''you can change this to LevelPruner to implement it
pruner = LevelPruner(configure_list)
@@ -82,14 +82,14 @@

pruner = AGP_Pruner(model, configure_list)
model = pruner.compress()

model = model.to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
for epoch in range(10):
pruner.update_epoch(epoch)
print('# Epoch {} #'.format(epoch))
train(model, device, train_loader, optimizer)
test(model, device, test_loader)
pruner.export_model('model.pth', 'mask.pth', 'model.onnx', [1, 1, 28, 28])
pruner.export_model('model.pth', 'mask.pth', 'model.onnx', [1, 1, 28, 28], device)


if __name__ == '__main__':
7 changes: 4 additions & 3 deletions examples/nas/proxylessnas/main.py
@@ -34,6 +34,7 @@
# configurations for search
parser.add_argument("--checkpoint_path", default='./search_mobile_net.pt', type=str)
parser.add_argument("--arch_path", default='./arch_path.pt', type=str)
parser.add_argument("--no-warmup", dest='warmup', action='store_false')
# configurations for retrain
parser.add_argument("--exported_arch_path", default=None, type=str)

@@ -54,7 +55,7 @@

# move network to GPU if available
if torch.cuda.is_available():
device = torch.device('cuda:0')
device = torch.device('cuda')
else:
device = torch.device('cpu')

@@ -86,7 +87,7 @@
train_loader=data_provider.train,
valid_loader=data_provider.valid,
device=device,
warmup=True,
warmup=args.warmup,
ckpt_path=args.checkpoint_path,
arch_path=args.arch_path)

Expand All @@ -102,4 +103,4 @@
"exported_arch_path {} should be a file.".format(args.exported_arch_path)
apply_fixed_architecture(model, args.exported_arch_path, device=device)
trainer = Retrain(model, optimizer, device, data_provider, n_epochs=300)
trainer.run()
trainer.run()
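
For orientation, a hedged sketch of how this example might be invoked. The flags come from the argparse definitions visible in this diff; the two-step search-then-retrain flow and any arguments not shown here are assumptions, not the example's documented interface.

```bash
# Architecture search; --no-warmup (added in this PR) skips the warm-up stage.
python main.py --no-warmup \
    --checkpoint_path ./search_mobile_net.pt \
    --arch_path ./arch_path.pt

# Retrain, pointing at the architecture exported by the search step.
python main.py --exported_arch_path ./arch_path.pt
```
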
5 changes: 2 additions & 3 deletions src/nni_manager/core/nniDataStore.ts
@@ -4,7 +4,6 @@
'use strict';

import * as assert from 'assert';
import * as JSON5 from 'json5';
import { Deferred } from 'ts-deferred';

import * as component from '../common/component';
@@ -132,7 +131,7 @@ class NNIDataStore implements DataStore {
}

public async storeMetricData(trialJobId: string, data: string): Promise<void> {
const metrics: MetricData = JSON5.parse(data);
const metrics: MetricData = JSON.parse(data);
// REQUEST_PARAMETER is used to request new parameters for multiphase trial job,
// it is not metrics, so it is skipped here.
if (metrics.type === 'REQUEST_PARAMETER') {
Expand All @@ -141,7 +140,7 @@ class NNIDataStore implements DataStore {
}
assert(trialJobId === metrics.trial_job_id);
try {
await this.db.storeMetricData(trialJobId, JSON5.stringify({
await this.db.storeMetricData(trialJobId, JSON.stringify({
trialJobId: metrics.trial_job_id,
parameterId: metrics.parameter_id,
type: metrics.type,
7 changes: 3 additions & 4 deletions src/nni_manager/core/sqlDatabase.ts
@@ -5,7 +5,6 @@

import * as assert from 'assert';
import * as fs from 'fs';
import * as JSON5 from 'json5';
import * as path from 'path';
import * as sqlite3 from 'sqlite3';
import { Deferred } from 'ts-deferred';
@@ -203,10 +202,10 @@ class SqlDB implements Database {

public storeMetricData(trialJobId: string, data: string): Promise<void> {
const sql: string = 'insert into MetricData values (?,?,?,?,?,?)';
const json: MetricDataRecord = JSON5.parse(data);
const args: any[] = [Date.now(), json.trialJobId, json.parameterId, json.type, json.sequence, JSON5.stringify(json.data)];
const json: MetricDataRecord = JSON.parse(data);
const args: any[] = [Date.now(), json.trialJobId, json.parameterId, json.type, json.sequence, JSON.stringify(json.data)];

this.log.trace(`storeMetricData: SQL: ${sql}, args: ${JSON5.stringify(args)}`);
this.log.trace(`storeMetricData: SQL: ${sql}, args: ${JSON.stringify(args)}`);
const deferred: Deferred<void> = new Deferred<void>();
this.db.run(sql, args, (err: Error | null) => { this.resolve(deferred, err); });

2 changes: 0 additions & 2 deletions src/nni_manager/package.json
@@ -17,7 +17,6 @@
"express": "^4.16.3",
"express-joi-validator": "^2.0.0",
"js-base64": "^2.4.9",
"json5": "^2.1.1",
"kubernetes-client": "^6.5.0",
"rx": "^4.1.0",
"sqlite3": "^4.0.2",
@@ -36,7 +35,6 @@
"@types/express": "^4.16.0",
"@types/glob": "^7.1.1",
"@types/js-base64": "^2.3.1",
"@types/json5": "^0.0.30",
"@types/mocha": "^5.2.5",
"@types/node": "10.12.18",
"@types/request": "^2.47.1",
2 changes: 1 addition & 1 deletion src/nni_manager/rest_server/nniRestServer.ts
@@ -34,7 +34,7 @@ export class NNIRestServer extends RestServer {
*/
protected registerRestHandler(): void {
this.app.use(express.static('static'));
this.app.use(bodyParser.json());
this.app.use(bodyParser.json({limit: '50mb'}));
this.app.use(this.API_ROOT_URL, createRestHandler(this));
this.app.use(this.LOGS_ROOT_URL, express.static(getLogDir()));
this.app.get('*', (req: express.Request, res: express.Response) => {
10 changes: 0 additions & 10 deletions src/nni_manager/yarn.lock
@@ -157,10 +157,6 @@
version "7.0.3"
resolved "https://registry.yarnpkg.com/@types/json-schema/-/json-schema-7.0.3.tgz#bdfd69d61e464dcc81b25159c270d75a73c1a636"

"@types/json5@^0.0.30":
version "0.0.30"
resolved "https://registry.yarnpkg.com/@types/json5/-/json5-0.0.30.tgz#44cb52f32a809734ca562e685c6473b5754a7818"

"@types/mime@*":
version "2.0.0"
resolved "https://registry.yarnpkg.com/@types/mime/-/mime-2.0.0.tgz#5a7306e367c539b9f6543499de8dd519fac37a8b"
@@ -2380,12 +2376,6 @@ json-stringify-safe@~5.0.1:
version "5.0.1"
resolved "https://registry.yarnpkg.com/json-stringify-safe/-/json-stringify-safe-5.0.1.tgz#1296a2d58fd45f19a0f6ce01d65701e2c735b6eb"

json5@^2.1.1:
version "2.1.1"
resolved "https://registry.yarnpkg.com/json5/-/json5-2.1.1.tgz#81b6cb04e9ba496f1c7005d07b4368a2638f90b6"
dependencies:
minimist "^1.2.0"

jsonparse@^1.2.0:
version "1.3.1"
resolved "https://registry.yarnpkg.com/jsonparse/-/jsonparse-1.3.1.tgz#3f4dae4a91fac315f71062f8521cc239f1366280"
5 changes: 4 additions & 1 deletion src/sdk/pynni/nni/bohb_advisor/bohb_advisor.py
@@ -557,7 +557,8 @@ def handle_report_metric_data(self, data):
Data type not supported
"""
logger.debug('handle report metric data = %s', data)

if 'value' in data:
data['value'] = json_tricks.loads(data['value'])
if data['type'] == MetricType.REQUEST_PARAMETER:
assert multi_phase_enabled()
assert data['trial_job_id'] is not None
@@ -627,6 +628,8 @@ def handle_import_data(self, data):
AssertionError
data doesn't have required key 'parameter' and 'value'
"""
for entry in data:
entry['value'] = json_tricks.loads(entry['value'])
_completed_num = 0
for trial_info in data:
logger.info("Importing data, current processing progress %s / %s", _completed_num, len(data))
Empty file.
14 changes: 9 additions & 5 deletions src/sdk/pynni/nni/compression/speedup/torch/compress_modules.py
@@ -1,13 +1,17 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

import logging
import torch
from .infer_shape import CoarseMask, ModuleMasks
from .infer_shape import ModuleMasks

_logger = logging.getLogger(__name__)

replace_module = {
'BatchNorm2d': lambda module, mask: replace_batchnorm2d(module, mask),
'Conv2d': lambda module, mask: replace_conv2d(module, mask),
'MaxPool2d': lambda module, mask: no_replace(module, mask),
'AvgPool2d': lambda module, mask: no_replace(module, mask),
'ReLU': lambda module, mask: no_replace(module, mask),
'Linear': lambda module, mask: replace_linear(module, mask)
}
@@ -16,6 +20,7 @@ def no_replace(module, mask):
"""
No need to replace
"""
_logger.debug("no need to replace")
return module

def replace_linear(linear, mask):
@@ -37,9 +42,8 @@ def replace_linear(linear, mask):
assert mask.output_mask is None
assert not mask.param_masks
index = mask.input_mask.mask_index[-1]
print(mask.input_mask.mask_index)
in_features = index.size()[0]
print('linear: ', in_features)
_logger.debug("replace linear with new in_features: %d", in_features)
new_linear = torch.nn.Linear(in_features=in_features,
out_features=linear.out_features,
bias=linear.bias is not None)
@@ -67,7 +71,7 @@ def replace_batchnorm2d(norm, mask):
assert 'weight' in mask.param_masks and 'bias' in mask.param_masks
index = mask.param_masks['weight'].mask_index[0]
num_features = index.size()[0]
print("replace batchnorm2d: ", num_features, index)
_logger.debug("replace batchnorm2d with num_features: %d", num_features)
new_norm = torch.nn.BatchNorm2d(num_features=num_features,
eps=norm.eps,
momentum=norm.momentum,
@@ -106,6 +110,7 @@ def replace_conv2d(conv, mask):
else:
out_channels_index = mask.output_mask.mask_index[1]
out_channels = out_channels_index.size()[0]
_logger.debug("replace conv2d with in_channels: %d, out_channels: %d", in_channels, out_channels)
new_conv = torch.nn.Conv2d(in_channels=in_channels,
out_channels=out_channels,
kernel_size=conv.kernel_size,
@@ -128,6 +133,5 @@ def replace_conv2d(conv, mask):
assert tmp_weight_data is not None, "Conv2d weight should be updated based on masks"
new_conv.weight.data.copy_(tmp_weight_data)
if conv.bias is not None:
print('final conv.bias is not None')
new_conv.bias.data.copy_(conv.bias.data if tmp_bias_data is None else tmp_bias_data)
return new_conv
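
To illustrate how the `replace_module` table added at the top of this file can be driven, here is a minimal dispatch sketch; the function name and the class-name lookup are illustrative assumptions, not the actual `ModelSpeedup` code.

```python
def replace_submodule(module, mask):
    """Illustrative only: pick a replacement builder by module class name."""
    module_type = type(module).__name__   # e.g. 'Conv2d', 'BatchNorm2d', 'Linear'
    if module_type not in replace_module:
        raise RuntimeError('no replacement rule for module type {}'.format(module_type))
    # Each entry is a lambda taking (module, ModuleMasks) and returning a new module.
    return replace_module[module_type](module, mask)
```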