This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Merge 0.3 into master (#313)
* Quick fix nnictl config logic (#289)

* fix nnictl bug

* fix install.sh

* add desc for Dockerfile.build.base

* update document for Dockerfile

* update

* refactor port detect

* update

* refactor NNICTLDOC.md

* add document for pai and nnictl

* add default value for port

* add exception handling in trial_keeper.py

* fix port bug

* fix resume

* fix nnictl resume and fix nnictl stop

* fix document

* update

* refactor nnictl

* update

* update doc

* update

* update nnictl

* fix comment

* revert dockerfile

* update

* update

* update

* fix nnictl error hit

* fix comments

* fix bash-completion

* fix paramiko install

* quick fix resume logic

* update

* quick fix nnictl

* PR merge to 0.3 (#297)

* refactor doc

* update with Mao's suggestions

* Set theme jekyll-theme-dinky

* update doc

* fix links

* fix links

* fix links

* merge

* fix links and doc errors

* merge

* merge

* merge

* merge

* Update README.md (#288)

added License badge

* merge

* updated the "Contribute" part (merged Gems' wiki in, updated ReadMe)

* fix link

* fix doc mistakes and broken links. (#271)

* refactor doc

* update with Mao's suggestions

* Set theme jekyll-theme-dinky

* updated the "Contribute" part (merged Gems' wiki in, updated ReadMe)

* fix link

* Update README.md

* Fix misspelling in examples/trials/ga_squad/README.md

* revise the installation cmd to v0.2

* revise to install v0.2

* remove enas readme (#292)

* Fix datastore performance issue (#301)

* Fix nnictl in v0.3 (#299)

Fix old version of config file
fix sklearn requirements
Fix resume log logic

* remove paramiko in V0.3 (#306)

remove paramiko in V0.3

* Release note 0.3 (#303)

* v0.3 release notes

* updates

* updates

* updates

* updates

* updates

* updates

* Inform users to set experiment id when id is empty  (#310)

* fix nnictl bug

* fix install.sh

* add desc for Dockerfile.build.base

* update document for Dockerfile

* update

* refactor port detect

* update

* refactor NNICTLDOC.md

* add document for pai and nnictl

* add default value for port

* add exception handling in trial_keeper.py

* fix port bug

* fix resume

* fix nnictl resume and fix nnictl stop

* fix document

* update

* refactor nnictl

* update

* update doc

* update

* update nnictl

* fix comment

* revert dockerfile

* update

* update

* update

* fix nnictl error hit

* fix comments

* fix bash-completion

* fix paramiko install

* quick fix resume logic

* update

* quick fix nnictl

* fix nnictl crash bug

* add requirement.txt for sklearn example

* fix nnictl configuration bug

* update

* update

* update

* update

* remove paramiko

* refactor nnictl for log stdout

* update

* update

* fix endtime when resume (#307)

* fix endtime when resume

* update

* update

* update

* updates
chicm-ms authored Nov 2, 2018

1 parent bbf4760 commit 06710ab
Showing 8 changed files with 84 additions and 23 deletions.
34 changes: 34 additions & 0 deletions docs/RELEASE.md
@@ -1,3 +1,37 @@
# Release 0.3.0 - 11/2/2018
## Major Features
* Support running multiple experiments simultaneously. You can run multiple experiments by specifying a unique port for each experiment:

```nnictl create --port 8081 --config <config file path>```

You can still run the first experiment without the '--port' parameter:

```nnictl create --config <config file path>```
* A built-in Batch Tuner that iterates over all parameter combinations and can be used to submit batch trial jobs.
* The nni.report_final_result(result) API supports more data types for the result parameter. It can be one of the following types:
    * int
    * float
    * A Python dict containing a 'default' key whose value is of type int or float. The dict can contain any other key-value pairs.
* Continuous Integration
    * Switched to Azure Pipelines
* Others
    * New nni.get_sequence_id() API. Each trial job is allocated a unique sequence number, which can be retrieved with the nni.get_sequence_id() API.
    * Download experiment results from the WebUI
    * Add trial examples using sklearn and NNI together
    * Support updating the max trial number
    * Kaggle competition TGS Salt code as an example
    * NNI Docker image:

```docker pull msranni/nni:latest```
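
As a rough illustration of what "iterates over all parameter combinations" means for the Batch Tuner item above, the sketch below enumerates the Cartesian product of a small search space. This is a generic sketch, not NNI's Batch Tuner implementation; the function name `iter_param_combinations` and the search-space layout (parameter name mapped to a list of candidate values) are assumptions made for the example.

```python
import itertools


def iter_param_combinations(search_space):
    """Yield one dict per combination of the candidate values.

    `search_space` maps a parameter name to a list of candidate values.
    Hypothetical helper for illustration; not part of the NNI API.
    """
    names = sorted(search_space)  # fixed order keeps the output deterministic
    value_lists = [search_space[name] for name in names]
    for values in itertools.product(*value_lists):
        yield dict(zip(names, values))


if __name__ == '__main__':
    space = {'batch_size': [32, 64], 'lr': [0.01, 0.1]}
    for params in iter_param_combinations(space):
        print(params)
```

Each yielded dict would correspond to one trial job in a batch submission, so the number of jobs grows as the product of the list lengths.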

## Breaking changes
* <span style="color:red">The API nni.get_parameters() has been renamed to nni.get_next_parameter(). This is a breaking change: examples from prior releases cannot run on v0.3. Please clone the nni repo to get the new examples.</span>

```git clone -b v0.3 https://github.com/Microsoft/nni.git```
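
To make the rename concrete, the following sketch shows a trial written against the two v0.3 names mentioned in these notes: nni.get_next_parameter() and nni.report_final_result() with a dict containing a 'default' key. The module is passed in as an argument so the sketch runs without NNI installed; `run_trial`, `train_fn`, and the extra 'loss' key are assumptions for illustration, not part of NNI.

```python
def run_trial(nni_module, train_fn):
    """Fetch one parameter set, train, and report a dict result.

    `nni_module` is expected to expose get_next_parameter() and
    report_final_result(), the two v0.3 names from the release notes.
    `train_fn` is a hypothetical callable returning (accuracy, loss).
    """
    params = nni_module.get_next_parameter()  # named get_parameters() before v0.3
    accuracy, loss = train_fn(params)
    # 'default' must be an int or float; any other keys are free-form.
    result = {'default': float(accuracy), 'loss': float(loss)}
    nni_module.report_final_result(result)
    return result
```

In a real trial script you would `import nni` and call `run_trial(nni, train_fn)`; passing the module in also makes the function easy to exercise with a stand-in object.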

## Known issues
[Known Issues in release 0.3.0](https://github.com/Microsoft/nni/labels/nni030knownissues).

# Release 0.2.0 - 9/29/2018
## Major Features
* Support [OpenPAI](https://github.com/Microsoft/pai) (aka pai) Training Service (See [here](./PAIMode.md) for instructions about how to submit NNI job in pai mode)
6 changes: 0 additions & 6 deletions examples/trials/enas/README.md

This file was deleted.

2 changes: 1 addition & 1 deletion examples/trials/sklearn/requirements.txt
@@ -1,4 +1,4 @@
 python3 -m pip install numpy
 sudo apt-get install libblas-dev liblapack-dev libatlas-base-dev gfortran
 sudo python3 -m pip install scipy
-sudo python3 -m pip install sklearn
+sudo python3 -m pip install sklearn
2 changes: 1 addition & 1 deletion src/nni_manager/common/datastore.ts
@@ -66,7 +66,7 @@ interface TrialJobInfo {
     endTime?: number;
     hyperParameters?: string[];
     logPath?: string;
-    finalMetricData?: string;
+    finalMetricData?: MetricDataRecord;
     stderrPath?: string;
 }

22 changes: 14 additions & 8 deletions src/nni_manager/core/nniDataStore.ts
@@ -156,21 +156,23 @@ class NNIDataStore implements DataStore {
     }

     private async queryTrialJobs(status?: TrialJobStatus, trialJobId?: string): Promise<TrialJobInfo[]> {
-        const result: TrialJobInfo[]= [];
+        const result: TrialJobInfo[] = [];
         const trialJobEvents: TrialJobEventRecord[] = await this.db.queryTrialJobEvent(trialJobId);
         if (trialJobEvents === undefined) {
             return result;
         }
         const map: Map<string, TrialJobInfo> = this.getTrialJobsByReplayEvents(trialJobEvents);

-        for (let key of map.keys()) {
-            const jobInfo = map.get(key);
+        const finalMetricsMap: Map<string, MetricDataRecord> = await this.getFinalMetricData(trialJobId);
+
+        for (const key of map.keys()) {
+            const jobInfo: TrialJobInfo | undefined = map.get(key);
             if (jobInfo === undefined) {
                 continue;
             }
             if (!(status !== undefined && jobInfo.status !== status)) {
                 if (jobInfo.status === 'SUCCEEDED') {
-                    jobInfo.finalMetricData = await this.getFinalMetricData(jobInfo.id);
+                    jobInfo.finalMetricData = finalMetricsMap.get(jobInfo.id);
                 }
                 result.push(jobInfo);
             }
@@ -179,16 +181,20 @@ class NNIDataStore implements DataStore {
         return result;
     }

-    private async getFinalMetricData(trialJobId: string): Promise<any> {
+    private async getFinalMetricData(trialJobId?: string): Promise<Map<string, MetricDataRecord>> {
+        const map: Map<string, MetricDataRecord> = new Map();
         const metrics: MetricDataRecord[] = await this.getMetricData(trialJobId, 'FINAL');

         const multiPhase: boolean = await this.isMultiPhase();

-        if (metrics.length > 1 && !multiPhase) {
-            this.log.error(`Found multiple FINAL results for trial job ${trialJobId}`);
+        for (const metric of metrics) {
+            if (map.has(metric.trialJobId) && !multiPhase) {
+                this.log.error(`Found multiple FINAL results for trial job ${trialJobId}`);
+            }
+            map.set(metric.trialJobId, metric);
         }

-        return metrics[metrics.length - 1];
+        return map;
     }

     private async isMultiPhase(): Promise<boolean> {
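
The change to queryTrialJobs above replaces one metric lookup per succeeded trial with a single lookup whose results are indexed by trial job id. A sketch of that indexing step, written in Python for brevity, might look like the following; the function name `index_final_metrics` and the dict-based records are assumptions, and only the 'trialJobId' field name is taken from the diff.

```python
def index_final_metrics(metrics, multi_phase=False):
    """Index FINAL metric records by trial job id.

    Returns (by_job, duplicates): the last record seen for each job, and the
    ids of jobs that had more than one record while multi_phase is False.
    """
    by_job = {}
    duplicates = []
    for metric in metrics:
        job_id = metric['trialJobId']
        if job_id in by_job and not multi_phase:
            duplicates.append(job_id)  # the diff logs an error at this point
        by_job[job_id] = metric  # later records overwrite earlier ones
    return by_job, duplicates
```

With the index built once, attaching a final metric to each job becomes a dictionary lookup rather than a separate query per job, which is presumably where the performance gain comes from.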
5 changes: 5 additions & 0 deletions src/nni_manager/core/nnimanager.ts
@@ -175,6 +175,11 @@ class NNIManager implements Manager {
             .filter((job: TrialJobInfo) => job.status === 'WAITING' || job.status === 'RUNNING')
             .map((job: TrialJobInfo) => this.dataStore.storeTrialJobEvent('FAILED', job.id)));

+        if (this.experimentProfile.execDuration < this.experimentProfile.params.maxExecDuration &&
+            this.currSubmittedTrialNum < this.experimentProfile.params.maxTrialNum &&
+            this.experimentProfile.endTime) {
+            delete this.experimentProfile.endTime;
+        }
         this.status.status = 'EXPERIMENT_RUNNING';

         // TO DO: update database record for resume event
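
The condition added above clears a stale end time when a resumed experiment still has budget left. The same check, sketched in Python with a plain dict standing in for the experiment profile, could look like this; `clear_end_time_on_resume` and the dict layout are assumptions that mirror the field names in the diff.

```python
def clear_end_time_on_resume(profile, submitted_trial_num):
    """Drop profile['endTime'] if the experiment can still make progress.

    Mirrors the three-part condition in the diff: time budget left,
    trial budget left, and an end time currently recorded.
    Returns True when the end time was removed.
    """
    params = profile['params']
    has_time_left = profile['execDuration'] < params['maxExecDuration']
    has_trials_left = submitted_trial_num < params['maxTrialNum']
    if has_time_left and has_trials_left and profile.get('endTime'):
        del profile['endTime']
        return True
    return False
```

If either budget is exhausted, the recorded end time is kept, so a finished experiment that is resumed does not appear to be running again.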
18 changes: 16 additions & 2 deletions src/nni_manager/rest_server/test/mockedNNIManager.ts
@@ -158,14 +158,28 @@ export class MockedNNIManager extends Manager {
             status: 'SUCCEEDED',
             startTime: Date.now(),
             endTime: Date.now(),
-            finalMetricData: 'lr: 0.01, val accuracy: 0.89, batch size: 256'
+            finalMetricData: {
+                timestamp: 0,
+                trialJobId: '3456',
+                parameterId: '123',
+                type: 'FINAL',
+                sequence: 0,
+                data: '0.2'
+            }
         };
         const job2: TrialJobInfo = {
             id: '3456',
             status: 'FAILED',
             startTime: Date.now(),
             endTime: Date.now(),
-            finalMetricData: ''
+            finalMetricData: {
+                timestamp: 0,
+                trialJobId: '3456',
+                parameterId: '123',
+                type: 'FINAL',
+                sequence: 0,
+                data: '0.2'
+            }
         };

         return Promise.resolve([job1, job2]);
18 changes: 13 additions & 5 deletions tools/nnicmd/nnictl_utils.py
@@ -38,7 +38,7 @@ def check_experiment_id(args):
     experiment_dict = experiment_config.get_all_experiments()
     if not experiment_dict:
         print_normal('There is no experiment running...')
-        exit(1)
+        return None
     if not args.id:
         running_experiment_list = []
         for key in experiment_dict.keys():
@@ -58,14 +58,14 @@ def check_experiment_id(args):
             exit(1)
         elif not running_experiment_list:
             print_error('There is no experiment running!')
-            exit(1)
+            return None
         else:
             return running_experiment_list[0]
     if experiment_dict.get(args.id):
         return args.id
     else:
         print_error('Id not correct!')
-        exit(1)
+        return None

 def parse_ids(args):
     '''Parse the arguments for nnictl stop
@@ -116,20 +116,28 @@ def parse_ids(args):
         if len(result_list) > 1:
             print_error(args.id + ' is ambiguous, please choose ' + ' '.join(result_list) )
             return None
-    if not result_list:
-        print_error('There are no experiments matched, please check experiment id...')
+    if not result_list and args.id:
+        print_error('There are no experiments matched, please set correct experiment id...')
+    elif not result_list:
+        print_error('There is no experiment running...')
     return result_list

 def get_config_filename(args):
     '''get the file name of config file'''
     experiment_id = check_experiment_id(args)
+    if experiment_id is None:
+        print_error('Please set the experiment id!')
+        exit(1)
     experiment_config = Experiments()
     experiment_dict = experiment_config.get_all_experiments()
     return experiment_dict[experiment_id]['fileName']

 def get_experiment_port(args):
     '''get the port of experiment'''
     experiment_id = check_experiment_id(args)
+    if experiment_id is None:
+        print_error('Please set the experiment id!')
+        exit(1)
     experiment_config = Experiments()
     experiment_dict = experiment_config.get_all_experiments()
     return experiment_dict[experiment_id]['port']
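
The pattern in this file's changes is that the lookup helper returns None instead of exiting, so each caller decides how to react. A self-contained sketch of that split, using a plain dict in place of the Experiments store, might look like this; `resolve_experiment_id`, `require_experiment_port`, and the 'status' key are assumptions for illustration, not the nnictl code, and the sketch simplifies the multiple-running-experiments case to a None return.

```python
import sys


def resolve_experiment_id(experiments, requested_id=None):
    """Return a usable experiment id, or None if it cannot be determined."""
    if not experiments:
        return None
    if requested_id:
        return requested_id if requested_id in experiments else None
    # Assumed layout: each entry carries a 'status' field.
    running = [key for key, info in experiments.items()
               if info.get('status') != 'STOPPED']
    # Only an unambiguous single running experiment can be picked implicitly.
    return running[0] if len(running) == 1 else None


def require_experiment_port(experiments, requested_id=None):
    """Caller-side policy: print a hint and exit when no id is resolved."""
    experiment_id = resolve_experiment_id(experiments, requested_id)
    if experiment_id is None:
        print('Please set the experiment id!', file=sys.stderr)
        sys.exit(1)
    return experiments[experiment_id]['port']
```

Keeping the exit in the caller means other commands can reuse the resolver and handle a missing id differently, for example by printing a softer message.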
