
Chec dev doc #606

Merged
merged 7 commits into from Jan 17, 2019
1 change: 1 addition & 0 deletions docs/GetStarted.md
@@ -97,3 +97,4 @@ The experiment has been running now, NNI provides WebUI for you to view experime
* [How to run an experiment on local (with multiple GPUs)?](tutorial_1_CR_exp_local_api.md)
* [How to run an experiment on multiple machines?](tutorial_2_RemoteMachineMode.md)
* [How to run an experiment on OpenPAI?](PAIMode.md)
* [How to create a multi-phase experiment?](multiPhase.md)
43 changes: 43 additions & 0 deletions docs/multiPhase.md
@@ -0,0 +1,43 @@
## Create a multi-phase experiment

Typically, each trial job gets a single set of configuration (e.g. hyperparameters) from the tuner, runs some kind of experiment, say, training a model with those hyperparameters, and reports the result back to the tuner. Sometimes you may want to train multiple models within one trial job, either to share information between models or to save system resources by creating fewer trial jobs. For example:
1. Train multiple models sequentially in one trial job, so that later models can leverage the weights or other information of prior models, and may use different hyperparameters.
2. Train a large number of models on limited system resources; combining multiple models into one trial job saves the resources that creating a large number of trial jobs would require.
3. Any other scenario in which you would like to train multiple models with different hyperparameters in one trial job. Be aware that if you allocate multiple GPUs to a trial job and train multiple models concurrently within it, your trial code needs to allocate GPU resources among the models properly.

In the above cases, you can leverage NNI's multi-phase experiment support to train multiple models with different hyperparameters within each trial job.

Multi-phase experiments are experiments whose trial jobs request multiple sets of hyperparameters from the tuner and report multiple final results to NNI.

To run a multi-phase experiment, follow the steps below:

1. Implement `nni.multi_phase.MultiPhaseTuner`. For example, this [ENAS tuner](https://github.com/countif/enas_nni/blob/master/nni/examples/tuners/enas/nni_controller_ptb.py) is a multi-phase tuner that implements `nni.multi_phase.MultiPhaseTuner`. While implementing your `MultiPhaseTuner`, you may want to use the `trial_job_id` parameter of the `generate_parameters` method to generate hyperparameters for each trial job.
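
A minimal sketch of such a tuner is shown below. The class name and the per-job bookkeeping are hypothetical, and the method signatures are assumptions inferred from the `trial_job_id` parameter described above, not code from this PR:

```python
from nni.multi_phase import MultiPhaseTuner  # import path as named above; may vary by NNI version

class SketchMultiPhaseTuner(MultiPhaseTuner):
    """Hypothetical tuner that varies hyperparameters per phase of each trial job."""

    def __init__(self):
        super().__init__()
        # hypothetical bookkeeping: how many parameter sets each trial job has requested
        self.phase_count = {}

    def generate_parameters(self, parameter_id, trial_job_id=None):
        # trial_job_id identifies the requesting trial job, so later phases of
        # the same job can be given different hyperparameters
        phase = self.phase_count.get(trial_job_id, 0)
        self.phase_count[trial_job_id] = phase + 1
        return {'learning_rate': 0.1 / (phase + 1)}

    def receive_trial_result(self, parameter_id, parameters, value, trial_job_id=None):
        # results arrive tagged with the trial job that produced them
        pass

    def update_search_space(self, search_space):
        pass
```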

2. Set the `multiPhase` field to `true`, and configure the tuner implemented in step 1 as a customized tuner in the experiment configuration file, for example:

```yml
...
multiPhase: true
tuner:
  codeDir: tuners/enas
  classFileName: nni_controller_ptb.py
  className: ENASTuner
  classArgs:
    say_hello: "hello"
...
```


3. Invoke the `nni.get_next_parameter()` API multiple times in a trial as needed, for example:

```python
import nni

for i in range(5):
    # get a set of hyperparameters from the tuner
    tuner_param = nni.get_next_parameter()

    # consume the params, e.g. train and evaluate a model
    # ...
    # report the final result (e.g. accuracy) computed with the
    # parameters retrieved above
    nni.report_final_result(result)
    # ...
```
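
Note that calls to `nni.get_next_parameter()` and `nni.report_final_result()` are meant to come in pairs, presumably so that each reported result is associated with the most recently retrieved set of parameters.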
2 changes: 2 additions & 0 deletions mkdocs.yml
@@ -13,6 +13,8 @@ nav:
    - Remote: RemoteMachineMode.md
    - PAI: PAIMode.md
    - Kubeflow: KubeflowMode.md
  - Advanced Features:
    - multiPhase.md
  - Examples:
    - MNIST Examples: mnist_examples.md
  - Reference: