From 1b40fda6b59bdcbd2f02d0d14abfa66f4f98d4de Mon Sep 17 00:00:00 2001 From: quzha Date: Tue, 2 Jul 2019 19:34:19 +0800 Subject: [PATCH 1/6] update nas doc --- docs/en_US/GeneralNasInterfaces.md | 44 ++++++++++++++++++++++-------- 1 file changed, 32 insertions(+), 12 deletions(-) diff --git a/docs/en_US/GeneralNasInterfaces.md b/docs/en_US/GeneralNasInterfaces.md index 76600652f7..3d80959507 100644 --- a/docs/en_US/GeneralNasInterfaces.md +++ b/docs/en_US/GeneralNasInterfaces.md @@ -6,6 +6,8 @@ Automatic neural architecture search is taking an increasingly important role on To facilitate NAS innovations (e.g., design/implement new NAS models, compare different NAS models side-by-side), an easy-to-use and flexible programming interface is crucial. + + ## Programming interface A new programming interface for designing and searching for a model is often demanded in two scenarios. 1) When designing a neural network, the designer may have multiple choices for a layer, sub-model, or connection, and not sure which one or a combination performs the best. It would be appealing to have an easy way to express the candidate layers/sub-models they want to try. 2) For the researchers who are working on automatic NAS, they want to have an unified way to express the search space of neural architectures. And making unchanged trial code adapted to different searching algorithms. @@ -83,6 +85,36 @@ Accordingly, a specified neural architecture (generated by tuning algorithm) is With the specification of the format of search space and architecture (choice) expression, users are free to implement various (general) tuning algorithms for neural architecture search on NNI. One future work is to provide a general NAS algorithm. +## Support of One-Shot NAS + +One-Shot NAS is a popular approach to find good neural architecture within a limited time and resource budget. Basically, it builds a full graph based on the search space, and uses gradient descent to at last find the best subgraph. There are different training approaches, such as [training subgraphs (per mini-batch)][1], [training full graph through dropout][6], [training with architecture weights (regularization)][3]. + +NNI has supported the general NAS as demonstrated above. From users' point of view, One-Shot NAS and NAS have the same search space specification, thus, they could share the same programming interface as demonstrated above, just different training modes. NNI provides four training modes: + +**classic_mode**: this mode is described [above](#ProgInterface), in this mode, each subgraph runs as a subgraph. +**enas_mode**: (currently only supported on tensorflow) +**oneshot_mode**: (currently only supported on tensorflow) +**darts_mode**: (not supported yet) + +### enas_mode + +TODO + +With the same annotated trial code, users could choose One-Shot NAS as execution mode on NNI. Specifically, the compiled trial code builds the full graph (rather than subgraph demonstrated above), it receives a chosen architecture and training this architecture on the full graph for a mini-batch, then request another chosen architecture. It is supported by [NNI multi-phase](./multiPhase.md). We support this training approach because training a subgraph is very fast, building the graph every time training a subgraph induces too much overhead. + +![](../img/one-shot_training.png) + +The design of One-Shot NAS on NNI is shown in the above figure. One-Shot NAS usually only has one trial job with full graph. 
NNI supports running multiple such trial jobs each of which runs independently. As One-Shot NAS is not stable, running multiple instances helps find better model. Moreover, trial jobs are also able to synchronize weights during running (i.e., there is only one copy of weights, like asynchronous parameter-server mode). This may speedup converge. + +### oneshot_mode + +TODO + +### darts_mode + +TODO + + ============================================================= ## Neural architecture search on NNI @@ -107,18 +139,6 @@ We believe weight sharing (transferring) plays a key role on speeding up NAS, wh Example of weight sharing on NNI. -### [__TODO__] Support of One-Shot NAS - -One-Shot NAS is a popular approach to find good neural architecture within a limited time and resource budget. Basically, it builds a full graph based on the search space, and uses gradient descent to at last find the best subgraph. There are different training approaches, such as [training subgraphs (per mini-batch)][1], [training full graph through dropout][6], [training with architecture weights (regularization)][3]. Here we focus on the first approach, i.e., training subgraphs (ENAS). - -With the same annotated trial code, users could choose One-Shot NAS as execution mode on NNI. Specifically, the compiled trial code builds the full graph (rather than subgraph demonstrated above), it receives a chosen architecture and training this architecture on the full graph for a mini-batch, then request another chosen architecture. It is supported by [NNI multi-phase](./multiPhase.md). We support this training approach because training a subgraph is very fast, building the graph every time training a subgraph induces too much overhead. - -![](../img/one-shot_training.png) - -The design of One-Shot NAS on NNI is shown in the above figure. One-Shot NAS usually only has one trial job with full graph. NNI supports running multiple such trial jobs each of which runs independently. As One-Shot NAS is not stable, running multiple instances helps find better model. Moreover, trial jobs are also able to synchronize weights during running (i.e., there is only one copy of weights, like asynchronous parameter-server mode). This may speedup converge. - -Example of One-Shot NAS on NNI. - ## [__TODO__] General tuning algorithms for NAS From 20931ad05eb3d747ac21a5e2701ba653a4a966b7 Mon Sep 17 00:00:00 2001 From: quanlu Date: Thu, 4 Jul 2019 14:09:42 +0800 Subject: [PATCH 2/6] update doc of one-shot --- docs/en_US/GeneralNasInterfaces.md | 86 ++++++++++++++++++++++++----- docs/img/darts_mode.png | Bin 0 -> 13857 bytes docs/img/oneshot_mode.png | Bin 0 -> 14647 bytes 3 files changed, 73 insertions(+), 13 deletions(-) create mode 100644 docs/img/darts_mode.png create mode 100644 docs/img/oneshot_mode.png diff --git a/docs/en_US/GeneralNasInterfaces.md b/docs/en_US/GeneralNasInterfaces.md index 3d80959507..1cd3eb5ca9 100644 --- a/docs/en_US/GeneralNasInterfaces.md +++ b/docs/en_US/GeneralNasInterfaces.md @@ -91,33 +91,93 @@ One-Shot NAS is a popular approach to find good neural architecture within a lim NNI has supported the general NAS as demonstrated above. From users' point of view, One-Shot NAS and NAS have the same search space specification, thus, they could share the same programming interface as demonstrated above, just different training modes. NNI provides four training modes: -**classic_mode**: this mode is described [above](#ProgInterface), in this mode, each subgraph runs as a subgraph. 
-**enas_mode**: (currently only supported on tensorflow) -**oneshot_mode**: (currently only supported on tensorflow) -**darts_mode**: (not supported yet) +**classic_mode**: this mode is described [above](#ProgInterface), in this mode, each subgraph runs as a trial job. To use this mode, you should enable NNI annotation and specify a tuner for nas in experiment config file. [Here](https://github.com/microsoft/nni/tree/master/examples/trials/mnist-nas) is an example to show how to write trial code and the config file. And [here](https://github.com/microsoft/nni/tree/master/examples/tuners/random_nas_tuner) is a simple tuner for nas. + +**enas_mode**: following the training approach in [enas paper][1]. It builds the full graph based on neural architrecture search space, and only activate one subgraph that generated by the controller for each mini-batch. [Detailed Description](#ENASMode). (currently only supported on tensorflow). + +To use enas_mode, you should add one more field in the `trial` config as shown below. +```diff +trial: + command: your command to run the trial + codeDir: the directory where the trial's code is located + gpuNum: the number of GPUs that one trial job needs ++ #choice: classic_mode, enas_mode, oneshot_mode ++ nasMode: enas_mode +``` +Similar to classic_mode, in enas_mode you need to specify a tuner for nas, as it also needs to receive subgraphs from tuner (or controller using the terminology in the paper). Since this trial job needs to receive multiple subgraphs from tuner, each one for a mini-batch, two lines need to be added to the trial code to receive the next subgraph and report the report the result of the current subgraph. Below is an example: +```python +for _ in range(num): + """@nni.get_next_parameter(self.session)""" + loss, _ = self.session.run([loss_op, train_op]) + """@nni.report_final_result(loss)""" +``` +Here, `get_next_parameter` needs an arg which is the session variable that you use to train your model. + +**oneshot_mode**: following the training approach in [this paper][6]. Different from enas_mode which trains the full graph by training large numbers of subgraphs, in oneshot_mode the full graph is built and dropout is added to candidate inputs and also added to candidate ops' outputs. Then this full graph is trained like other DL models. [Detailed Description](#OneshotMode). (currently only supported on tensorflow). + +To use oneshot_mode, you should add one more field in the `trial` config as shown below. In this mode, no need to specify tuner in the config file as it does not need tuner. (Note that you still need to specify a tuner (any tuner) in the config file in v0.9, not anymore in future releases.) +```diff +trial: + command: your command to run the trial + codeDir: the directory where the trial's code is located + gpuNum: the number of GPUs that one trial job needs ++ #choice: classic_mode, enas_mode, oneshot_mode ++ nasMode: oneshot_mode +``` -### enas_mode +**darts_mode**: following the training approach in [this paper][3]. It is similar to oneshot_mode. There are two differences, one is that darts_mode only add architecture weights to the outputs of candidate ops, the other is that it trains model weights and architecture weights in an interleaved manner. [Detailed Description](#DartsMode). (not supported yet). + +To use darts_mode, you should add one more field in the `trial` config as shown below. In this mode, also no need to specify tuner in the config file as it does not need tuner. 
+```diff +trial: + command: your command to run the trial + codeDir: the directory where the trial's code is located + gpuNum: the number of GPUs that one trial job needs ++ #choice: classic_mode, enas_mode, oneshot_mode ++ nasMode: darts_mode +``` -TODO +**Note:** for enas_mode, oneshot_mode, and darts_mode, NNI only works on the training phase. They also have their own inference phase which is not handled by NNI. For enas_mode, the inference phase is to generate new subgraphs through the controller. For oneshot_mode, the inference phase is sampling new subgraphs randomly and choosing good ones. For darts_mode, the inference phase is pruning a proportion of candidates ops based on architecture weights. -With the same annotated trial code, users could choose One-Shot NAS as execution mode on NNI. Specifically, the compiled trial code builds the full graph (rather than subgraph demonstrated above), it receives a chosen architecture and training this architecture on the full graph for a mini-batch, then request another chosen architecture. It is supported by [NNI multi-phase](./multiPhase.md). We support this training approach because training a subgraph is very fast, building the graph every time training a subgraph induces too much overhead. + -![](../img/one-shot_training.png) +### enas_mode + +In enas_mode, the compiled trial code builds the full graph (rather than subgraph), it receives a chosen architecture and training this architecture on the full graph for a mini-batch, then request another chosen architecture. It is supported by [NNI multi-phase](./multiPhase.md). + +Specifically, for trials using tensorflow, we create and use tensorflow variable as signals, and tensorflow conditional functions to control the search space (full-graph) to be more flexible, which means it can be changed into different sub-graphs (multiple times) depending on these signals. -The design of One-Shot NAS on NNI is shown in the above figure. One-Shot NAS usually only has one trial job with full graph. NNI supports running multiple such trial jobs each of which runs independently. As One-Shot NAS is not stable, running multiple instances helps find better model. Moreover, trial jobs are also able to synchronize weights during running (i.e., there is only one copy of weights, like asynchronous parameter-server mode). This may speedup converge. + ### oneshot_mode -TODO +Below is the figure to show where dropout is added to the full graph for one layer in `nni.mutable_layers`, input 1-k are candidate inputs, the four ops are candidate ops. + +![](../img/oneshot_mode.png) + +As suggested in the [paper][6], a dropout method is implemented to the inputs for every layer. The dropout rate is set to r^(1/k), where 0 < r < 1 is a hyper-parameter of the model (default to be 0.01) and k is number of optional inputs for a specific layer. The higher the fan-in, the more likely each possible input is to be dropped out. However, the probability of dropping out all optional_inputs of a layer is kept constant regardless of its fan-in. Suppose r = 0.05. If a layer has k = 2 optional_inputs then each one will independently be dropped out with probability 0.051/2 ≈ 0.22 and will be retained with probability 0.78. If a layer has k = 7 optional_inputs then each one will independently be dropped out with probability 0.051/7 ≈ 0.65 and will be retained with probability 0.35. In both cases, the probability of dropping out all of the layer's optional_inputs is 5%. The outputs of candidate ops are dropped out through the same way. 
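To make the numbers above easier to check (the exponents are meant to be read as 0.05^(1/2) and 0.05^(1/7)), the small snippet below simply evaluates the per-input dropout probability r^(1/k). It is only an illustration of the formula and does not depend on NNI.

```python
import math

def drop_prob(r, k):
    """Per-input dropout probability r^(1/k) for a layer with k optional inputs."""
    return r ** (1.0 / k)

r = 0.05  # probability that all optional inputs of a layer are dropped together
for k in (2, 7):
    p = drop_prob(r, k)
    # k=2 -> p ~ 0.22 (kept with ~0.78); k=7 -> p ~ 0.65 (kept with ~0.35)
    print(k, round(p, 2), round(1 - p, 2), round(p ** k, 2))  # p ** k equals r for every k
```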
+
### darts_mode

Below is the figure to show where architecture weights are added to the full graph for one layer in `nni.mutable_layers`, output of each candidate op is multiplied by a weight which is called architecture weight.

![](../img/darts_mode.png)

More detailed description after this mode is supported.

### Multiple trial jobs for one-shot NAS

One-Shot NAS usually has only one trial job with the full graph. However, running multiple such trial jobs leads to benefits. For example, in enas_mode multiple trial jobs could share the weights of the full graph to speedup the model training (or converge). Some one-shot approaches are not stable, running multiple trial jobs increase the possibility of finding better models.

NNI natively supports running multiple such trial jobs. The figure below shows how multiple trial jobs run on NNI.

![](../img/one-shot_training.png)

=============================================================

-## Neural architecture search on NNI
+## System design of NAS on NNI

### Basic flow of experiment execution

@@ -127,7 +187,7 @@ NNI's annotation compiler transforms the annotated trial code to the code that c

The above figure shows how the trial code runs on NNI. `nnictl` processes user trial code to generate a search space file and compiled trial code. The former is fed to tuner, and the latter is used to run trials.

-[Simple example of NAS on NNI](https://github.com/microsoft/nni/tree/v0.8/examples/trials/mnist-nas).
+[Simple example of NAS on NNI](https://github.com/microsoft/nni/tree/master/examples/trials/mnist-nas).

### [__TODO__] Weight sharing

diff --git a/docs/img/darts_mode.png b/docs/img/darts_mode.png
new file mode 100644
index 0000000000000000000000000000000000000000..4917c4afbc642af9e8b62017cf8a3a28147c179b
GIT binary patch
literal 13857
(binary data for the new image files docs/img/darts_mode.png and docs/img/oneshot_mode.png omitted)
zIV`N(B1~`_kK#Wm$cVIPcg}nzWbvV_ z2;dBT4Q?i+5o$-uD~65p@qt0l@R?Ra9CxL!Kf7S6^uEb4NqgFRYFuM;fxnzV8ddwq z1;VsVK7vCJ!|$|V*Ee4;#DSnk!MlSq_C|p|W9&*GQtO2H-uRpe27pndorM!lZ~7TJ z5DsD5mgC-d8?${-3KlR&F(1WMt;(t`a9iJtIo{E^BX(b8QLTGa&>Te&XDvPXoHALN z=DW5Om9G8dL8%-nz5>lg>no;984CYNt9iR6u(DAjdl9la==H+x56LCr)?fwX_u%@G zb;m2ehquLhOjxO;i}acIzN;Ddg!!sT{%mTJF?e*1-5$9$E_?$(lAK(4ak_#}Z;sF4 zbtAp_MQ<=-5f}v1^#<2Xrc!n3(=tm!Ki7uW%g+-bp^&kp!ESYKvSwhL4e{P^*jMuS zvcP0~hd5%>*_m;wAA2vIJU(5$5u4m1k6~6Ky^_hQl$tlFM77{}CdS50VHYY^`)R`t z*_dV3oAS!Pp^?Y=<(Ofmod9nNz<-%vAvXt`ekyt|{AECq`!d&~JXo5M$kkkSxAa&$ zb9Qk1!!yy>k4=2>1Q6r6VE}(}b1(=F;A_2J$_ke_V4zoI8DpgR`pojimMD00i*xN2 z@{SQD59zh8(UV~wXW1q4b2CF&1t zP8+wrl>L^Zk_J~p3Hcy!x=wcLdt+vq;|5x4hfY$r>4X^ABS{pw(5^C2RPakb*t3Cd z*6Oa=AI~0%D#X^eoKbL>p+_4v58BgN-v(rKCpSPH#QYzR?9%W-zEzZN_7Ak*TQLEX z+ZIho9KiWjE!c8%hzIJ@G;+Qp>aVLQM^%R3u}fIo1(`DD7dSddYz%px^qE%~uACUu zEQ(N(+){4hTILrSwbi~+;Vv62jgHs!SFc_gX$bpemt61H?p>F9m4UyoCWileb;#lj zg{?V~ZOGh;=Ew6+8B1MZ)fee>BBP#*gMyyjq~F2)R9T80O;jppj=X29wwL0?ldP{~v~Va2R{yGAD4`>NtrLp) zAinl%_s)dTX&NRm#lvT3RRuhA%eP4GH=OLMr!HVQZzZXi`E8GueZ$y1m zyAtvCOG0FEXI4n0!A#5I8R-wc94L}+6X4jb5%#lRUe*haGa-|VBh2$M&=R+CK6#M} z!7zZ{NG;b?1U3qYd2bZwDQ^PQ3PP{kDNul%4la8fYy7-a`1IfUj7+iTgKNjCp|LmKdS~w{@j==hiM*OfujyW=v{^Xd$!%b;?lnfhkvMf7JD^U zE?6_aZ!M((BpaKT|AA!lKOp2ew#CZ7u%fD=S~vWk@__!kf#mxP{h=H&%h;F#)FWZR z|D9F%KN{yhDTse7L%T9^gj$RzuKLf~^_<)LoPdr&;jF83JFuqv_+q_ao(^ndj#S5{ zWJ2BSqhu-9s!?5ccOsLmUWm#K@8-FWC33N^kZI)R#HstNxFfX4{J*04CXk)2aAzZb@B>4FWOh{GAlN40Mw$vvf@s4hW_8hen@sg0wcGijK_E!Gt8< z3!L*(0+OTro*ED|>Q;*qF27bf))tv&?|##}d+kGt>Dt`6LZX!0_See%9TWXBwBO{U zEHf`hChj}J)k4Z=7ok2we9$?7AlrQ;T_I{vI9(NWW1yRu#~FAjs*vqfq^PC}@EU`k zE}gx!e7n^5C4u8Un^iQUfTgGc5QL8mjh&0plurQ_IZ)RPtCDwwuuugqY3aptA`xCV zzCG>chY;R_JxdBOX;vChR8$1b&7N`^l;PrVF*@~4N^oEDb_6k5=ZaaFf*JgsV{`vp zL)~9VP{4f*)pGbNvM8LOCT(hyLx7-v#eg9Dj1|mZ#_DiZEa7X zci#TteA9ua>f2H14nXSNd>7R39Ux65pi{W5Pn~#qPUq)mZXgvgLvWiS;`{H+Z&J^G zCOl@Y&mEYA0^In{Uh7q{=Zi0>}>*|P5||KC&6N2YfB#G-7e09lOieewG*o)79c zfQ|iEcis88hB@EW3I6C0$BMe_eu!yOdL`+3)2Ex4C-xKU&#E*aa{`5DH=+hRRI;86 z*}?IA_q7*NJ~{M^@g9Y+MERaz&e5^U1O_o9ZSy{t9G%9nQ=dQjX-&$l82Dre&~4##o!SPnDaV(@cI1o zxw3qwun{}J$@!c`F6kx=gX5>?;QeOblcztpo^8&U;qkhsUD&C|QcUX6n0cW_aM4wf z;?6g1ClA{G62vx>PNPzNum@qyCGs}t68|dULW`*0o0wXzg=G(q#3bvtsnIk#PN1-% z#CZb8k1F>ZBLGU45I{SLTzdU;X~VQ`)0*aUx%`e=?@BhgIgF(b`ciU;UjEzFGRJ682ac~45L z8~b#&D>;j3^1!&K%a8geYA{KO46KnbyR9}_`%(%x{wHfGvO@Z}Ah_X99^u%}Db5FM zDlsWu+qm!>GS58%|%M$%xzfX^BAG%caiCd z-FNr2RX>Z!vb`<51L2Bt-r=$gs~n-wI5Q}G>|uTjAVk$KRRm~MPycGYfg0@iUWJt* zdC_zK-cBvb%wIo}p1SW~L%&u4U6h=Lhw7qLy%ABlf%&XE zBHG@=(e3kB8bNG{)3>vet`Et-7Utv-z3S*#UZom5hE;wkEYiMNbq4%cVS}$4VnR~B z81y3`eby+SDoA0zJeMLDy^2#V_o2RhsLaymSl^?kRG*wk>d3tlq1&4KuJdsCl`^LY zM?W+s7|3OOM4ZT=&wfc6CKAkI6~K?bBZ_0kx?aja}Ac|M#ULOSPcV^+(r zQDKLx9y}`8qund_(bl=hx_}1Rq%x)9+=fL2iH_*3AcQx7GW` zdGnT#OyFbo80~;^to8!Gl|kSbn40l(+^i0IrJ!6*mgc%P_q2y|S7_N<{20^7=hfIw z=qUyohK^Xi3(&(XX;Jn>pW+PFWj~d(%9wC(^ZJ$JCzHmDy^7K!_=(kTVkU0_>HG_} z*%CiULRT#Q^REEkT>NNo f5SJx=dY&ZVrni^v?F#(Z@sg6f+LJ Date: Wed, 31 Jul 2019 14:12:19 +0800 Subject: [PATCH 3/6] update darts_mode --- docs/en_US/GeneralNasInterfaces.md | 40 ++++++++++++++++++------------ 1 file changed, 24 insertions(+), 16 deletions(-) diff --git a/docs/en_US/GeneralNasInterfaces.md b/docs/en_US/GeneralNasInterfaces.md index 1cd3eb5ca9..4bbe4b24bd 100644 --- a/docs/en_US/GeneralNasInterfaces.md +++ b/docs/en_US/GeneralNasInterfaces.md @@ -1,6 +1,6 @@ # General Programming Interface for Neural Architecture Search (experimental feature) -_*This is an experimental feature, currently, we only implemented the general NAS 
programming interface. Weight sharing and one-shot NAS based on this programming interface will be supported in the following releases._ +_*This is an experimental feature, currently, we only implemented the general NAS programming interface. Weight sharing will be supported in the following releases._ Automatic neural architecture search is taking an increasingly important role on finding better models. Recent research works have proved the feasibility of automatic NAS, and also found some models that could beat manually designed and tuned models. Some of representative works are [NASNet][2], [ENAS][1], [DARTS][3], [Network Morphism][4], and [Evolution][5]. There are new innovations keeping emerging. However, it takes great efforts to implement those algorithms, and it is hard to reuse code base of one algorithm for implementing another. @@ -104,18 +104,20 @@ trial: + #choice: classic_mode, enas_mode, oneshot_mode + nasMode: enas_mode ``` -Similar to classic_mode, in enas_mode you need to specify a tuner for nas, as it also needs to receive subgraphs from tuner (or controller using the terminology in the paper). Since this trial job needs to receive multiple subgraphs from tuner, each one for a mini-batch, two lines need to be added to the trial code to receive the next subgraph and report the report the result of the current subgraph. Below is an example: +Similar to classic_mode, in enas_mode you need to specify a tuner for nas, as it also needs to receive subgraphs from tuner (or controller using the terminology in the paper). Since this trial job needs to receive multiple subgraphs from tuner, each one for a mini-batch, two lines need to be added to the trial code to receive the next subgraph (i.e., `nni.training_update`) and report the report the result of the current subgraph. Below is an example: ```python for _ in range(num): - """@nni.get_next_parameter(self.session)""" + # here receives and enables a new subgraph + """@nni.training_update(tf=tf, session=self.session)""" loss, _ = self.session.run([loss_op, train_op]) + # report the loss of this mini-batch """@nni.report_final_result(loss)""" ``` -Here, `get_next_parameter` needs an arg which is the session variable that you use to train your model. +Here, `nni.training_update` is to do some update on the full graph. In enas_mode, the update means receiving a subgraph and enabling it on the next mini-batch. While in darts_mode, the update means training the architecture weights (details in darts_mode). In enas_mode, you need to pass the imported tensorflow package to `tf` and the session to `session`. **oneshot_mode**: following the training approach in [this paper][6]. Different from enas_mode which trains the full graph by training large numbers of subgraphs, in oneshot_mode the full graph is built and dropout is added to candidate inputs and also added to candidate ops' outputs. Then this full graph is trained like other DL models. [Detailed Description](#OneshotMode). (currently only supported on tensorflow). -To use oneshot_mode, you should add one more field in the `trial` config as shown below. In this mode, no need to specify tuner in the config file as it does not need tuner. (Note that you still need to specify a tuner (any tuner) in the config file in v0.9, not anymore in future releases.) +To use oneshot_mode, you should add one more field in the `trial` config as shown below. In this mode, no need to specify tuner in the config file as it does not need tuner. 
(Note that you still need to specify a tuner (any tuner) in the config file in v0.9/v1.0, not anymore in future releases.) Also, no need to add `nni.training_update` in this mode, because no special processing (or update) is needed during training. ```diff trial: command: your command to run the trial @@ -125,9 +127,9 @@ trial: + nasMode: oneshot_mode ``` -**darts_mode**: following the training approach in [this paper][3]. It is similar to oneshot_mode. There are two differences, one is that darts_mode only add architecture weights to the outputs of candidate ops, the other is that it trains model weights and architecture weights in an interleaved manner. [Detailed Description](#DartsMode). (not supported yet). +**darts_mode**: following the training approach in [this paper][3]. It is similar to oneshot_mode. There are two differences, one is that darts_mode only add architecture weights to the outputs of candidate ops, the other is that it trains model weights and architecture weights in an interleaved manner. [Detailed Description](#DartsMode). -To use darts_mode, you should add one more field in the `trial` config as shown below. In this mode, also no need to specify tuner in the config file as it does not need tuner. +To use darts_mode, you should add one more field in the `trial` config as shown below. In this mode, also no need to specify tuner in the config file as it does not need tuner. (Note that you still need to specify a tuner (any tuner) in the config file in v0.9/v1.0, not anymore in future releases.) ```diff trial: command: your command to run the trial @@ -137,6 +139,14 @@ trial: + nasMode: darts_mode ``` +When using darts_mode, you need to call `nni.training_update` as shown below when architecture weights should be updated. Updating architecture weights need `loss` for updating the weights as well as the training data (i.e., `feed_dict`) for it. +```python +for _ in range(num): + # here trains the architecture weights + """@nni.training_update(tf=tf, session=self.session, loss=loss, feed_dict=feed_dict)""" + loss, _ = self.session.run([loss_op, train_op]) +``` + **Note:** for enas_mode, oneshot_mode, and darts_mode, NNI only works on the training phase. They also have their own inference phase which is not handled by NNI. For enas_mode, the inference phase is to generate new subgraphs through the controller. For oneshot_mode, the inference phase is sampling new subgraphs randomly and choosing good ones. For darts_mode, the inference phase is pruning a proportion of candidates ops based on architecture weights. @@ -145,7 +155,7 @@ trial: In enas_mode, the compiled trial code builds the full graph (rather than subgraph), it receives a chosen architecture and training this architecture on the full graph for a mini-batch, then request another chosen architecture. It is supported by [NNI multi-phase](./multiPhase.md). -Specifically, for trials using tensorflow, we create and use tensorflow variable as signals, and tensorflow conditional functions to control the search space (full-graph) to be more flexible, which means it can be changed into different sub-graphs (multiple times) depending on these signals. +Specifically, for trials using tensorflow, we create and use tensorflow variable as signals, and tensorflow conditional functions to control the search space (full-graph) to be more flexible, which means it can be changed into different sub-graphs (multiple times) depending on these signals. [Here]() is an example for enas_mode. 
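As a rough illustration of this signal mechanism (a hand-written sketch assuming TensorFlow 1.x, with made-up layer and op names, not the code that NNI's annotation compiler actually generates), a scalar variable can select which candidate op is active, and a conditional op routes each mini-batch through that branch of the full graph:

```python
import tensorflow as tf

# Signal variable: index of the candidate op currently enabled for this layer.
layer_1_choice = tf.get_variable("layer_1_choice", initializer=0, trainable=False)

def conv3x3(x):
    return tf.layers.conv2d(x, filters=128, kernel_size=3, padding="same")

def conv5x5(x):
    return tf.layers.conv2d(x, filters=128, kernel_size=5, padding="same")

def mutable_layer(x):
    # Both candidate ops (and their weights) live in the full graph; only the
    # branch selected by the signal variable runs for the current mini-batch.
    return tf.cond(tf.equal(layer_1_choice, 0),
                   lambda: conv3x3(x),
                   lambda: conv5x5(x))

# Switching to another subgraph before the next mini-batch boils down to
# something like: session.run(layer_1_choice.assign(1))
```

Conceptually, receiving a new subgraph through `nni.training_update` then amounts to assigning new values to such signal variables before the next mini-batch, so the full graph never has to be rebuilt.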
@@ -155,7 +165,7 @@ Below is the figure to show where dropout is added to the full graph for one lay ![](../img/oneshot_mode.png) -As suggested in the [paper][6], a dropout method is implemented to the inputs for every layer. The dropout rate is set to r^(1/k), where 0 < r < 1 is a hyper-parameter of the model (default to be 0.01) and k is number of optional inputs for a specific layer. The higher the fan-in, the more likely each possible input is to be dropped out. However, the probability of dropping out all optional_inputs of a layer is kept constant regardless of its fan-in. Suppose r = 0.05. If a layer has k = 2 optional_inputs then each one will independently be dropped out with probability 0.051/2 ≈ 0.22 and will be retained with probability 0.78. If a layer has k = 7 optional_inputs then each one will independently be dropped out with probability 0.051/7 ≈ 0.65 and will be retained with probability 0.35. In both cases, the probability of dropping out all of the layer's optional_inputs is 5%. The outputs of candidate ops are dropped out through the same way. +As suggested in the [paper][6], a dropout method is implemented to the inputs for every layer. The dropout rate is set to r^(1/k), where 0 < r < 1 is a hyper-parameter of the model (default to be 0.01) and k is number of optional inputs for a specific layer. The higher the fan-in, the more likely each possible input is to be dropped out. However, the probability of dropping out all optional_inputs of a layer is kept constant regardless of its fan-in. Suppose r = 0.05. If a layer has k = 2 optional_inputs then each one will independently be dropped out with probability 0.051/2 ≈ 0.22 and will be retained with probability 0.78. If a layer has k = 7 optional_inputs then each one will independently be dropped out with probability 0.051/7 ≈ 0.65 and will be retained with probability 0.35. In both cases, the probability of dropping out all of the layer's optional_inputs is 5%. The outputs of candidate ops are dropped out through the same way. [Here]() is an example for oneshot_mode. @@ -165,11 +175,11 @@ Below is the figure to show where architecture weights are added to the full gra ![](../img/darts_mode.png) -More detailed description after this mode is supported. +In `nni.training_update`, tensorflow MomentumOptimizer is used to train the architecture weights based on the pass `loss` and `feed_dict`. [Here]() is an example for darts_mode. -### Multiple trial jobs for one-shot NAS +### [__TODO__] Multiple trial jobs for One-Shot NAS -One-Shot NAS usually has only one trial job with the full graph. However, running multiple such trial jobs leads to benefits. For example, in enas_mode multiple trial jobs could share the weights of the full graph to speedup the model training (or converge). Some one-shot approaches are not stable, running multiple trial jobs increase the possibility of finding better models. +One-Shot NAS usually has only one trial job with the full graph. However, running multiple such trial jobs leads to benefits. For example, in enas_mode multiple trial jobs could share the weights of the full graph to speedup the model training (or converge). Some One-Shot approaches are not stable, running multiple trial jobs increase the possibility of finding better models. NNI natively supports running multiple such trial jobs. The figure below shows how multiple trial jobs run on NNI. @@ -200,11 +210,9 @@ We believe weight sharing (transferring) plays a key role on speeding up NAS, wh Example of weight sharing on NNI. 
-## [__TODO__] General tuning algorithms for NAS
-
-Like hyperparameter tuning, a relatively general algorithm for NAS is required. The general programming interface makes this task easier to some extent. We have a RL-based tuner algorithm for NAS from our contributors. We expect efforts from community to design and implement better NAS algorithms.
+## General tuning algorithms for NAS
-More tuning algorithms for NAS.
+Like hyperparameter tuning, a relatively general algorithm for NAS is required. The general programming interface makes this task easier to some extent. We have an [RL tuner based on PPO algorithm](https://github.com/microsoft/nni/tree/master/src/sdk/pynni/nni/ppo_tuner) for NAS. We expect efforts from the community to design and implement better NAS algorithms.

## [__TODO__] Export best neural architecture and code

From 0ca5614c74651eeee16ee4214d82de485c2009f2 Mon Sep 17 00:00:00 2001
From: quzha
Date: Wed, 31 Jul 2019 14:16:53 +0800
Subject: [PATCH 4/6] update

---
 docs/en_US/GeneralNasInterfaces.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/en_US/GeneralNasInterfaces.md b/docs/en_US/GeneralNasInterfaces.md
index 4bbe4b24bd..bf90aeef73 100644
--- a/docs/en_US/GeneralNasInterfaces.md
+++ b/docs/en_US/GeneralNasInterfaces.md
@@ -91,9 +91,9 @@ One-Shot NAS is a popular approach to find good neural architecture within a lim

NNI has supported the general NAS as demonstrated above. From users' point of view, One-Shot NAS and NAS have the same search space specification, thus, they could share the same programming interface as demonstrated above, just different training modes. NNI provides four training modes:

-**classic_mode**: this mode is described [above](#ProgInterface), in this mode, each subgraph runs as a trial job. To use this mode, you should enable NNI annotation and specify a tuner for nas in experiment config file. [Here](https://github.com/microsoft/nni/tree/master/examples/trials/mnist-nas) is an example to show how to write trial code and the config file. And [here](https://github.com/microsoft/nni/tree/master/examples/tuners/random_nas_tuner) is a simple tuner for nas.
+**\*classic_mode\***: this mode is described [above](#ProgInterface); in this mode, each subgraph runs as a trial job. To use this mode, you should enable NNI annotation and specify a tuner for nas in the experiment config file. [Here](https://github.com/microsoft/nni/tree/master/examples/trials/mnist-nas) is an example showing how to write the trial code and the config file, and [here](https://github.com/microsoft/nni/tree/master/examples/tuners/random_nas_tuner) is a simple tuner for nas.

-**enas_mode**: following the training approach in [enas paper][1]. It builds the full graph based on neural architrecture search space, and only activate one subgraph that generated by the controller for each mini-batch. [Detailed Description](#ENASMode). (currently only supported on tensorflow).
+**\*enas_mode\***: following the training approach in the [enas paper][1]. It builds the full graph based on the neural architecture search space, and only activates the one subgraph generated by the controller for each mini-batch. [Detailed Description](#ENASMode). (currently only supported on tensorflow).

To use enas_mode, you should add one more field in the `trial` config as shown below.
```diff
@@ -115,7 +115,7 @@ for _ in range(num):
```
Here, `nni.training_update` is to do some update on the full graph. 
In enas_mode, the update means receiving a subgraph and enabling it on the next mini-batch. While in darts_mode, the update means training the architecture weights (details in darts_mode). In enas_mode, you need to pass the imported tensorflow package to `tf` and the session to `session`. -**oneshot_mode**: following the training approach in [this paper][6]. Different from enas_mode which trains the full graph by training large numbers of subgraphs, in oneshot_mode the full graph is built and dropout is added to candidate inputs and also added to candidate ops' outputs. Then this full graph is trained like other DL models. [Detailed Description](#OneshotMode). (currently only supported on tensorflow). +**\*oneshot_mode\***: following the training approach in [this paper][6]. Different from enas_mode which trains the full graph by training large numbers of subgraphs, in oneshot_mode the full graph is built and dropout is added to candidate inputs and also added to candidate ops' outputs. Then this full graph is trained like other DL models. [Detailed Description](#OneshotMode). (currently only supported on tensorflow). To use oneshot_mode, you should add one more field in the `trial` config as shown below. In this mode, no need to specify tuner in the config file as it does not need tuner. (Note that you still need to specify a tuner (any tuner) in the config file in v0.9/v1.0, not anymore in future releases.) Also, no need to add `nni.training_update` in this mode, because no special processing (or update) is needed during training. ```diff @@ -127,7 +127,7 @@ trial: + nasMode: oneshot_mode ``` -**darts_mode**: following the training approach in [this paper][3]. It is similar to oneshot_mode. There are two differences, one is that darts_mode only add architecture weights to the outputs of candidate ops, the other is that it trains model weights and architecture weights in an interleaved manner. [Detailed Description](#DartsMode). +**\*darts_mode\***: following the training approach in [this paper][3]. It is similar to oneshot_mode. There are two differences, one is that darts_mode only add architecture weights to the outputs of candidate ops, the other is that it trains model weights and architecture weights in an interleaved manner. [Detailed Description](#DartsMode). To use darts_mode, you should add one more field in the `trial` config as shown below. In this mode, also no need to specify tuner in the config file as it does not need tuner. (Note that you still need to specify a tuner (any tuner) in the config file in v0.9/v1.0, not anymore in future releases.) ```diff From 743e5fa65b3cb5fac58966b73bdd0eeab327714b Mon Sep 17 00:00:00 2001 From: quzha Date: Wed, 31 Jul 2019 14:24:43 +0800 Subject: [PATCH 5/6] fix broken links --- .../AdvancedFeature/GeneralNasInterfaces.md | 24 +++---------------- 1 file changed, 3 insertions(+), 21 deletions(-) diff --git a/docs/en_US/AdvancedFeature/GeneralNasInterfaces.md b/docs/en_US/AdvancedFeature/GeneralNasInterfaces.md index 72dc647a7b..54e2a4b67c 100644 --- a/docs/en_US/AdvancedFeature/GeneralNasInterfaces.md +++ b/docs/en_US/AdvancedFeature/GeneralNasInterfaces.md @@ -163,7 +163,7 @@ Specifically, for trials using tensorflow, we create and use tensorflow variable Below is the figure to show where dropout is added to the full graph for one layer in `nni.mutable_layers`, input 1-k are candidate inputs, the four ops are candidate ops. 
-![](../img/oneshot_mode.png) +![](../../img/oneshot_mode.png) As suggested in the [paper][6], a dropout method is implemented to the inputs for every layer. The dropout rate is set to r^(1/k), where 0 < r < 1 is a hyper-parameter of the model (default to be 0.01) and k is number of optional inputs for a specific layer. The higher the fan-in, the more likely each possible input is to be dropped out. However, the probability of dropping out all optional_inputs of a layer is kept constant regardless of its fan-in. Suppose r = 0.05. If a layer has k = 2 optional_inputs then each one will independently be dropped out with probability 0.051/2 ≈ 0.22 and will be retained with probability 0.78. If a layer has k = 7 optional_inputs then each one will independently be dropped out with probability 0.051/7 ≈ 0.65 and will be retained with probability 0.35. In both cases, the probability of dropping out all of the layer's optional_inputs is 5%. The outputs of candidate ops are dropped out through the same way. [Here]() is an example for oneshot_mode. @@ -173,7 +173,7 @@ As suggested in the [paper][6], a dropout method is implemented to the inputs fo Below is the figure to show where architecture weights are added to the full graph for one layer in `nni.mutable_layers`, output of each candidate op is multiplied by a weight which is called architecture weight. -![](../img/darts_mode.png) +![](../../img/darts_mode.png) In `nni.training_update`, tensorflow MomentumOptimizer is used to train the architecture weights based on the pass `loss` and `feed_dict`. [Here]() is an example for darts_mode. @@ -183,7 +183,7 @@ One-Shot NAS usually has only one trial job with the full graph. However, runnin NNI natively supports running multiple such trial jobs. The figure below shows how multiple trial jobs run on NNI. -![](../img/one-shot_training.png) +![](../../img/one-shot_training.png) ============================================================= @@ -209,24 +209,6 @@ We believe weight sharing (transferring) plays a key role on speeding up NAS, wh Example of weight sharing on NNI. -<<<<<<< HEAD:docs/en_US/GeneralNasInterfaces.md -======= -### [__TODO__] Support of One-Shot NAS - -One-Shot NAS is a popular approach to find good neural architecture within a limited time and resource budget. Basically, it builds a full graph based on the search space, and uses gradient descent to at last find the best subgraph. There are different training approaches, such as [training subgraphs (per mini-batch)][1], [training full graph through dropout][6], [training with architecture weights (regularization)][3]. Here we focus on the first approach, i.e., training subgraphs (ENAS). - -With the same annotated trial code, users could choose One-Shot NAS as execution mode on NNI. Specifically, the compiled trial code builds the full graph (rather than subgraph demonstrated above), it receives a chosen architecture and training this architecture on the full graph for a mini-batch, then request another chosen architecture. It is supported by [NNI multi-phase](MultiPhase.md). We support this training approach because training a subgraph is very fast, building the graph every time training a subgraph induces too much overhead. - -![](../../img/one-shot_training.png) - -The design of One-Shot NAS on NNI is shown in the above figure. One-Shot NAS usually only has one trial job with full graph. NNI supports running multiple such trial jobs each of which runs independently. 
@@ -183,7 +183,7 @@ One-Shot NAS usually has only one trial job with the full graph. However, runnin

NNI natively supports running multiple such trial jobs. The figure below shows how multiple trial jobs run on NNI.

-![](../img/one-shot_training.png)
+![](../../img/one-shot_training.png)

=============================================================

@@ -209,24 +209,6 @@ We believe weight sharing (transferring) plays a key role on speeding up NAS, wh

Example of weight sharing on NNI.

-<<<<<<< HEAD:docs/en_US/GeneralNasInterfaces.md
-=======
-### [__TODO__] Support of One-Shot NAS
-
-One-Shot NAS is a popular approach to find good neural architecture within a limited time and resource budget. Basically, it builds a full graph based on the search space, and uses gradient descent to at last find the best subgraph. There are different training approaches, such as [training subgraphs (per mini-batch)][1], [training full graph through dropout][6], [training with architecture weights (regularization)][3]. Here we focus on the first approach, i.e., training subgraphs (ENAS).
-
-With the same annotated trial code, users could choose One-Shot NAS as execution mode on NNI. Specifically, the compiled trial code builds the full graph (rather than subgraph demonstrated above), it receives a chosen architecture and training this architecture on the full graph for a mini-batch, then request another chosen architecture. It is supported by [NNI multi-phase](MultiPhase.md). We support this training approach because training a subgraph is very fast, building the graph every time training a subgraph induces too much overhead.
-
-![](../../img/one-shot_training.png)
-
-The design of One-Shot NAS on NNI is shown in the above figure. One-Shot NAS usually only has one trial job with full graph. NNI supports running multiple such trial jobs each of which runs independently. As One-Shot NAS is not stable, running multiple instances helps find better model. Moreover, trial jobs are also able to synchronize weights during running (i.e., there is only one copy of weights, like asynchronous parameter-server mode). This may speedup converge.
-
-Example of One-Shot NAS on NNI.
-
-
-## [__TODO__] General tuning algorithms for NAS
->>>>>>> 410ab1ca20ceeb8237d608446a9265b6f24a418f:docs/en_US/AdvancedFeature/GeneralNasInterfaces.md
-
## General tuning algorithms for NAS

Like hyperparameter tuning, a relatively general algorithm for NAS is required. The general programming interface makes this task easier to some extent. We have an [RL tuner based on PPO algorithm](https://github.com/microsoft/nni/tree/master/src/sdk/pynni/nni/ppo_tuner) for NAS. We expect efforts from community to design and implement better NAS algorithms.

From 9c9ee85844c0563dd66962fd81bf390b0ad72e94 Mon Sep 17 00:00:00 2001
From: quzha
Date: Thu, 1 Aug 2019 08:37:54 +0800
Subject: [PATCH 6/6] fix comments, update search space format

---
 .../AdvancedFeature/GeneralNasInterfaces.md   | 23 +++++++++++--------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/docs/en_US/AdvancedFeature/GeneralNasInterfaces.md b/docs/en_US/AdvancedFeature/GeneralNasInterfaces.md
index 54e2a4b67c..1c53205d1f 100644
--- a/docs/en_US/AdvancedFeature/GeneralNasInterfaces.md
+++ b/docs/en_US/AdvancedFeature/GeneralNasInterfaces.md
@@ -55,13 +55,16 @@ After finishing the trial code through the annotation above, users have implicit
```javascript
{
    "mutable_1": {
-        "layer_1": {
-            "layer_choice": ["conv(ch=128)", "pool", "identity"],
-            "optional_inputs": ["out1", "out2", "out3"],
-            "optional_input_size": 2
-        },
-        "layer_2": {
-            ...
+        "_type": "mutable_layer",
+        "_value": {
+            "layer_1": {
+                "layer_choice": ["conv(ch=128)", "pool", "identity"],
+                "optional_inputs": ["out1", "out2", "out3"],
+                "optional_input_size": 2
+            },
+            "layer_2": {
+                ...
+            }
        }
    }
}
@@ -104,7 +107,7 @@ trial:
+  #choice: classic_mode, enas_mode, oneshot_mode
+  nasMode: enas_mode
```
-Similar to classic_mode, in enas_mode you need to specify a tuner for nas, as it also needs to receive subgraphs from tuner (or controller using the terminology in the paper). Since this trial job needs to receive multiple subgraphs from tuner, each one for a mini-batch, two lines need to be added to the trial code to receive the next subgraph (i.e., `nni.training_update`) and report the report the result of the current subgraph. Below is an example:
+Similar to classic_mode, in enas_mode you need to specify a tuner for NAS, as it also needs to receive subgraphs from the tuner (or controller, using the terminology in the paper). Since this trial job needs to receive multiple subgraphs from the tuner, each one for a mini-batch, two lines need to be added to the trial code to receive the next subgraph (i.e., `nni.training_update`) and report the result of the current subgraph. Below is an example:
```python
for _ in range(num):
    # here receives and enables a new subgraph
...
@@ -117,7 +120,7 @@ Here, `nni.training_update` is to do some update on the full graph. In enas_mode

**\*oneshot_mode\***: following the training approach in [this paper][6]. Different from enas_mode which trains the full graph by training large numbers of subgraphs, in oneshot_mode the full graph is built and dropout is added to candidate inputs and also added to candidate ops' outputs. Then this full graph is trained like other DL models. [Detailed Description](#OneshotMode).
(currently only supported on tensorflow).

-To use oneshot_mode, you should add one more field in the `trial` config as shown below. In this mode, no need to specify tuner in the config file as it does not need tuner. (Note that you still need to specify a tuner (any tuner) in the config file in v0.9/v1.0, not anymore in future releases.) Also, no need to add `nni.training_update` in this mode, because no special processing (or update) is needed during training.
+To use oneshot_mode, you should add one more field in the `trial` config as shown below. In this mode, no need to specify tuner in the config file as it does not need tuner. (Note that you still need to specify a tuner (any tuner) in the config file for now.) Also, no need to add `nni.training_update` in this mode, because no special processing (or update) is needed during training.
```diff
trial:
  command: your command to run the trial
@@ -129,7 +132,7 @@ trial:

**\*darts_mode\***: following the training approach in [this paper][3]. It is similar to oneshot_mode. There are two differences: one is that darts_mode only adds architecture weights to the outputs of candidate ops; the other is that it trains model weights and architecture weights in an interleaved manner. [Detailed Description](#DartsMode).

-To use darts_mode, you should add one more field in the `trial` config as shown below. In this mode, also no need to specify tuner in the config file as it does not need tuner. (Note that you still need to specify a tuner (any tuner) in the config file in v0.9/v1.0, not anymore in future releases.)
+To use darts_mode, you should add one more field in the `trial` config as shown below. In this mode, also no need to specify tuner in the config file as it does not need tuner. (Note that you still need to specify a tuner (any tuner) in the config file for now.)
```diff
trial:
  command: your command to run the trial