# 0.14.dev0
# - fixed sigma_r
# - fixed conditioning number
# - defined multiple SUCRL algs.
# 0.15.dev0
# - fixed interface of EVI
# - updated Bernstein bound
# 0.16.dev0 (Nov 18, 2017)
# - fixed interface of EVI (.evi -> .run)
# - added discount factor
# - added recentering as function parameter
# - added reset function to reset u1 and u2
# - added option to perform relative value iteration
# - added Bias Span Constrained EVI (same structure of EVI)
# - added test according to (Puterman, 1994) for EVI and SC-EVI
# - SUCRL and FSUCRL are unchanged
# - added test of the 2-state domain in the article
# 0.18.dev0 (Nov 20, 2017)
# - changed scheme for tie breaking in EVI
# * use random noise in [0, 1e-4]
# 0.19.dev0 (Nov 21, 2017)
# - implemented operator N as special case of SCEVI (run parameter)
# - added test for N as in article
# - fixed computation of convex combination
# - added check of policy in test_toy2d1
# 0.20.dev0 (Nov 22, 2017)
# - fix selection of action in SCEVI
# * when min and max actions have equal value, pick using noise
# - added RandomState to EVI and SCEVI initialized every time run is called
# - SCEVI has span_constraint and relative_vi as class attributes
# * they can be overloaded through run inputs
# - implemented Span Regularized UCRL
# - change simulation of policy in UCRL to account for stochastic policies
# - added reset function to environment to reset to clear state
# - added test for UCRL and SC-UCRL using toy 3D domain when span_c = inf
# 0.21.dev0 (Nov 25, 2017)
# - fix SC-EVI
# * minimum value for (s,a) is computed in a pessimistic way by
# taking min{R} and min{P u1}
# - optimized SC-EVI in order to compute the minimum value (s,a)
# only when required
# * every iteration for N and at convergence for T
# - updated test after SC-EVI update
# - updated max_proba in order to compute min{P u1} as max{P (-u1)}
# * it simply scans the vector in reverse order
# - added Dijkstra algorithm for the computation of the shortest path (and diameter)
# - added test for diameter
# 0.22.dev0 (Nov 26, 2017)
# - fix error in SC-EVI
# * added truncation of reward to r_max in pessimistic value computation
# - added u3 and u3min as class attributes
# - added test for EVI using 3S-domain
# 0.24.dev0 (Jan 26, 2018)
# - renamed span constrained UCRL to SCAL
# - added augmentation of the reward for SCAL
# - fixed confidence intervals for Bernstein inequality (not in the SMDP case)
# * added computation of variance of the reward with Welford's method
# - added Gaussian, Gamma, Beta and Exp reward distributions
# - modified Toy1 in order to have stochastic reward
# 0.25.dev0 (Jan 30, 2018)
# - fix error in scopt (scevi.pyx)
# * checked with valgrind
# - added river swim and updated navgrid and 4-rooms to use scal
# 0.26.dev0 (Jan 31, 2018)
# - added value iteration approach to compute diameter
# - added resource collection domain
# 0.27.dev0 (Feb 1, 2018)
# - added exception in evi when number of iterations is too big
# i.e., it > min(1M, ns*na*200)
# 0.28.dev0 (Feb 1, 2018)
# - fix error in evi and scevi due to random tie breaking
__version__ = '0.28.dev0'