Simple TensorFlow implementation of "On the Convergence of Adam and Beyond" (ICLR 2018)

Yujun-Yan/AMSGrad-Tensorflow

 
 


AMSGrad-Tensorflow

Simple TensorFlow implementation of "On the Convergence of Adam and Beyond"

Hyperparameter

  • The default hyperparameters are set to the values that performed best in our experiments:
  • learning_rate = 0.01
  • beta1 = 0.9
  • beta2 = 0.99
  • The best setting depends on the network you use; the default beta2 = 0.99 may perform well
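To show where these hyperparameters enter the update, here is a minimal pure-Python sketch of one AMSGrad step, following the update rule in the paper. It is not the code in this repository: the helper names (`amsgrad_step`, `init_state`, `minimize_quadratic`), the constant learning rate, the absence of bias correction, and the placement of `epsilon` in the denominator are all assumptions made for illustration.

```python
import math


def init_state(n):
    """Create zero-initialized optimizer state for n scalar parameters."""
    return {"m": [0.0] * n, "v": [0.0] * n, "v_hat": [0.0] * n}


def amsgrad_step(params, grads, state, learning_rate=0.01, beta1=0.9,
                 beta2=0.99, epsilon=1e-8):
    """Apply one AMSGrad update in place to a list of float parameters."""
    for i, g in enumerate(grads):
        # Exponential moving averages of the gradient and squared gradient.
        state["m"][i] = beta1 * state["m"][i] + (1.0 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1.0 - beta2) * g * g
        # The change relative to Adam: keep the running maximum of v,
        # so the effective step size never grows between iterations.
        state["v_hat"][i] = max(state["v_hat"][i], state["v"][i])
        # epsilon in the denominator is an assumption borrowed from Adam.
        denom = math.sqrt(state["v_hat"][i]) + epsilon
        params[i] -= learning_rate * state["m"][i] / denom
    return params


def minimize_quadratic(steps=2000):
    """Toy problem (hypothetical): minimize f(x) = x^2 starting at x = 5."""
    params = [5.0]
    state = init_state(1)
    for _ in range(steps):
        grads = [2.0 * params[0]]  # derivative of x^2
        amsgrad_step(params, grads, state)
    return params[0]
```

A larger `beta2` makes `v` react more slowly to new gradients, which is one reason the best value can differ between networks.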

Usage

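  # Requires the AMSGrad module from this repository on the import path
  from AMSGrad import AMSGrad
  
  # `loss` is the scalar loss tensor of your model
  train_op = AMSGrad(learning_rate=0.01, beta1=0.9, beta2=0.99, epsilon=1e-8).minimize(loss)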

Network Architecture

  x = fully_connected(inputs=images, units=100)
  x = relu(x)
  logits = fully_connected(inputs=x, units=10)
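The architecture above is a two-layer perceptron. As a rough illustration of its forward pass, the sketch below spells out the same three steps in pure Python. The helper signatures differ from the ones used above (weights are passed explicitly), and the 784-element input (a flattened 28x28 MNIST image), the Gaussian initialization, and the names `init_layer` and `forward` are assumptions, not taken from this repository.

```python
import random


def fully_connected(inputs, weights, biases):
    """Dense layer: weights has one row per input and one column per unit."""
    return [
        sum(x * w_row[j] for x, w_row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]


def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (initialization is an assumption)."""
    weights = [[rng.gauss(0.0, 0.01) for _ in range(n_out)] for _ in range(n_in)]
    biases = [0.0] * n_out
    return weights, biases


def forward(image, params):
    """images -> 100 hidden units -> ReLU -> 10 logits."""
    (w1, b1), (w2, b2) = params
    x = fully_connected(image, w1, b1)
    x = relu(x)
    return fully_connected(x, w2, b2)


rng = random.Random(0)
params = [init_layer(784, 100, rng), init_layer(100, 10, rng)]
logits = forward([0.5] * 784, params)  # one dummy image, all pixels 0.5
```

The result is a list of 10 logits, one per MNIST digit class.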

MNIST Result (iteration = 3M)

lr=0.1, beta1=0.9, beta2=various

 

lr=0.01, beta1=0.9, beta2=various

 

Reference

Author

Junho Kim
