Suggest integration timestep #208
Conversation
Force-pushed from c9be466 to 21ab47f
Intended to address the first part of #149. As currently implemented, this PR collects statistics of the squared drift during integration. Another question is how to define an interpretable error tolerance parameter. Prior to merging, we should also add support for ImplicitMidpoint.
This looks like it will be very useful.
Perhaps the collection of statistics should be something that a user consciously chooses to do. I also assume there's some small performance hit associated with calculating the errors.
# for some empirical constants c₁ and c₂.
c1 = 1.0
c2 = 1.0
Is it correct to assume that the appropriate constants are model-dependent?
I think so. We are trying to find a mapping between error and dt, but this depends on the model and the quantity used to define "error". Since it's fundamentally ambiguous, I defaulted to a simple choice here, which seemed to work well for our examples.
The ratio c2/c1 is particularly hard to select. It seems natural that, when T gets "big enough", we want dt to scale like T at a fixed tolerance. But how should we define "big enough" relative to other energy scales in the system?
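For concreteness, here is a hypothetical sketch of the kind of mapping being discussed. The functional form, the accumulated quantity, and the constants are illustrative assumptions, not the formula implemented in this PR; it only shows how accumulated drift statistics and temperature could be combined via empirical constants c1 and c2 to produce a suggested timestep.

```julia
# Hypothetical sketch only: one possible mapping from accumulated drift
# statistics and temperature to a suggested timestep, controlled by the
# empirical constants c1 and c2 discussed above. Energies in meV, so the
# returned Δt carries units of 1/meV, as in the tutorials.
function sketch_suggest_dt(mean_drift_squared, kT; tol, c1=1.0, c2=1.0)
    drift_scale = sqrt(mean_drift_squared)      # typical drift magnitude per site (~ meV)
    return tol / (c1 * drift_scale + c2 * kT)   # Δt shrinks as drift or kT grows
end

# Illustrative call: drift magnitude ~ |J*S| = 1 meV, kT = 0.2 meV, tol = 0.05
sketch_suggest_dt(1.0^2, 0.2; tol=0.05)   # ≈ 0.042
```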
examples/03_LLD_CoRh2O4.jl
Outdated
# fluctuation-dissipation theorem.
Δt = 0.05/abs(J*S) # Integration timestep
λ = 0.2 # Dimensionless damping time-scale
What was the motivation for setting λ to 0.2? More reasonable decorrelation times? (I think I used 0.2 for FeI2.)
It is problem dependent, but I suspect that larger λ speeds decorrelation in the sense that "gradient descent" becomes more efficient. But in other ways it could slow things, because you're damping out spin waves that may carry information ballistically. Note also that the Heun integrator is not symplectic, and so the "energy conserving" part of the dynamics is not really energy conserving. This means statistical accuracy will degrade when λ gets smaller, but I don't know how strong this effect is in practice.
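For reference, a small usage sketch of the tradeoff being described. It assumes the Langevin(Δt; λ, kT) constructor and the step! function used in the tutorials, and that a System `sys` has already been built; all numerical values are illustrative only.

```julia
using Sunny

# Sketch: contrast weak vs. strong damping. Larger λ relaxes toward the energy
# minimum faster ("gradient descent" is more efficient), but also damps the
# spin waves that can transport information ballistically.
Δt = 0.025
langevin_weak   = Langevin(Δt; λ=0.05, kT=0.2)   # long-lived spin waves, slower relaxation
langevin_strong = Langevin(Δt; λ=0.5,  kT=0.2)   # faster relaxation, waves damped out

for _ in 1:1000
    step!(sys, langevin_strong)   # one damped, noisy (Heun) Langevin step
end
```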
examples/03_LLD_CoRh2O4.jl
Outdated
@@ -46,6 +50,11 @@ for _ in 1:1000
end
plot(energies, color=:blue, figure=(size=(600,300),), axis=(xlabel="Time steps", ylabel="Energy (meV)"))

# After running a Langevin trajectory, it is a good practice to call
Asked about this on Slack. It seems reasonable that a trajectory should be collected for statistics. I guess there are some questions about whether this implicit definition of a trajectory is the best interface. Will comment more outside of code.
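As a usage sketch of the trajectory-then-suggestion workflow under discussion: the suggest_timestep call signature is the one that appears later in this PR, while `sys`, `langevin`, and the specific numbers are assumed to be set up as in the tutorial.

```julia
# Collect statistics along a Langevin trajectory...
for _ in 1:1000
    step!(sys, langevin)
end

# ...then ask for a timestep recommendation informed by that trajectory,
# and adopt a value for subsequent runs.
suggest_timestep(sys, langevin; tol=1e-2)   # prints a report with a recommended Δt
langevin.Δt = 0.040                         # illustrative value chosen in light of it
```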
src/Integrators.jl
Outdated
s0 = sys.κs[i] # == norm(s[i])
∇E² = ∇E[i]' * ∇E[i]
accum!(integrator.drift_squared, (1 + (s0*λ)^2) * ∇E²)
end
Have you benchmarked the effect of this? Perhaps it should be optional? (I imagine the cost is much less than the calculation of the gradient, but I'm curious about the rough order of magnitude.)
I think that collecting statistics over a chosen number of time steps is reasonable, even if in an ideal world a user would like to have a function that takes one step and gets a suggestion. I assume the statistics will vary a lot depending on whether the system is equilibrated or not, and the user should probably be encouraged to do a small study for the situation that they are interested in. I think the report that this provides is very useful.
A worst-case scenario is maybe a single-site problem. For a system with multiple sites, a random spin configuration will probably give a good sense of the typical dt required. Conversely, for a local Monte-Carlo sampler, I don't see any way around the need to collect statistics over a trajectory. Given that this type of interface must exist, it might make sense to use it consistently for Langevin sampling.
Force-pushed from 9402b91 to 5ea1d65
I tested the alternate implementation. Results for the dynamics in our two main tutorials are below; the suggested Δt varies according to the type of spin snapshot used in the estimation:

Tutorial03, CoRh2O4: Δt ~ 0.02528 (polarized in z)
Tutorial04, FeI2: Δt ~ 0.1482 (polarized in z)

Key points:
I think reasonable advice to users is "first run …"
@ddahlbom pointed out that there might be a different trend with frustrated systems. This can be verified in the SpinW tutorial:

SW08, Kagome antiferromagnet: Δt ~ 0.025 (polarized in z)

In this case, both energy-minimized and random spin configurations lead to about the same suggested dt, but curiously the spin-polarized configuration requires a Δt about half as large.
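A sketch of the kind of comparison being reported here, assuming Sunny's randomize_spins!, minimize_energy!, and polarize_spins! functions together with the suggest_timestep call from this PR; `sys`, `langevin`, and the tolerance are assumed to be set up already.

```julia
# Compare the suggested timestep for different spin snapshots of the same system.
randomize_spins!(sys)                   # random configuration
suggest_timestep(sys, langevin; tol=1e-2)

minimize_energy!(sys)                   # energy-minimized configuration
suggest_timestep(sys, langevin; tol=1e-2)

polarize_spins!(sys, (0, 0, 1))         # fully polarized along z
suggest_timestep(sys, langevin; tol=1e-2)
```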
I think this looks good. It makes it very easy for users.
I wonder if a factor of two or so should be applied to the ImplicitMidpoint recommendation.
Relatedly, it seems like it would be useful to study the number of steps ImplicitMidpoint takes to converge, as another reference point. I believe there is a point at which the number of steps needed to converge starts growing quite rapidly, and a good choice is generally to pick a time step as large as possible before that growth kicks in. But that is a more complicated analysis. Whatever is recommended in this version would certainly be safe, I think.
@@ -0,0 +1,27 @@
mutable struct OnlineStatistics{T}
Do you plan to keep this around?
I think it could later be useful in other places, but I removed the include from Sunny.jl.
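For readers unfamiliar with the pattern, here is a minimal Welford-style sketch of what an online accumulator of this sort typically looks like. The field names and the helper functions are assumptions for illustration; only the names OnlineStatistics and accum! appear in the PR itself, and the actual definition may differ.

```julia
# Minimal Welford-style online accumulator for mean and variance.
mutable struct OnlineStatistics{T}
    n::Int    # number of samples accumulated
    μ::T      # running mean
    M::T      # running sum of squared deviations from the mean
end
OnlineStatistics{T}() where T = OnlineStatistics{T}(0, zero(T), zero(T))

function accum!(stats::OnlineStatistics{T}, x) where T
    stats.n += 1
    δ = x - stats.μ
    stats.μ += δ / stats.n
    stats.M += δ * (x - stats.μ)   # standard Welford update using the new mean
    return stats
end

online_mean(stats::OnlineStatistics) = stats.μ
online_var(stats::OnlineStatistics) = stats.n > 1 ? stats.M / (stats.n - 1) : zero(stats.M)
```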
end
function suggest_timestep(sys::System{N}, integrator::ImplicitMidpoint; tol) where N
    (; Δt) = integrator
    suggest_timestep_aux(sys; tol, Δt, λ=0, kT=0)
Do you want to put a factor of 2 in here? That has been the rule of thumb I've generally used for setting the ImplicitMidpoint time step.
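If that rule of thumb were adopted, one possible way to express it is sketched below. This assumes the recommendation produced by suggest_timestep_aux scales linearly with tol; the signature is the one shown in the diff above, and the factor-of-2 placement is a suggestion only.

```julia
function suggest_timestep(sys::System{N}, integrator::ImplicitMidpoint; tol) where N
    (; Δt) = integrator
    # ImplicitMidpoint is symplectic and typically tolerates roughly twice the
    # timestep of the Heun-based Langevin integrator, so loosen the effective
    # tolerance by a factor of 2 (assumes the recommendation is linear in tol).
    suggest_timestep_aux(sys; tol=2tol, Δt, λ=0, kT=0)
end
```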
# With this increase in temperature, the suggested timestep has increased slightly.
suggest_timestep(sys_large, langevin; tol=1e-2)
langevin.Δt = 0.040;
I think this interface is good. I guess the pattern is new in Sunny: allowing a field to be left unset in a struct and requiring that it be set before use. But I think I prefer this to, say, having a setter function.
sc = dynamical_correlations(sys_large; Δt=2Δt, nω=120, ωmax=7.5)
I see the factor of two is left here.
From an interface standpoint, I guess it's a bit confusing, because in the typical use case the user never sets a time step for ImplicitMidpoint. But they have the option of estimating if they so choose, and that estimate will be half of what is used here.
This should be used after a time-integration to verify expected accuracy.
Force-pushed from 95db9dd to a22591f
Collect statistics from numerical integration to provide suggestions about the choice of dt.