Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggest integration timestep #208

Merged
merged 6 commits into from
Jan 16, 2024
Merged

Suggest integration timestep #208

merged 6 commits into from
Jan 16, 2024

Conversation

kbarros
Copy link
Member

@kbarros kbarros commented Dec 30, 2023

Collect statistics from numerical integration to provide suggestions about the choice of dt.

@kbarros kbarros requested a review from ddahlbom January 1, 2024 19:14
@kbarros
Copy link
Member Author

kbarros commented Jan 1, 2024

Intended to address first part of #149.

As currently implemented, this PR collects statistics of grad E over the course of a dynamical trajectory and stores them in the Langevin struct. An alternative implementation would have check_timestep take a System, and suggest a timestep based on a single grad E calculation. Switching to that alternative implementation would be simpler, and allows the possibility of getting a dt estimate prior to running dynamics. A disadvantage to switching is that the estimates might become worse for small systems where there are less statistics (?).

Another question is how to define an interpretable error tolerance parameter, tol. Since Heun is weak first-order accurate, the error in statistical observables likely decays as 1 / dt. It might be the case that if we switch to implicit-midpoint as our Langevin integrator, the error decays faster, as 1 / dt^2. We could determine this with some numerical experiments.

Prior to merging, we should also add support for ImplicitMidpoint.

@kbarros kbarros mentioned this pull request Jan 1, 2024
Copy link
Member

@ddahlbom ddahlbom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it will be very useful.

Perhaps the collection of statistics should be something that a user consciously chooses to do. I assume also there's some small performance hit associated with calculating the errors.

# for some empirical constants c₁ and c₂.

c1 = 1.0
c2 = 1.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it correct to assume that the appropriate constants are model-dependent?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. We are trying to find a mapping between error and dt, but this depends on the model and the quantity used to define "error". Since it's fundamentally ambiguous, I defaulted to a simple choice here, which seemed to work well for our examples.

The ratio between c2/c1 particularly hard to select. It seems natural that, when T gets "big enough", we want dt to scale like T at a fixed tolerance. But how should we define "big enough" relative to other energy scales in the system?

# fluctuation-dissipation theorem.

Δt = 0.05/abs(J*S) # Integration timestep
λ = 0.2 # Dimensionless damping time-scale
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the motivation for taking λ to 0.2? More reasonable decorrelation times? (I think I used 0.2 for FeI2.)

Copy link
Member Author

@kbarros kbarros Jan 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is problem dependent, but I suspect that larger λ speeds decorrelation in the sense that "gradient descent" becomes more efficient. But in other ways it could slow things, because you're damping out spin waves that may carry information ballistically. Note also that the Heun integrator is not symplectic, and so the "energy conserving" part of the dynamics is not really energy conserving. This means statistical accuracy will degrade when λ gets smaller, but I don't know how strong this effect is in practice.

@@ -46,6 +50,11 @@ for _ in 1:1000
end
plot(energies, color=:blue, figure=(size=(600,300),), axis=(xlabel="Time steps", ylabel="Energy (meV)"))

# After running a Langevin trajectory, it is a good practice to call
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asked about this on slack. It seems reasonable that a trajectory should be collected for statistics. I guess there are some question about whether this implicit definition of a trajectory is the best interface. Will comment more outside of code.

s0 = sys.κs[i] # == norm(s[i])
∇E² = ∇E[i]' * ∇E[i]
accum!(integrator.drift_squared, (1 + (s0*λ)^2) * ∇E²)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you benched the effect of this? Perhaps it should be optional? (I imagine the cost is much less than the calculation of the gradient, but I'm curious about the rough order of magnitude.)

@ddahlbom
Copy link
Member

ddahlbom commented Jan 1, 2024

I think that collecting statistics for a chosen time steps is reasonable, even if in an ideal world a user would like to have a function that takes one step and gets a suggestion. I assume the statistics will vary a lot depending on whether the system is equilibrated or not, and the user should probably be encouraged to do a small study for the situation that they are interested in studying. I think the report that this provides is very useful.

@kbarros
Copy link
Member Author

kbarros commented Jan 2, 2024

I assume the statistics will vary a lot depending on whether the system is equilibrated or not, and the user should probably be encouraged to do a small study for the situation that they are interested in studying. I think the report that this provides is very useful.

A worst case scenario is maybe a single-site problem, where norm(HZ) could be any eigenvalue of the single-ion Hamiltonian, depending on state Z.

Probably for a system with multiple sites, a random spin configuration will give a good sense of the typical dt required.

Conversely, for a local Monte-Carlo sampler, I don't see anyway around the need to collect statistics over a trajectory. Given that this type of interface must exist, it might make sense to use it consistently for Langevin sampling.

@kbarros kbarros force-pushed the integrator_stats branch 2 times, most recently from 9402b91 to 5ea1d65 Compare January 8, 2024 17:25
@kbarros
Copy link
Member Author

kbarros commented Jan 8, 2024

I tested the alternate implementation, which suggests Δt based on the energy gradient of a single snapshot.

Results for the dynamics in our two main tutorials. The suggested Δt below varies according to the type of spin snapshots used in the estimation:

Tutorial03, CoRh2O4

Δt ~ 0.02528 (Polarized in z)
Δt ~ 0.04959 (Randomized spins)
Δt ~ 0.02528 (Energy minimum)
Δt ~ 0.03933 (T=16K equilibrium snapshot)
Δt ~ 0.04201 (T=16K, equilibrium averaged)

Tutorial04, FeI2

Δt ~ 0.1482 (Polarized in z)
Δt ~ 0.05107 (Randomized spins)
Δt ~ 0.02734 (Energy minimum)
Δt ~ 0.03794 (T=3.5meV, equilibrium snapshot)
Δt ~ 0.03885 (T=3.5meV, equilibrium averaged)

Key points:

  1. Energy gradient is larger for configurations at lower temperatures, leading to smaller suggested dt.
  2. At finite temperatures, one should in principle average over many equilibrium snapshots. For these particular systems, with 64 sites, this time averaging isn't overkill, because a single snapshot already includes an average over sites.
  3. The suggested dt at T=0 is approximately 30% lower than at temperatures around than transition.

I think a reasonable advice to users is "first run minimize_energy! and then call suggest_timestep. This is a simple interface, and can be done prior to any time-integration.

@kbarros
Copy link
Member Author

kbarros commented Jan 9, 2024

@ddahlbom pointed out that there might be a different trend with frustrated systems. This can be verified in the SpinW tutorial:

SW08, Kagome antiferro:

Δt ~ 0.025 (Polarized in z)
Δt ~ 0.045 ± 0.01 (Randomized spins)
Δt ~ 0.05 (Energy minimum)

In this case, both energy minimizing and random spin configurations lead to about the same suggested dt, but curiously the "spin polarized" configuration requires dt smaller by a factor of 1/2.

@kbarros kbarros changed the title Collect integraton statistics Suggest integration timestep Jan 9, 2024
Copy link
Member

@ddahlbom ddahlbom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good. It makes it very easy for users.

I wonder if a factor of two or so should be applied to the ImplicitMidpoint recommendation.

Relatedly, it seems like it would be useful to study the number of steps to converge when using ImplicitMidpoint as another reference point. I believe there is a point when the number of steps it takes to converge starts growing quite rapidly, and a good choice is generally to pick a time step as large as possible before that growth kicks in. But that is a more complicated analysis. Whatever is recommended in this version would certainly be safe, I think.

@@ -0,0 +1,27 @@

mutable struct OnlineStatistics{T}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan to keep this around?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could later be useful in other places, but I removed the include from Sunny.jl.

end
function suggest_timestep(sys::System{N}, integrator::ImplicitMidpoint; tol) where N
(; Δt) = integrator
suggest_timestep_aux(sys; tol, Δt, λ=0, kT=0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to put a factor of 2 in here? That has been the rule of thumb I've generally used for setting the ImplicitMidpoint time step.

# With this increase in temperature, the suggested timestep has increased slightly.

suggest_timestep(sys_large, langevin; tol=1e-2)
langevin.Δt = 0.040;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this interface is good. I guess the pattern is new in Sunny: allowing a field to be left unset in a struct and requiring that it be set before use. But I think I prefer this to, say, having a setter function.


sc = dynamical_correlations(sys_large; Δt=2Δt, nω=120, ωmax=7.5)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the two is left here.

From an interface standpoint, I guess it's a bit confusing, because in the typical use case the user never sets a time step for ImplicitMidpoint. But they have the option of estimating if they so choose, and that estimate will be half of what is used here.

@kbarros kbarros merged commit f9dd034 into main Jan 16, 2024
3 checks passed
@kbarros kbarros deleted the integrator_stats branch January 16, 2024 17:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants