Conversation
I still need to write some tests for this. I need help in testing this manually, by deploying the Operations contract into the Dev chain and triggering an update there.
I tested this locally by deploying the
@@ -245,6 +294,82 @@ impl Updater {
	})
}

fn release_block_number(&self, from: BlockNumber, release: &ReleaseInfo) -> Option<BlockNumber> {
Pretty big piece of code, can I get a comment about what it does and when to use it?
It has a comment up there in the trait definition:
/// Fetches the block number when the given release was added, checking the interval [from; latest_block].
I can make it more descriptive if you feel it's necessary.
Ah sorry, I wasn't looking back at the trait to compare.
updater/src/updater.rs
Outdated
fs::copy(path, &dest).map_err(|e| format!("Unable to copy update: {:?}", e))?;
restrict_permissions_owner(&dest, false, true).map_err(|e| format!("Unable to update permissions: {}", e))?;
info!(target: "updater", "Copied updated binary to {}", dest.display());
}
The answer to that is that if you need more than 3 levels of indentation, you're screwed anyway, and should fix your program.
– Linus Torvalds
No but seriously though, is there a way to break this up somehow?
please exit fast on if let statements, because it's difficult to read :)
Agreed, I'll clean it up.
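The refactor being asked for can be sketched like this; the struct and field names below are hypothetical stand-ins, not the actual updater types, and the example only illustrates the flattening pattern under those assumptions:

```rust
// Toy sketch of "exit fast" instead of nested `if let`.
// `State` and `latest` are made-up names for illustration only.
struct State { latest: Option<u32> }

// Nested style: each `if let` adds an indentation level.
fn step_nested(state: &State) -> Option<u32> {
    if let Some(latest) = state.latest {
        if latest > 0 {
            return Some(latest * 2);
        }
    }
    None
}

// Early-exit style: bail out as soon as a precondition fails.
fn step_flat(state: &State) -> Option<u32> {
    let latest = state.latest?;        // exit fast if there is no release
    if latest == 0 { return None; }    // exit fast on the degenerate case
    Some(latest * 2)
}

fn main() {
    let s = State { latest: Some(3) };
    assert_eq!(step_nested(&s), step_flat(&s));
}
```

The `?` operator on `Option` keeps the happy path at one indentation level, which is what the reviewer is after.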
updater/src/updater.rs
Outdated
// There was an error fetching the update, apply a backoff delay before retrying
Err(err) => {
	let delay = 2usize.pow(retries) as u64;
	let backoff = (retries, Instant::now() + Duration::from_secs(delay));
I would suggest putting a maximum here, otherwise it goes from 6 days to 12 days to a month etc around 20 retries, meaning we have like ~2 weeks to fix any issue before it takes a month before any node updates. Having a maximum wait of a week seems pretty reasonable?
It's also probably a cause of the #8003 crash.
The current updater logic will drop the backoff as soon as a new release is pushed, so we won't get stuck in any wait case. I will add a test for that 😉.
UpdaterStatus::Ready { ref release } if *release == latest.track => {
	let auto = match self.update_policy.filter {
		UpdateFilter::All => true,
		UpdateFilter::Critical if release.is_critical /* TODO: or is on a bad fork */ => true,
Intentionally or unintentionally left TODO?
This was copied from the existing code. I'm not sure how to fix it.
These changes are definitely an improvement, but they still need some polishing.
updater/src/updater.rs
Outdated
// Useful environmental stuff.
update_policy: UpdatePolicy,
weak_self: Mutex<Weak<Updater>>,
weak_self: Mutex<Weak<Updater<O, F>>>,
Can we somehow get rid of this? I know that this will require moving some logic to a separate structure, but weak_self even sounds terrible :p
	"x86_64-apple-darwin".into()
} else if cfg!(windows) {
	"x86_64-pc-windows-msvc".into()
} else if cfg!(target_os = "linux") {
If this is going to be backported, we might encounter problems with different openssl versions on different distros. See three different versions of linux packages here for example.
How do we handle that in our current updater? We might've been breaking clients' installations on Debian and CentOS in the worst case.
I don't think we handle it at all, AFAIK only one Linux release is published in the Operations contract for a given platform. Maybe we can start detecting the distro and treat those as different platforms: "x64-ubuntu-linux-gnu"/"x64-centos-linux-gnu" etc. I think this will require changes in https://github.com/paritytech/push-release and in our CI pipeline.
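A minimal sketch of how a platform string is commonly selected with `cfg!`, matching the snippet under review; the `linux` branch value and the fallback are assumptions here, and the distro-aware scheme mentioned above would replace the single Linux string with per-distro ones:

```rust
// Sketch of compile-time platform-string selection. The non-Linux strings
// come from the diff above; "x86_64-unknown-linux-gnu" and "unknown" are
// assumed fallbacks for illustration.
fn platform() -> String {
    if cfg!(target_os = "macos") {
        "x86_64-apple-darwin".into()
    } else if cfg!(windows) {
        "x86_64-pc-windows-msvc".into()
    } else if cfg!(target_os = "linux") {
        // A distro-aware scheme ("x64-ubuntu-linux-gnu", "x64-centos-linux-gnu", ...)
        // would need matching changes in push-release and the CI pipeline.
        "x86_64-unknown-linux-gnu".into()
    } else {
        "unknown".into()
    }
}

fn main() {
    println!("{}", platform());
}
```

Note that `cfg!` is evaluated at compile time, so distro detection (which only exists at runtime, e.g. via /etc/os-release) cannot live in this macro and would need a separate runtime probe.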
I haven't looked into how auto-update works, but I think users should be prompted with a list of changes for each update (particularly for forks and especially for contentious changes) so that they can choose whether to look into the changes before updating. I expect most users will just update, much like checking "yes" without reading the terms and conditions, but having the choice to look further into it is important for increasing participation and having healthy governance. I'm sure that @vladzamfir would agree.
@jamesray1 I appreciate your concerns about informing users on the list of changes for an update, although that is outside the scope of this PR (which is just refactoring the current behavior of the updater). For now users can disable the installation of updates (
@andresilva Yes, a GUI app for this kind of update notification would be most appropriate.
Couple of grumbles.
updater/src/updater.rs
Outdated
to_block: BlockId::Latest,
address: Some(vec![address]),
topics: vec![
	Some(vec![*RELEASE_ADDED_EVENT_NAME_HASH]),
Why do we need to create this filter manually? Ethabi does not support that?
Yes, I think ethabi supports this, I'll fix it.
updater/src/updater.rs
Outdated
Err(err) => {
	let delay = 2usize.pow(retries) as u64;
	// cap maximum backoff to 1 month
	let delay = cmp::min(delay, 30 * 24 * 60 * 60);
I'd say a day/week would be enough. To reach 2^21 (month) we already need to spend 48 days waiting for previous delays.
With week (2^19) it's 12 days waiting before reaching the cap
and with a day (2^16) it's 2 days waiting.
(Given my math is not off 😩 )
And I think it shouldn't be an issue to handle one request a week/day from every single node in case something goes wrong.
Seems reasonable to me.
#hedidthemath
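The capped backoff under discussion can be sketched as follows; the one-day cap is the reviewer's suggestion, not necessarily the merged value, and the function name is made up for illustration:

```rust
// Sketch of exponential backoff with a cap, per the thread above.
// A one-day cap means retry 17 (2^17 s ≈ 36 h) is the first one clamped.
const MAX_BACKOFF_SECS: u64 = 24 * 60 * 60;

fn backoff_delay_secs(retries: u32) -> u64 {
    // saturating_pow avoids overflow for absurd retry counts.
    2u64.saturating_pow(retries).min(MAX_BACKOFF_SECS)
}

fn main() {
    // Cumulative wait before the cap kicks in: sum of 2^1..=2^16 seconds,
    // i.e. 2^17 - 2 = 131070 s, roughly a day and a half.
    let total: u64 = (1..=16).map(backoff_delay_secs).sum();
    println!("seconds spent before reaching the cap: {}", total);
    assert_eq!(backoff_delay_secs(30), MAX_BACKOFF_SECS);
}
```

This also double-checks the math above: the time spent on earlier delays before a cap of 2^k seconds is reached is about 2^(k+1) seconds, which is where the reviewer's 48-days-before-a-month-cap figure comes from.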
updater/src/updater.rs
Outdated
let mut state = self.state.lock();

if let Some(latest) = state.latest.clone() {
if let Some(ref latest) = state.latest.clone() should be enough, you are cloning it again anyway.
We can't borrow since we mutate on state. But I've updated it to avoid cloning latest again when triggering the fetch. I also removed a similar latest clone in poll.
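The borrow-versus-clone point can be shown with a toy example; the types and method below are hypothetical, not the updater's real code, and only demonstrate why a borrow of the field would conflict with a later mutation through `&mut self`:

```rust
// Toy illustration of why the clone stays: with `Some(ref latest)` the
// borrow of `self.latest` would still be live at the `&mut self` call
// below; cloning ends the borrow first.

#[derive(Clone)]
struct Release(u32);

struct State { latest: Option<Release>, status: u32 }

impl State {
    fn set_status(&mut self, v: u32) { self.status = v; }

    fn step(&mut self) {
        // Cloning detaches `latest` from `self`, freeing it for mutation.
        if let Some(latest) = self.latest.clone() {
            self.set_status(latest.0); // needs `&mut self`
        }
    }
}

fn main() {
    let mut s = State { latest: Some(Release(7)), status: 0 };
    s.step();
    assert_eq!(s.status, 7);
}
```

Disjoint direct field accesses would be fine, but any mutation routed through a `&mut self` method (as in the updater) forces ending the borrow, hence the clone.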
updater/src/updater.rs
Outdated
state.status = UpdaterStatus::Fetching { release: release.clone(), binary, retries: 1 };
// will lock self.state
drop(state);
Isn't it possible to isolate the locks better? Or maybe pass the lock guard to fetch instead?
Actually fetch doesn't lock on self.state so these are unnecessary. I updated updater_step and execute_upgrade to take the MutexGuard, all explicit drops have been removed.
updater/src/updater.rs
Outdated
state.status = UpdaterStatus::Ready { release: release.clone() };
// will lock self.state
drop(state);
self.updater_step();
Perhaps pass the lock guard to the method instead?
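The guard-passing pattern the reviewers converged on looks roughly like this; the method names mirror the PR, but the bodies are toy stand-ins:

```rust
use std::sync::{Mutex, MutexGuard};

// Sketch: hand the held guard to the callee instead of drop-then-relock.
struct Updater { state: Mutex<u32> }

impl Updater {
    fn updater_step(&self, mut state: MutexGuard<'_, u32>) {
        // Works on the already-held lock; no second self.state.lock().
        *state += 1;
    } // guard dropped here, releasing the lock

    fn poll(&self) {
        let state = self.state.lock().unwrap();
        // Ownership of the guard moves into the callee, so no explicit drop(state)
        // and no window where another thread can sneak in between the two calls.
        self.updater_step(state);
    }
}

fn main() {
    let u = Updater { state: Mutex::new(0) };
    u.poll();
    assert_eq!(*u.state.lock().unwrap(), 1);
}
```

Besides removing the easy-to-forget `drop`, this makes the lock's extent visible in the type signature.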
updater/src/updater.rs
Outdated
},
// we're ready to retry the fetch after we applied a backoff for the previous failure
UpdaterStatus::FetchBackoff { ref release, backoff, binary } if *release == latest.track && self.time_provider.now() >= backoff.1 => {
	state.status = UpdaterStatus::Fetching { release: release.clone(), binary, retries: backoff.0 + 1 };
Should we increase the backoff even if latest.track is now different?
If latest.track changes then we don't enter this branch and instead start downloading the new release (I think there's a test for this).
updater/src/updater.rs
Outdated
} else if self.update_policy.enable_downloading {
	let update_block_number = {
		let from = current_block_number.saturating_sub(self.update_policy.max_delay);
Should the delay be capped with upcoming fork block number?
Yes, good idea.
updater/src/updater.rs
Outdated
}
// Only check for updates every n blocks
let current_block_number = self.client.upgrade().map_or(0, |c| c.block_number(BlockId::Latest).unwrap_or(0));
if current_block_number % cmp::max(self.update_policy.frequency, 1) != 0 {
How often do we poll? What's the guarantee that it will ever proceed?
This is currently driven by ChainNotify and we only poll after we're done syncing.
Ok! I was worried that if it's run by a timer instead it might go into some weird stalled situations where it's aligned with multiplication of block time and the frequency would work incorrectly.
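The every-n-blocks gate in the snippet above reduces to a small predicate; `should_check` is a hypothetical name for illustration:

```rust
// Minimal sketch of the frequency gate. Because poll is driven by
// ChainNotify on each imported block rather than by a timer, block
// numbers are visited densely, so the modulus hits zero every
// `frequency` blocks and the check cannot stall.
fn should_check(current_block: u64, frequency: u64) -> bool {
    // max(1) guards against a frequency of 0 (division by zero).
    current_block % frequency.max(1) == 0
}

fn main() {
    assert!(should_check(20, 5));
    assert!(!should_check(21, 5));
    assert!(should_check(7, 0)); // frequency 0 treated as "every block"
}
```

A timer-driven poll, by contrast, could alias with block times exactly as the reviewer worried, which is why the block-driven design matters here.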
@tomusdrw I think I've addressed all your comments, please re-review.
--auto-update-delay to randomly delay updates by n blocks. This takes into account the block number of the update release (old updates aren't delayed).
--auto-update-check-frequency to define the periodicity of auto-update checks in number of blocks.

Fixes #7272.