-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add early rejection of IOs if too many Downstairs are inactive #1565
base: main
Are you sure you want to change the base?
Conversation
Some random musings around this. If we send the write to one downstairs, it could land on disk, but the IO will sit waiting for another downstairs to write it before we could ACK back to the guest. Any flush is going to stop all future IOs, as they will all build up behind it. This is actually a reason to fast fail writes/flushes as I believe that would enable reads to still work, right? What should we do with in-flight IOs? Do we now want to discard them as well? If we do fail outstanding IOs, outstanding writes we would have already fast-acked, but flushes will now go to failed. By doing this, are we cutting off a chance for a replay to catch things up? |
No, writes are either fast-acked to the guest or (after changes made in this PR) fast-errored. In the latter case, they aren't sent to any Downstairs.
Yes, with caveats – if the Guest is waiting for a flush, then sending a read, it would previously stall if 2x Downstairs are offline. Now, the flush will return an error immediately, and the read can proceed (or not, depending on how the Guest handles error codes).
This PR hasn't changed any behavior here. If a Downstairs goes to the Faulted state, all of its IOs are skipped, which may cause them to be acked to the Guest.
No, any IO rejected here is never added to the active jobs list, so it's never sent to any downstairs. It's rejected before being assigned a |
Ah yes, fast ack will ack while it thinks things are good. If one downstairs is online, it could get the data and the other two would not, but we have already acked it back to the guest.
The transition to faulted here, that's what I think I needed to see, so that will skip IO to any downstairs in
So, your right that a replay does not happen here. I'm thinking about |
upstairs/src/downstairs.rs
Outdated
self.clients | ||
.iter() | ||
.filter(|c| { | ||
matches!(c.state(), DsState::Active | DsState::LiveRepair) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to allow additional states here. Offliine
for instance is one where we have not yet decided what the situation is, and it could just be a downstairs rebooting and soon enough (fingers crossed) it will come back.
There may be other transitory states to consider as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this goes in before 1568/1570, then we want the updated list of DsState
's
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's going to be much easier to rebase this onto #1577 then vice versa, so I think we should get those three merged before thinking about this further!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At long last, #1577 is merged and this is rebased. I've moved this match
into a standalone function (DownstairsClient::is_accepting_io
) and added DsState::Connecting { mode: ConnectionMode::Offline, .. }
as a valid case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this change does what it intends to do, and I am pretty sure it's the right thing to do broadly, but I'm deferring to alan's comments on the details
775749f
to
62229ec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this block internal flushes for read only upstairs?
async fn read_with_one_fault() { | ||
let mut harness = TestHarness::new().await; | ||
|
||
// Use a write to fault DS0 (XXX why do read errors not fault a DS?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With a write failure, we don't know the state (from an upstairs POV) of what actually made it on to disk. As such, we can't know that this downstairs is the same as the other two downstairs. So we have to fault the downstairs with the failing IO.
For a read error, we know there is an error, but we don't know that this downstairs is different than the other downstairs from the upstairs point of view. At least not without more information. If we get a read error, that is "good" in that we are not returning bad data to the guest.
@leftwo I don't think so – internal flushes call |
33c34e9
to
dfc7697
Compare
dfc7697
to
745927e
Compare
745927e
to
dd1cf13
Compare
This aborts the IO before passing it to the Downstairs, so it's not assigned a
JobId
or put into theActiveJobs
map. The most noticeable change is that writes are now fast-err'd instead of fast-acked if > 1 Downstairs is inactive.