Change `dataset` table to only hold Crucible datasets #7386
base: main
Conversation
```rust
/// Database representation of a Crucible dataset's live information.
///
/// This includes the socket address of the Crucible downstairs that owns this
```
s/downstairs/agent/
fixed in deeec80
```diff
 }

 #[derive(Debug, Default, Serialize, Deserialize)]
 pub struct CrucibleResourcesV2 {
-    pub datasets_and_regions: Vec<(Dataset, Region)>,
+    pub datasets_and_regions: Vec<(CrucibleDataset, Region)>,
```
We'll need to be able to deserialize these existing blobs: will the JSON format change with the type change here? I don't think so but I've been wrong before!
It does change but only in ways that should be compatible, I think:
- The old JSON blobs will have `zone_name`, `quota`, `reservation`, and `compression` fields that we'll ignore when deserializing.
- The old JSON blobs could have had `null` for the `ip`, `port`, or `size_used` fields. If they did, we will now fail to deserialize them. But those should have always been set for Crucible datasets, so should not have `null`?
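Both of those claims line up with serde's defaults; a minimal, self-contained sketch of them (stand-in types, not the real model structs):

```rust
use serde::Deserialize;

// Hypothetical stand-in for the new model: no zone_name/quota/etc. fields,
// and ip/port are required (non-Option).
#[derive(Debug, Deserialize)]
struct CrucibleDatasetSketch {
    ip: String, // stand-in for the real IP/Inet type
    port: u16,
}

fn main() {
    // Old-style blob with extra fields: serde ignores unknown fields by
    // default (no deny_unknown_fields), so this still deserializes.
    let old = r#"{"ip":"::1","port":1701,"zone_name":null,"quota":null}"#;
    assert!(serde_json::from_str::<CrucibleDatasetSketch>(old).is_ok());

    // Old-style blob with null for a now-required field: this fails, which
    // is the case argued above to never occur for Crucible datasets.
    let bad = r#"{"ip":null,"port":1701}"#;
    assert!(serde_json::from_str::<CrucibleDatasetSketch>(bad).is_err());
}
```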
I was planning to take this through a "deploy R12 on a racklette, upgrade to this branch" test before merging. (Although I'd like to wait to do that until I have the next, smaller PR up, to test both at the same time.) What should I do to the racklette before upgrading to ensure I have some of these JSON blobs present?
Ooh, maybe a better check: where is this JSON in CRDB? I can check dogfood and see if we have any that would cause problems.
Hm, well, that didn't work. Assuming this is `volume.resources_to_clean_up`, dogfood only has empty resources for V1 and V2, and V3 looks like it only has IDs in it?
```
root@[fd00:1122:3344:105::3]:32221/omicron> select resources_to_clean_up from volume where resources_to_clean_up is not null and resources_to_clean_up like '{"V2%';
                     resources_to_clean_up
---------------------------------------------------------------
  {"V2":{"datasets_and_regions":[],"snapshots_to_delete":[]}}
  {"V2":{"datasets_and_regions":[],"snapshots_to_delete":[]}}
(2 rows)

Time: 12ms total (execution 12ms / network 0ms)

root@[fd00:1122:3344:105::3]:32221/omicron> select resources_to_clean_up from volume where resources_to_clean_up is not null and resources_to_clean_up like '{"V1%';
                      resources_to_clean_up
------------------------------------------------------------------
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
  {"V1":{"datasets_and_regions":[],"datasets_and_snapshots":[]}}
(13 rows)
```
Yeah, unfortunately in this case volume records are eventually hard deleted.
> What should I do to the racklette before upgrading to ensure I have some of these JSON blobs present?
Actually, I think `test_deserialize_old_crucible_resources` covers this case!
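Roughly what a test like that can assert, sketched with assumed shapes (the real types live in omicron) against one of the blobs pulled from dogfood above:

```rust
use serde::Deserialize;
use serde_json::Value;

// Assumed shape for illustration: an externally-tagged enum matches the
// {"V2":{...}} blobs seen in CRDB above.
#[derive(Debug, Deserialize)]
enum CrucibleResources {
    V1 {
        datasets_and_regions: Vec<Value>,
        datasets_and_snapshots: Vec<Value>,
    },
    V2 {
        datasets_and_regions: Vec<Value>,
        snapshots_to_delete: Vec<Value>,
    },
}

fn main() {
    let blob = r#"{"V2":{"datasets_and_regions":[],"snapshots_to_delete":[]}}"#;
    assert!(serde_json::from_str::<CrucibleResources>(blob).is_ok());
}
```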
```rust
let dataset = dataset?;
let id = dataset.id();

// If this dataset already exists, do nothing.
```
This looks like a behaviour change from what was here before?
I'll make the claim it is not, even though it looks like it is. 😅
Before, we had this:
```rust
// If this dataset already exists, only update it if it appears different from what exists
// in the database already.
let action = if let Some(db_dataset) = existing_datasets.remove(&id) {
    let db_config: DatasetConfig = db_dataset.try_into()?;
    let bp_config: DatasetConfig = bp_dataset.clone().try_into()?;
    if db_config == bp_config {
        num_unchanged += 1;
        continue;
    }
```
When flattening a `db_dataset` down to a `DatasetConfig` to see if there are any relevant changes, we were throwing away `ip`, `port`, and `size_used`, leaving us only with information derived from `pool_id`, `zone_name`, `quota`, `reservation`, and `compression`:
```rust
impl TryFrom<Dataset> for omicron_common::disk::DatasetConfig {
    type Error = Error;

    fn try_from(dataset: Dataset) -> Result<Self, Self::Error> {
        let compression = if let Some(c) = dataset.compression {
            c.parse().map_err(|e: anyhow::Error| {
                Error::internal_error(&e.to_string())
            })?
        } else {
            omicron_common::disk::CompressionAlgorithm::Off
        };

        Ok(Self {
            id: dataset.identity.id.into(),
            name: omicron_common::disk::DatasetName::new(
                omicron_common::zpool_name::ZpoolName::new_external(
                    ZpoolUuid::from_untyped_uuid(dataset.pool_id),
                ),
                dataset.kind.try_into_api(dataset.zone_name)?,
            ),
            inner: omicron_common::disk::SharedDatasetConfig {
                quota: dataset.quota.map(|q| q.into()),
                reservation: dataset.reservation.map(|r| r.into()),
                compression,
            },
        })
    }
}
```
We dropped all of those fields we were comparing except `pool_id`. So the equivalent code here would be something like:
```rust
if existing_dataset.pool_id == dataset.pool_id {
    num_unchanged += 1;
    continue;
}
```
but I don't believe it's possible for a dataset's pool ID to change, right? So I dropped what would look like a weird / spurious check and replaced it with just the "have we inserted this already" check that's here now.
If it is possible for a dataset's pool ID to change, the new code is a behavior change and is wrong.
A separate question is: is it possible for a dataset's IP or port to change? If so, this code is wrong, but so was the old code.
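To make the equivalence argument concrete, here is a toy, self-contained version of the new "skip if already present" check (names and types are illustrative only, not the actual omicron code):

```rust
use std::collections::HashMap;

fn main() {
    // Stand-in for the map of existing database rows keyed by dataset ID.
    let mut existing_datasets: HashMap<u32, &str> = HashMap::new();
    existing_datasets.insert(1, "already in db");

    let (mut num_unchanged, mut num_inserted) = (0u32, 0u32);
    for id in [1u32, 2] {
        // If this dataset already exists, do nothing: with pool_id the only
        // comparable field left, and pool_id immutable, "exists" implies
        // "unchanged".
        if existing_datasets.remove(&id).is_some() {
            num_unchanged += 1;
            continue;
        }
        num_inserted += 1; // the real code would insert a record here
    }
    assert_eq!((num_unchanged, num_inserted), (1, 1));
}
```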
> We dropped all of those fields we were comparing except `pool_id`.
I don't follow this part.
> I don't believe it's possible for a dataset's pool ID to change, right?
I believe the same thing haha
> > We dropped all of those fields we were comparing except `pool_id`.
>
> I don't follow this part.
The old code for deciding whether to try to upsert was this:
```rust
// If this dataset already exists, only update it if it appears different from what exists
// in the database already.
let action = if let Some(db_dataset) = existing_datasets.remove(&id) {
    let db_config: DatasetConfig = db_dataset.try_into()?;
    let bp_config: DatasetConfig = bp_dataset.clone().try_into()?;
    if db_config == bp_config {
        num_unchanged += 1;
        continue;
    }
```
Those `DatasetConfig` structs are only comparing a subset of fields: `pool_id`, `zone_name`, `quota`, `reservation`, and `compression`. Of all those fields, the new `CrucibleDataset` only has `pool_id`. So the behaviorally-equivalent code would be to just compare "does the blueprint dataset `pool_id` match the database dataset `pool_id`", but that's a silly check to make when we know it can't change.
```rust
if !matches!(zone.zone_type, BlueprintZoneType::Crucible(_)) {
    return None;
}
let dataset = zone
```
is the type here `CrucibleDataset`?
No, it's `DurableDataset`, which contains a reference to an `OmicronZoneDataset`:

omicron/nexus/types/src/deployment/zone_type.rs, lines 237 to 241 in db656d7:

```rust
pub struct DurableDataset<'a> {
    pub dataset: &'a OmicronZoneDataset,
    pub kind: DatasetKind,
    pub address: SocketAddrV6,
}
```
```diff
@@ -400,16 +376,17 @@ mod tests {
         .expect("failed to upsert zpool");
     }

-    // Call `ensure_dataset_records_exist` again, adding new datasets.
+    // Call `ensure_crucible_dataset_records_exist` again, adding new
+    // datasets.
     //
     // It should only insert these new zones.
     let new_zones = [
```
This seems wrong, though I haven't looked deeply into this test: it's inserting two of the same zone that are different types than before this diff.
I think this is okay, but please double check me. Before, the execution code would insert dataset records for any `DatasetKind` since `dataset` held all dataset kinds; now, it only inserts records for Crucible datasets. The test is trying to ensure that new datasets added after the last execution are added, which is now restricted to "new Crucible datasets added after the last execution are added", so the new zones we use to test have to be Crucible zones.
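If it helps, a toy illustration (stand-in types, not the real `BlueprintZoneType`) of the restriction just described:

```rust
// Only Crucible zones contribute crucible_dataset records now, so a test
// that wants to observe inserts for "new datasets" must add Crucible zones.
#[derive(Debug)]
enum ZoneTypeSketch {
    Crucible { port: u16 },
    Nexus,
}

fn main() {
    let new_zones = [
        ZoneTypeSketch::Crucible { port: 32345 },
        ZoneTypeSketch::Nexus,
    ];
    let inserted = new_zones
        .iter()
        .filter(|z| matches!(z, ZoneTypeSketch::Crucible { .. }))
        .count();
    assert_eq!(inserted, 1); // the Nexus zone produces no record
}
```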
> Before, the execution code would insert dataset records for any `DatasetKind` since `dataset` held all dataset kinds; now, it only inserts records for Crucible datasets.
I think this is why I was suspicious: based on that understanding this vec should have had four members, mixing together the crucible and non-crucible datasets.
There should be two versions of this test: one for "durable" datasets and one for crucible datasets.
I might be missing something - does it help if `all_datasets` already included Crucible datasets? (Guaranteed by https://github.com/oxidecomputer/omicron/pull/7386/files/26ccf5f2e0f22721cd4d1d26bac4d9c4dc659e6b#diff-d4165f273b361528feca0c934574df6365cd86f5688ab81665f35975620183a1R312-R316, and then we pass in `all_datasets.iter().chain(new_zones)`; that used to provide a mix of crucible and non-crucible, and now just passes in "more crucible".)
We chatted about what made this test confusing; I took a stab at addressing it in 1201f9a
schema/crdb/dbinit.sql (outdated)
```sql
kind omicron.public.dataset_kind NOT NULL,
/*
 * Contact information for the dataset: socket address of the Crucible
 * downstairs service that owns this dataset
```
s/downstairs/agent/
fixed in deeec80
```sql
set local disallow_full_table_scans = off;

DELETE FROM omicron.public.crucible_dataset WHERE kind != 'crucible';
```
I was going to ask if we wanted to add a constraint that the kind must be 'crucible' before dropping, to ensure we've cleared other dataset types, but I hesitated to recommend that because "blocking the update" also seems bad.
To confirm, if there happen to be non-crucible entries for any reason, we're okay dropping them quietly?
```sql
    (kind != 'zone') OR
    (kind = 'zone' AND zone_name IS NOT NULL)
)
size_used INT NOT NULL
```
Not your fault that this isn't labelled, but could we add: "this field is controlled by Nexus, and is updated as region allocations are performed from this dataset"
Added in deeec80
```diff
@@ -557,9 +557,10 @@ CREATE TYPE IF NOT EXISTS omicron.public.dataset_kind AS ENUM (
 );

 /*
- * A dataset of allocated space within a zpool.
+ * Table tracking the contact information and size used by datasets associated
```
Nit: Is it worth documenting that this is a rendezvous table?
Sure - added in deeec80
```rust
ip -> Inet,
port -> Int4,
```
nice, thanks for making these non-nullable, makes this a lot easier to use
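A small illustration of what that buys callers (hypothetical types, not the actual model): with `ip` and `port` non-optional, building the dataset's address no longer needs an error path.

```rust
use std::net::{Ipv6Addr, SocketAddrV6};

struct CrucibleDatasetSketch {
    ip: Ipv6Addr,
    port: u16,
}

impl CrucibleDatasetSketch {
    // Infallible: there is no Option to unwrap and no error to return.
    fn address(&self) -> SocketAddrV6 {
        SocketAddrV6::new(self.ip, self.port, 0, 0)
    }
}

fn main() {
    let d = CrucibleDatasetSketch { ip: Ipv6Addr::LOCALHOST, port: 32345 };
    assert_eq!(d.address().port(), 32345);
}
```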
```diff
 }

-pub async fn dataset_upsert_if_blueprint_is_current_target(
+pub async fn crucible_dataset_upsert_if_blueprint_is_current_target(
```
(no action needed) since this is invoked by the blueprint executor to ensure dataset records exist, I assume this can also go away when we move more fully to the rendezvous table model.
(all to say, thanks for this PR, this is a great direction)
Yep, that's the plan for the next PR 👍
```rust
}

datastore
    .dataset_delete_if_blueprint_is_current_target(
```
Rad to see this unified, it felt gross as it was written (deletion of a table row happening in multiple possible spots). Thanks again for this change.
Notes from testing this PR along with its successor #7397 on …
The diff of this PR is large but very mechanical. The actual changes here are:

- the `dataset` table has been renamed `crucible_dataset` (and correspondingly, the `Dataset` model struct is now `CrucibleDataset`, and `Datastore::dataset_*` methods are now `Datastore::crucible_dataset_*`)
- the `zone_name` column has been dropped (it was always NULL for Crucible datasets, enforced by a `CHECK` constraint)
- the `quota`, `reservation`, and `compression` columns have been dropped (they were always NULL for Crucible datasets, although this was not enforced by a `CHECK` constraint; this information is kept in the blueprint and is not read from this table)
- the `ip`, `port`, and `size_used` columns now have `NOT NULL` constraints (they already had `NOT NULL` constraints specifically for Crucible datasets via a `CHECK` constraint; correspondingly, the `CrucibleDataset` model type has non-optional fields for these columns)

and everything else is followup to account for this. Most of that followup is just renaming things. A few nontrivial exceptions:

- Some code previously obtained a Crucible dataset's `address` by failing if it wasn't set; that error handling is now gone since `address` is no longer optional.
- Some tests populated the `dataset` table using a mix of Crucible and non-Crucible datasets. I tried to update these as reasonably as I could while keeping the spirit of the tests intact, but a closer look at these bits would be good.

I think this gets us most of the way toward `crucible_dataset` being a "rendezvous table" in the RFD 540 sense. As of this PR, the blueprint executor task is still responsible for adding entries to this table, but I'll move that to the rendezvous background task in a followup.

Fixes #7299. Gets us halfway to #7301 by fixing the `dataset` side of that issue (`bp_omicron_dataset` still needs work).
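For a concrete picture of the end state, here is a hedged sketch of the resulting model shape (exact field names and types beyond those discussed above are assumptions, not the real omicron definition):

```rust
use std::net::Ipv6Addr;
use uuid::Uuid; // assumes the `uuid` crate

// Illustrative only: zone_name, quota, reservation, and compression are
// gone, and ip/port/size_used are plain (non-Option) fields backed by the
// new NOT NULL columns.
#[derive(Debug)]
pub struct CrucibleDataset {
    pub id: Uuid,
    pub pool_id: Uuid,
    pub ip: Ipv6Addr,
    pub port: u16,
    pub size_used: i64,
}

fn main() {
    let d = CrucibleDataset {
        id: Uuid::nil(),
        pool_id: Uuid::nil(),
        ip: Ipv6Addr::LOCALHOST,
        port: 32345,
        size_used: 0,
    };
    println!("{d:?}");
}
```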