[Remote Compaction] Update APIs to support generic unique identifier format #12384
@jaykorean has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@@ -428,6 +428,17 @@ struct CompactionServiceJobInfo {
        priority(priority_) {}
};

struct CompactionServiceScheduleResponse {
  std::string scheduled_job_id;  // Generated outside of primary host, unique
                                 // across different DBs and sessions
"...across different DBs, sessions and jobs"?
Yes. The `job_id` within `CompactionServiceJobInfo` is not globally unique.
rocksdb/include/rocksdb/options.h
Lines 414 to 417 in 99cc36b
uint64_t job_id;  // job_id is only unique within the current DB and session,
                  // restart DB will reset the job_id. `db_id` and
                  // `db_session_id` could help you build unique id across
                  // different DBs and sessions.
If users rely on an external, centralized queue service to offload compactions to remote workers, a globally unique identifier is necessary (that is, one unique across different DBs, sessions, and jobs).
LGTM
@jaykorean merged this pull request in 5bcc184.
Summary
The current design proposes using a combination of `job_id`, `db_id`, and `db_session_id` to create a unique identifier for remote compaction jobs. However, this approach may not be suitable for users who prefer a different format for the unique identifier.

At Meta, we are using a generic compute offload service to offload compaction tasks to remote workers. The compute offload client generates a UUID for each task, which requires an update to the current RocksDB API before we can onboard.
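For illustration, the combined-identifier approach described at the start of this summary could be sketched as below. This is a minimal sketch and not RocksDB code: `JobIdentity`, `MakeCompositeJobId`, and `ScheduleResponseSketch` are hypothetical names, the `/` separator is an arbitrary choice, and only the `scheduled_job_id` field visible in the diff is mirrored.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the identifying fields of
// CompactionServiceJobInfo; this is not the actual RocksDB struct.
struct JobIdentity {
  std::string db_id;          // identifies the DB
  std::string db_session_id;  // changes each time the DB is reopened
  uint64_t job_id;            // unique only within one DB and session
};

// Joins the three fields into one string. The separator and field order
// are assumptions made for this sketch.
inline std::string MakeCompositeJobId(const JobIdentity& info) {
  return info.db_id + "/" + info.db_session_id + "/" +
         std::to_string(info.job_id);
}

// Hypothetical mirror of the response struct added in this PR. The field
// is a plain string, so it can hold a composite id like the one above or
// an externally generated id such as a UUID.
struct ScheduleResponseSketch {
  std::string scheduled_job_id;
};
```

Because the identifier is an opaque string, the same field can carry either format. A composite built this way is only unambiguous if the separator cannot appear inside `db_id` or `db_session_id`, which this sketch assumes rather than checks.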
Users still have the option to create the unique identifier by combining `job_id`, `db_id`, and `db_session_id` if they prefer.
Test Plan