[Remote Store] Delete data from remote store (translog and segments) on index delete #3511

Closed
sachinpkale opened this issue Jun 6, 2022 · 3 comments · Fixed by #7682
Labels: enhancement (Enhancement or improvement to existing feature or request); Storage:Durability (Issues and PRs related to the durability framework)

Comments

@sachinpkale
Member

Describe the solution you'd like
When an index is deleted, the corresponding translog and segment data should also be deleted from the remote translog store and the remote segment store.

@andrross
Member

andrross commented Jun 13, 2023

@gbbafna @sachinpkale Are we sure this is the right experience? With remote store, users get the "durable by default" behavior, and I don't know if it is intuitive that deleting the index also deletes the data from remote. If I understand correctly, implementing the behavior in #7682 gives the following experience (assuming I have a remote-backed index named my-index with data in it):

Use case 1: Permanently delete all data (local and remote) in my-index:

DELETE my-index

Use case 2: Remove index data from my cluster, but keep a copy in remote:

PUT _snapshot/remote-repo/my-snapshot
{
  "indices": "my-index"
}

DELETE my-index

I like how that behavior reuses the existing snapshot functionality for managing backups and restoring them. However, I just want to make sure that we're not building a trap where users assume their remote-backed index is durable and safe, only to be surprised that deleting the index also deleted all data in remote.

@sachinpkale
Member Author

@andrross As part of #6483, we are introducing interoperability with snapshots. Snapshot interoperability will allow us to create a snapshot that references data in the remote store. It also introduces the concept of a ref count: a segment in the remote store can only be deleted if it is not referenced by any snapshot. This will hold true for index deletion as well.
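
To illustrate the ref-count idea, here is a minimal in-memory sketch. It is not the OpenSearch implementation: the class `RemoteSegmentStore`, its methods, and the segment file names are all hypothetical and exist only to show the rule that a segment is removed from remote only when its index is gone and no snapshot still references it.

```python
class RemoteSegmentStore:
    """Toy stand-in for a remote segment store (hypothetical, for illustration)."""

    def __init__(self):
        self.owner = {}            # segment name -> index that uploaded it
        self.live_indices = set()  # indices that have not been deleted
        self.refs = {}             # segment name -> set of snapshot names

    def upload(self, index, segment):
        self.live_indices.add(index)
        self.owner[segment] = index
        self.refs.setdefault(segment, set())

    def create_snapshot(self, snapshot, index):
        # The snapshot takes a reference on every segment the index has right now.
        for segment, owner in self.owner.items():
            if owner == index:
                self.refs[segment].add(snapshot)

    def _sweep(self):
        # A segment is deletable only if its index is gone AND no snapshot holds it.
        deletable = sorted(
            segment
            for segment, index in self.owner.items()
            if index not in self.live_indices and not self.refs[segment]
        )
        for segment in deletable:
            del self.owner[segment]
            del self.refs[segment]
        return deletable

    def delete_index(self, index):
        self.live_indices.discard(index)
        return self._sweep()

    def delete_snapshot(self, snapshot):
        for holders in self.refs.values():
            holders.discard(snapshot)
        return self._sweep()


store = RemoteSegmentStore()
store.upload("my-index", "_0.cfs")
store.upload("my-index", "_1.cfs")
store.create_snapshot("my-snapshot", "my-index")
store.upload("my-index", "_2.cfs")  # uploaded after the snapshot, so unreferenced

deleted_on_index_delete = store.delete_index("my-index")
deleted_on_snapshot_delete = store.delete_snapshot("my-snapshot")
```

In this sketch, deleting the index removes only the segment uploaded after the snapshot was taken; the two snapshotted segments stay in remote until the snapshot itself is deleted.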

@andrross
Member

andrross commented Jun 15, 2023

@samuel-oci FYI, with this change, the data will be deleted from the remote repository upon index deletion. Snapshot interoperability is the mechanism for keeping the data in order to restore at a later point (or to a different cluster, etc).


4 participants