Concurrent deletion of indices and master failure can cause indices to be reimported #11665
Comments
@brwe what about making the delete index request wait for responses from the data nodes? Then the request can report success/failure.
@clintongormley the delete index API does wait for data nodes to confirm the deletion. The above scenario will cause the call to time out (it waits for an ack from the data node that will not come). If people then check the cluster state (CS), they will see that the index was deleted. However, at a later stage, once the data node rejoins the cluster and the new master, the index will be reimported.
Ok understood. +1
Some of the tests for metadata are redundant. Also, since they partly test service disruptions (starting a master with an empty data folder), we might move them to DiscoveryWithServiceDisruptionsTests. This commit also adds a test for elastic#11665
@bleskes is this still an issue?
Sadly it is. However, thinking about it again I realized that we can easily detect the “new empty” master danger by comparing the cluster UUID: a new master will generate a new one. Agreed with marking as adopt me. Although it sounds scary it’s quite an easy fix and is a good entry point to the cluster state universe. If anyone wants to pick this up, please ping me :)
If a node was isolated from the cluster while a delete was happening, the node would ignore the delete operation when rejoining, because we couldn't detect whether the new master genuinely deleted the indices or whether it was a fresh "reset" master that was started without the old data folder. We can now be smarter: we detect these reset masters and actually delete the indices on the node if it's not the case of a reset master. Note that this new protection doesn't hold if the node was shut down. In that case its indices will still be imported as dangling indices. Closes elastic#11665
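The reset-master check described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual Elasticsearch code: the function name `resolve_missing_indices` and its parameters are hypothetical, and it assumes, as stated earlier in the thread, that a master started without the old data folder generates a new cluster UUID.

```python
def resolve_missing_indices(node_cluster_uuid, master_cluster_uuid,
                            local_indices, master_indices):
    """Decide what a rejoining node does with indices the master does not know.

    All names here are hypothetical; this only models the decision itself.
    """
    missing = set(local_indices) - set(master_indices)
    if node_cluster_uuid != master_cluster_uuid:
        # Different cluster UUID: the master looks like a "reset" master that
        # lost its data folder, so keep the indices as dangling.
        return {"delete": set(), "dangling": missing}
    # Same cluster UUID: the master carries the real cluster history, so an
    # index missing from its state was genuinely deleted.
    return {"delete": missing, "dangling": set()}
```

A node that was shut down during the delete is not covered by this check, as noted above.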
Currently, a data node deletes indices by evaluating the cluster state. If a new cluster state comes in it is compared to the last known cluster state, and if the new state does not contain an index that the node has in its last cluster state, then this index is deleted.
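As a minimal sketch of that comparison (the function name `indices_to_delete` is made up for illustration and does not correspond to anything in the Elasticsearch code base), the deletion decision amounts to a set difference between the last known state and the incoming one:

```python
def indices_to_delete(previous_indices, new_indices):
    """Return indices the node knew about that the new cluster state lacks."""
    return set(previous_indices) - set(new_indices)
```

With this rule alone, an incoming cluster state that contains no indices marks every local index for deletion.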
This could cause data to be deleted if the data folders of all master nodes were lost (#8823):
All master nodes of a cluster go down at the same time and their data folders cannot be recovered.
A new master is brought up, but it does not have any indices in its cluster state because the data was lost.
Because all other nodes are data nodes, it cannot get the cluster state from them either and therefore sends a cluster state without any indices in it to the data nodes. The data nodes then delete all their data.
On the master branch we prevent this now by checking if the current cluster state comes from a different master than the previous one and if so, we keep the indices and import them as dangling (see #9952, ClusterChangedEvent).
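A rough model of that protection, under the assumption that a cluster state can be reduced to the ID of the master that published it plus a set of index names (the `ClusterState` class and `apply_state` function below are hypothetical and are not the real `ClusterChangedEvent` API):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterState:
    master_id: str        # hypothetical: ID of the master that published the state
    indices: frozenset    # names of the indices contained in the state


def apply_state(previous, new):
    """Return (indices to delete, indices to keep as dangling)."""
    missing = set(previous.indices - new.indices)
    if new.master_id != previous.master_id:
        # The state comes from a different master: do not trust the absence
        # of an index, keep it and import it as dangling instead.
        return set(), missing
    return missing, set()
```

In this model, any state published by a different master keeps the missing indices, whether or not that master deleted them on purpose.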
While this prevents the deletion, it also means that we might in other cases not delete indices although we should.
Example:
Currently there is no way for a data node to decide whether an index should actually be deleted if the cluster state that triggers the delete comes from a new master. We had to choose between (1) deleting all data when a node receives an empty cluster state, or (2) running the risk of keeping indices around that should actually be deleted.
We decided on (2) in #9952. Just opening this issue so that this behavior is documented.