The "Found an invalid row for peer" error is obscure #303
Comments
The relevant code that checks the "validity" of a peer row is in python-driver/cassandra/cluster.py, lines 4018 to 4021 (commit 6b82872), and the warning is printed in python-driver/cassandra/cluster.py, lines 3949 to 3953 (commit 6b82872).
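For context, here is a minimal sketch of the kind of check those lines perform. It is not the driver's actual code; the column names (rpc_address and friends) and the dict-style row access are assumptions made for illustration.

```python
def _is_valid_peer(row):
    # A peer row is only usable if the driver can build a Host from it:
    # it needs an RPC address plus host_id, data_center, rack and tokens.
    # Any missing/empty field makes the whole row "invalid", which is what
    # triggers the generic "Found an invalid row for peer" warning.
    return bool(
        row.get("rpc_address")        # assumed column; the real code may derive the address differently
        and row.get("host_id")
        and row.get("data_center")
        and row.get("rack")
        and row.get("tokens")
    )
```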
Before this change, when the driver received an invalid system.peers row it would log a very general warning: "Found an invalid row for peer (127.0.73.5). Ignoring host." A system.peers row can be invalid for a multitude of reasons, and that warning message did not describe the specific reason for the failure. Improve the warning message by adding the specific reason why the row is considered invalid by the driver. The message now also includes the host_id, or the entire row in case the driver received a row without even the basic broadcast_rpc. It might be a bit inelegant to introduce a side effect (logging) into the _is_valid_peer static method; however, the alternative seemed even worse: adding that code to the already big _refresh_node_list_and_token_map. Fixes scylladb#303
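The change described above could look roughly like the following sketch. This is illustrative only, not the merged patch; the column names, the required-field list, and the logger setup are assumptions.

```python
import logging

log = logging.getLogger(__name__)

REQUIRED_PEER_FIELDS = ("host_id", "data_center", "rack", "tokens")

def _is_valid_peer(row):
    # Without even a broadcast_rpc address there is nothing to identify the
    # peer by, so log the entire row.
    broadcast_rpc = row.get("rpc_address")  # assumed column name
    if not broadcast_rpc:
        log.warning("Found an invalid row for peer, missing broadcast_rpc "
                    "(full row: %s). Ignoring host.", row)
        return False

    # Otherwise name the exact field that is missing, together with the
    # host_id, so both the reason and the offending node show up in the log.
    for field in REQUIRED_PEER_FIELDS:
        if not row.get(field):
            log.warning("Found an invalid row for peer %s (host_id: %s): "
                        "missing %s. Ignoring host.",
                        broadcast_rpc, row.get("host_id"), field)
            return False
    return True
```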
I addressed the "why the row in system.peers was considered invalid" part in #305, but not necessarily the "on which node it was read from" part; that can be deduced from other logs.
Thanks. Hopefully it will be enough.
I'm seeing these errors in a dtest that adds 2 new nodes (in parallel) to a 3-node cluster.
It looks like the error heals itself eventually, but it's unclear why the row in system.peers was considered invalid, and on which node it was read from.
It would help if we printed more information to facilitate debugging in case something is wrong on the Scylla server side.
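As a debugging aid until a more detailed message is available, the rows each node serves can be inspected by querying system.peers directly with the driver. A small sketch follows; the contact points and the column list are illustrative.

```python
from cassandra.cluster import Cluster

# Connect to one node at a time to see that node's view of its peers.
for contact_point in ("127.0.73.1", "127.0.73.2", "127.0.73.3"):
    cluster = Cluster([contact_point])
    session = cluster.connect()
    rows = session.execute(
        "SELECT peer, host_id, data_center, rack, tokens FROM system.peers")
    print("peers as seen by", contact_point)
    for row in rows:
        # Any NULL or empty field here is a candidate reason for the
        # "Found an invalid row for peer" warning.
        print(row)
    cluster.shutdown()
```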