how to handle watch retry #8914
Comments
/cc @mitake can you take a look? |
@cnljf could you share more detailed information about your deployments? How many nodes are in your cluster? What token type are you using (simple or JWT)? |
@cnljf could you show every parameter you use to create your client? Do the parameters include every endpoint? Also, does this problem happen if you disable auth? |
I have two clusters: This problem occurs in my beta and online environments. But when I simulated this problem on my desktop, it only occurred once. Here are some dependencies I use, hope they help: [[projects]] [[projects]] [[projects]] [[projects]] [[projects]] |
Could you try passing all endpoints to BTW do you mean that you don't see this problem even if you use a single endpoint to client creation in 3.1.5? |
When I create the client, I call clientv3.Sync(). Is that equivalent to creating the client with all endpoints? |
Yes
Do you mean the problem is non-deterministic and hard to reproduce? |
Yes, it's hard to reproduce. In addition, if you need some information, I can try to reproduce it in the beta environment. |
I see. If you get any additional information related to it, could you share it? I'm still trying to reproduce the problem. |
Can you tell me what information you need? Then I will know what to collect. |
I'd like the logs of all nodes, if possible. More detailed information about your client's behaviour would also be helpful. |
I just shut down one node in the beta env, and the client can no longer watch normally.
2017-11-30 14:01:23.204399 W | rafthttp: lost the TCP streaming connection with peer 4c3613973f03af18 (stream Message reader)
2017-11-30 14:01:23.377820 E | rafthttp: failed to dial 4c3613973f03af18 on stream Message (dial tcp 192.168.1.3:2380: getsockopt: connection refused)
2017-11-30 14:04:17.832067 W | rafthttp: health check for peer 4c3613973f03af18 could not connect: dial tcp 192.168.1.3:2380: getsockopt: connection refused
The log “auth: invalid auth token” repeats frequently, because the client keeps trying to rewatch. |
Thanks for reporting the details. It seems that the reauth mechanism for generating a new token doesn't work well in the case of a watch failure. I'll dig into this problem. BTW, if you use CN-based auth or a JWT token, you can avoid this problem as an easy workaround. |
Thank you. I will try your advice. |
Is there any documentation or a demo for JWT tokens? |
You can find it here: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/configuration.md#auth-flags |
Hi, I tried to use a JWT token, but it doesn't work. Here is the command line: |
hmm, it's strange. Could you share your 2391.yml? |
I renamed 2391.yml to 2391.txt, because GitHub doesn't accept .yml attachments. |
hmm, does every member share the same configuration and command line options? |
yes. |
Hi, I have tried again. Not successful. |
At least the node which produced the log is using simple token. It is strange. Could you provide log files of every node? |
@cnljf I tried to reproduce your problem but it wasn't possible. Your cluster is using simple token instead of jwt. Could you check the |
etcd Version: 3.2.9 |
I tested both v3.2.9 and the master branch (c8dc19b). Here is my Procfile: |
How did you generate the pub-key and priv-key? With ssh-keygen? |
I used openssl command. You can find already generated files for the testing purpose in the repository (paths can be found in my Procfile). |
Hi, I can now use JWT successfully. Here is the reason it failed before: |
Watch retry works normally after switching to JWT. |
Thanks for trying and good to know jwt is working well for your purpose. I'll work on simple token issue. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions. |
Hi,
My environment is etcd server 3.2.9 and Go client 3.2.9, with username/password auth.
I ran into a problem. When I restart an etcd server, the watch channel closes. If I then retry the watch without creating a new client, the watch does not succeed. Here is the code:
Here is tcpdump result:
The long-lived connection is still alive, but there is a PermissionDenied error.