When upgrading or reloading a cluster, increase retries when accessing PD #2494

Open
wsiqiang6 opened this issue Jan 17, 2025 · 0 comments
Labels
type/bug Categorizes issue as related to a bug.

Comments

@wsiqiang6

Bug Report

  1. What did you do?
    tiup cluster upgrade <cluster_name>

In the TiKV evict leader phase:
error requesting pd api , response: no leader

  2. What did you expect to see?

Investigation showed that, because a leader priority is configured in PD, a PD leader switch occurred during the PD stage of the cluster upgrade. PD then checks the leader priority every minute, and that check triggered a PD leader transfer that took 0.5 seconds.

Coincidentally, during this 0.5-second window, the upgrade had already reached the TiKV stage and was performing the "set leader evict scheduler" operation. The request to PD returned a "no leader" error, which caused TiUP to exit.

I think a retry mechanism should be added when calling the PD API, so that TiUP upgrade or reload operations are not interrupted by such short-lived changes in PD.

  3. What did you see instead?
    TiUP exits with an error.

  4. What version of TiUP are you using (tiup --version)?
    v1.14.0

wsiqiang6 added the type/bug label Jan 17, 2025