When upgrading or reloading a cluster, increase retries when accessing PD #2494

wsiqiang6 · 2025-01-17T09:06:35Z

Bug Report

What did you do?
tiup cluster upgrade <cluster_name>

In the TiKV evict-leader phase, the upgrade failed with:
error requesting pd api , response: no leader

What did you expect to see?

After investigation, we found that, because of the leader priority setting in PD, the PD leader switched during the PD stage of the cluster upgrade. PD then checked the leader priority, which it does every minute, and this triggered a PD leader transfer that took 0.5 seconds.

Coincidentally, during this 0.5-second window, the upgrade had already reached the TiKV stage and was performing the "set leader evict scheduler" operation. The request to PD returned a "no leader" error, which caused TiUP to exit.

I think a retry mechanism should be added when calling the PD API to prevent TiUP upgrade or reload operations from being interrupted due to such short-term changes in PD.

What did you see instead?
TiUP exits with an error.

What version of TiUP are you using (tiup --version)?
v1.14.0


wsiqiang6 added the type/bug label on Jan 17, 2025
