Hi,
while the timeouts for most calls of HANA binaries and python scripts have been made configurable with commit 7c66a3b, the call to run getParameter.py to get the operation mode still uses a hardcoded timeout of 10 seconds:
https://github.com/SUSE/SAPHanaSR/blob/maintenance-classic/ra/SAPHana#L2664
For other calls of getParameter.py in the resource agents, however, the $HANA_CALL_TIMEOUT variable provides a configurable timeout.
Is there a specific reason why a hardcoded timeout is used for the getParameter.py call to get the operation mode, or would it be possible to also make the timeout configurable by using the $HANA_CALL_TIMEOUT variable instead?
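To illustrate what is being asked, here is a minimal sketch of the difference between a fixed and a configurable timeout. It is not the actual SAPHana resource agent code: the wrapper name `hana_call`, the default of 60 seconds, and the `echo` stand-in for the getParameter.py call are assumptions made for illustration. Only `$HANA_CALL_TIMEOUT` and the `timeout` utility from GNU coreutils are taken as given.

```shell
# Sketch only: not the actual SAPHana RA code.
# HANA_CALL_TIMEOUT is the variable named in this issue;
# the fallback value of 60 seconds is an assumption.
HANA_CALL_TIMEOUT="${HANA_CALL_TIMEOUT:-60}"

# Hypothetical wrapper: run an external command under the configurable timeout.
# GNU `timeout` exits with status 124 if the command ran too long.
hana_call() {
    timeout "$HANA_CALL_TIMEOUT" "$@"
}

# Hardcoded variant, as described in this issue:
#   timeout 10 <getParameter.py call>
# Configurable variant, with `echo` standing in for the real call:
hana_call echo "operation_mode=logreplay"
```

With a wrapper like this, the caller can tell a timeout apart from other failures by checking for exit status 124.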
The first reason was that getParameter.py should always answer very fast. Do you have a realistic situation where getParameter.py did not answer in time? What might be the reason for this? A hanging NFS share? I have not reviewed that at code level yet. My guess is that, other than for systemReplicationStatus.py, where we have a hard argument to stay with the short timeout, we might change that for getParameter.py. But we should also take into account that not all hanging resources can be addressed by the SAPHanaSR* resource agents. In particular, the classic SAPHanaSR resource agents are not independent of the cluster system environment.
Just my first 2ct.
I don't have an actual situation where getParameter.py did not answer in time. I was just asked by some colleagues why there is a hardcoded timeout for the call of getParameter.py in this specific case, whereas for other calls of getParameter.py the configurable timeout is used in the resource agents.
I have just reviewed: https://github.com/SUSE/SAPHanaSR/blob/maintenance-classic/ra/SAPHana#L2664
This is "only" the operation mode of the SR. It needs to be acquired only once* before a former primary is registered. So we selected a shorter timeout to prevent overly long RA runtimes caused by adding long timeouts in sequence.
But maybe we should implement the fallback: if getting the log-mode times out, the function should keep the old value from an earlier query.
*) We query the status more than once to pick up updates, in case something changes during cluster runtime.
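The fallback idea could look roughly like the sketch below. This is an assumption-laden illustration, not the RA's implementation: the names `OP_MODE_TIMEOUT`, `op_mode`, and `refresh_operation_mode` are hypothetical, and the query command is passed in as arguments rather than reproducing the real getParameter.py call.

```shell
# Sketch only: hypothetical names, not the actual SAPHana RA code.
OP_MODE_TIMEOUT="${OP_MODE_TIMEOUT:-10}"   # short timeout, as discussed above
op_mode=""                                  # value from the last successful query

# Hypothetical helper: run the query command given as arguments under the
# short timeout. On success, store the new value in op_mode and return 0.
# On timeout, error, or empty output, leave op_mode untouched and return 1.
refresh_operation_mode() {
    local result
    if result=$(timeout "$OP_MODE_TIMEOUT" "$@") && [ -n "$result" ]; then
        op_mode="$result"
        return 0
    fi
    return 1
}
```

The function sets a variable instead of printing the value, so that the cached result is not lost in a command-substitution subshell. Note that this cache only lives inside a single process; keeping the value between separate RA invocations would need some persistent store, which is outside this sketch.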