-
Notifications
You must be signed in to change notification settings - Fork 661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Alternative for managing task array status in Google Batch #5723
Alternative for managing task array status in Google Batch #5723
Conversation
34ced37
to
5f398ea
Compare
Bit lost, is there any relation between this PR and #5690 ? |
Yes, in #5690 we revert to the old job status management when task is not an array, but we kept the same for task arrays. In this PR, I also changed the way how task array statuses are queried to use the listTasks instead of getTaskStatus. This also avoids the NotFoundException because we do not call the problematic getTaskStatus. |
Is this ready for review? |
Signed-off-by: jorgee <[email protected]>
Signed-off-by: jorgee <[email protected]>
03e610b
to
070932b
Compare
Signed-off-by: jorgee <[email protected]>
I have rebased with latest changes from the parent branch. It is ready |
plugins/nf-google/src/main/nextflow/cloud/google/batch/client/BatchClient.groovy
Outdated
Show resolved
Hide resolved
plugins/nf-google/src/main/nextflow/cloud/google/batch/client/BatchClient.groovy
Outdated
Show resolved
Hide resolved
|
||
private void updateArrayTasks(String jobId, long now){ | ||
for( Task t: listTasks(jobId) ){ | ||
arrayTaskStatus.put(t.name, new TaskStatusRecord(t.status, now)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there can be any race condition on accessing the arrayTaskStatus
map?
…BatchClient.groovy [ci skip] Signed-off-by: Paolo Di Tommaso <[email protected]>
…BatchClient.groovy [ci skip] Signed-off-by: Paolo Di Tommaso <[email protected]>
Moving fwd this |
195ba32
into
5422-not_found-error-on-google-batch
This PR is an alternative to process task status in task arrays.
In the current status, every time that getTaskState is called, we perform two calls to the Google Batch client, one to retrieve the list of tasks(
listTasks
) and another to get the task state (getTaskStatus
). This second call is problematic because it generates the NotFoundException and it is also redundant because the first call already provides the task descriptions including their states.The BatchClient has a HashMap as a cache of array task status. A new method is created to retrieve the status of a task belonging to a task array. Instead of making the call to the Google Batch API, it checks the status in the cache. When the status is not in the cache or is outdated, the
listTasks
method is called to update all the array tasks statuses. So, the rest of the array tasks do not require querying the Google Batch API again.When there is no task status, it returns null and fallbacks to the
getJobStatus.
This is the same as we were doing when no tasks were retrieved from the job or there was aNotFoundException
.The invalidation time is 1 second because it is the same as the one in
GoogleBatchTaskHandler
. Another alternative is setting it with the same value as the polling interval.