Computational backend: Use S3 Envs for AWS clusters that live inside the simcore stack #4640

Closed
1 task done
Tracked by #617
sanderegg opened this issue Aug 22, 2023 · 0 comments
Assignees
Labels
a:dask-service Any of the dask services: dask-scheduler/sidecar or worker

sanderegg commented Aug 22, 2023

When separate clusters live within the same AWS system, there is no need to fall back to pre-signed upload links, since the cluster is still internal to osparc-simcore. Such clusters could use the S3 envs (credentials) directly instead, which would solve the issue of uploading files >5GB.
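
For context on the >5GB figure: Amazon S3 documents a 5 GB ceiling for a single PUT request (the operation a plain pre-signed upload link authorizes), while multipart uploads allow up to 10,000 parts of 5 MiB to 5 GiB each. The sketch below is not code from osparc-simcore; the function names and the 64 MiB default part size are illustrative assumptions, and the 5 GB limit is taken here as 5 GiB.

```python
import math

# Documented Amazon S3 limits (assumed values, not taken from osparc-simcore).
MAX_SINGLE_PUT_BYTES = 5 * 1024**3  # largest object for one PUT, taken as 5 GiB
MIN_PART_BYTES = 5 * 1024**2        # smallest multipart part (except the last)
MAX_PART_BYTES = 5 * 1024**3        # largest multipart part
MAX_PARTS = 10_000                  # maximum number of parts per upload


def fits_single_put(file_size: int) -> bool:
    """Return True if one PUT request can carry the whole file."""
    return file_size <= MAX_SINGLE_PUT_BYTES


def plan_multipart(file_size: int, part_size: int = 64 * 1024**2) -> tuple[int, int]:
    """Return (part_size, part_count) for a multipart upload.

    Hypothetical helper: the part size is grown when needed so that the
    part count never exceeds MAX_PARTS.
    """
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    part_size = max(part_size, MIN_PART_BYTES, math.ceil(file_size / MAX_PARTS))
    if part_size > MAX_PART_BYTES:
        raise ValueError("file is larger than S3 multipart upload allows")
    return part_size, math.ceil(file_size / part_size)
```

With these assumptions, a 6 GiB output does not fit a single PUT and would be split into 96 parts of 64 MiB.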

BUT:

  • Pre-signed links are always created with an expiration time (this is S3 policy, which makes sense), so the director-v2 (dv2) currently creates the upload links with an expiration time. If the computational service does not complete within that time, the upload link becomes invalid. (NOTE: this also happens on the default cluster.)
  • A quick & dirty fix is to set that expiration time to an insane value.
  • A better fix is to make the dask-sidecar use the osparc PublicAPI, which now allows uploading files >5GB. That would require passing an API key/secret with upload rights to the dask-sidecar, which would then upload to the correct location (is that actually possible??). /projects_id/node_id/output... should not go in the public API.
  • After discussion, a more sustainable version of this is to have a separate service (clusters-keeper? or another) that provides an entrypoint solely for computational workers to access upload links. Per-worker authentication would then even be possible.
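
The expiration problem described in the list above can be sketched as follows. Signature Version 4 pre-signed URLs carry their signing time and lifetime in the `X-Amz-Date` and `X-Amz-Expires` query parameters, so one can check whether a link would outlive a job. This is a minimal sketch, not the director-v2 implementation; the function names, the example URL, and the notion of a maximum job runtime are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_link_expiry(url: str) -> datetime:
    """Return the UTC instant at which a SigV4 pre-signed URL stops working."""
    query = parse_qs(urlsplit(url).query)
    # X-Amz-Date is the signing time, X-Amz-Expires the lifetime in seconds.
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    return signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))


def link_outlives_job(url: str, job_start: datetime, max_runtime: timedelta) -> bool:
    """Hypothetical check: is the link still valid when the job may finish?"""
    return presigned_link_expiry(url) >= job_start + max_runtime


# Made-up example URL with a one-hour lifetime (signature is a placeholder).
EXAMPLE_URL = (
    "https://example-bucket.s3.amazonaws.com/some/key"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20230822T120000Z"
    "&X-Amz-Expires=3600"
    "&X-Amz-Signature=placeholder"
)
```

A job started at signing time with a two-hour runtime would fail this check. For AWS S3, `X-Amz-Expires` is capped at 604800 seconds (7 days), so the quick & dirty fix only postpones the problem for very long computations.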

Baklava

  1. a:director-v2
    sanderegg
@sanderegg sanderegg transferred this issue from ITISFoundation/osparc-issues Aug 22, 2023
@sanderegg sanderegg self-assigned this Aug 22, 2023
@sanderegg sanderegg added the a:dask-service Any of the dask services: dask-scheduler/sidecar or worker label Aug 22, 2023
@sanderegg sanderegg changed the title Use S3 Envs for external internal clusters to go over 5Gb Computational backend: Use S3 Envs for AWS clusters that live inside the simcore stack Aug 23, 2023
3 participants