retry upload if it failed? #9696
Dear Dataverse Support,

What steps does it take to reproduce the issue?
You start uploading a large file, then your network goes down for a short time.

When does this issue occur?
When your network goes down.

Which page(s) does it occur on?
Upload with HTTP via your browser.

What happens?
Your upload fails and you need to start over.

To whom does it occur (all users, curators, superusers)?
All users.

What did you expect to happen?
Is it possible to set up an upload retry like other file transfer tools such as SFTP or FileZilla?

Which version of Dataverse are you using?
5.10
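For what it's worth, when uploading through the native API instead of the browser, a retry loop can be scripted client-side today. A minimal sketch, assuming the documented `/api/datasets/:persistentId/add` endpoint; the server URL, API token, and dataset PID below are placeholders, and note that this resends the whole file rather than resuming a partial upload:

```python
# Client-side retry loop for the Dataverse native API "add file" call.
# SERVER, API_TOKEN, and PID are placeholders for your installation's values.
import time
import requests

SERVER = "https://demo.dataverse.org"            # hypothetical server
API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx"   # hypothetical API token
PID = "doi:10.5072/FK2/EXAMPLE"                  # hypothetical dataset PID

def upload_with_retries(path, attempts=5, backoff=30):
    url = f"{SERVER}/api/datasets/:persistentId/add"
    for attempt in range(1, attempts + 1):
        try:
            with open(path, "rb") as f:
                r = requests.post(
                    url,
                    params={"persistentId": PID},
                    headers={"X-Dataverse-key": API_TOKEN},
                    files={"file": f},
                    timeout=600,
                )
            if r.ok:
                return r.json()
            print(f"attempt {attempt}: HTTP {r.status_code}")
        except requests.RequestException as e:
            # Network dropped mid-upload; wait, then resend the whole file.
            print(f"attempt {attempt}: {e}")
        time.sleep(backoff)
    raise RuntimeError(f"giving up on {path} after {attempts} attempts")

upload_with_retries("data.csv")
```

This sketch illustrates the limitation described above: without resumable uploads, a retry means starting the file over from byte zero.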
Comments

@alejandratenorio hi! We don't have a great solution for restarting file uploads. We did add support for rsync, but we're probably going to remove it or at least deprecate it. Do you happen to store your files on S3? I'm asking because there's a feature we call S3 direct upload, where the files travel from the user's computer directly to S3 instead of passing through Dataverse.
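For background on why direct upload can help here: S3 multipart transfers retry at the granularity of individual parts, so a brief network drop costs one part rather than the whole file. A rough sketch of that behavior using boto3 (illustrative only; the bucket and key names are made up, and the browser-based direct upload uses presigned URLs rather than boto3):

```python
# Sketch: multipart uploads to S3 retry failed parts, not whole files.
import boto3
from botocore.config import Config
from boto3.s3.transfer import TransferConfig

# Enable botocore's standard retry mode for transient failures.
s3 = boto3.client(
    "s3",
    config=Config(retries={"max_attempts": 10, "mode": "standard"}),
)

transfer_config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,  # split files larger than 64 MiB
    multipart_chunksize=16 * 1024 * 1024,  # 16 MiB parts
)

# Each 16 MiB part is retried independently on a transient failure,
# so a brief network drop does not force resending the whole file.
s3.upload_file("data.csv", "my-dataverse-bucket", "dataset/data.csv",
               Config=transfer_config)
```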
Hi @pdurbin,

Thanks for your response. Yes, we store on S3 and have enabled S3 direct upload. We upload many files simultaneously per dataset and usually don't have problems, but when our network goes down we have to restart the upload manually. We will be upgrading our Dataverse soon. We have an increasing need to upload more and more files per dataset and would use rsync as a solution, but if it is removed, what other tool could we use to facilitate file uploads?

Thanks,
Hmm. Another option might be Globus. I checked with the team and @qqmyers had this to say (thanks, Jim): "Globus does do retries, not sure when it does partial retries (not resending bytes that made it)." Here's a handy link to the docs: https://guides.dataverse.org/en/5.13/developers/big-data-support.html#globus-file-transfer

Another workaround might be to keep the files as zips, but there are tradeoffs. 😬

@ErykKul was recently talking about a dataset with thousands of files in #9558. Maybe he has some thoughts. You could also ask at https://groups.google.com/g/dataverse-community of course! 😄
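To illustrate the Globus retry/sync behavior Jim mentions, here's a sketch using the globus_sdk Python package. The access token and endpoint UUIDs are placeholders, and the auth flow for obtaining a transfer token is omitted (see the Globus SDK docs); with sync_level="checksum", re-submitting the same transfer only sends files that are missing or differ:

```python
# Sketch of a Globus transfer with checksum syncing via globus_sdk.
import globus_sdk

TOKEN = "..."                                   # hypothetical transfer access token
SRC = "00000000-0000-0000-0000-000000000001"    # hypothetical source endpoint UUID
DST = "00000000-0000-0000-0000-000000000002"    # hypothetical destination endpoint UUID

tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(TOKEN)
)

tdata = globus_sdk.TransferData(
    tc, SRC, DST,
    label="dataset upload",
    sync_level="checksum",  # re-running only sends files that differ
)
tdata.add_item("/local/dataset/", "/remote/dataset/", recursive=True)

# Globus retries transient faults on the service side; rerunning this
# script after a failure effectively resumes at the file level.
task = tc.submit_transfer(tdata)
print("task id:", task["task_id"])
```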
FWIW: If the issue is failures where whole files have been uploaded (and not partial files), a tool like the DVUploader might help. It can be run repeatedly to upload n files at a time (versus trying to upload all files in a long list at once).

If that works, it may not be too hard to add a similar limit in the UI direct upload and the dvwebloader plugin. (Both try to register all files with Dataverse at once, since that is most efficient, but they could be changed to push every n files. This could raise the issue of trying to explain partial successes in the UI; the dvwebloader might be better there, since it can already detect and show when files on your disk already exist in the dataset.)

In any case, those types of changes would require some programming work, but the DVUploader could be scripted now, with the current Dataverse release.
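A sketch of what scripting repeated DVUploader runs could look like, on the assumption that DVUploader skips files already in the dataset and exits nonzero on failure; the jar name, flags, server, PID, and API token below are placeholders to check against your DVUploader version's README:

```python
# Sketch: rerun DVUploader until a run completes cleanly. Because files
# already in the dataset are skipped, each rerun only uploads what is
# missing after a network failure.
import subprocess
import time

CMD = [
    "java", "-jar", "DVUploader-v1.1.0.jar",    # hypothetical jar version
    "-server=https://demo.dataverse.org",        # hypothetical server
    "-did=doi:10.5072/FK2/EXAMPLE",              # hypothetical dataset PID
    "-key=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx",     # hypothetical API token
    "dataset_files/",                            # directory of files to upload
]

for attempt in range(1, 6):
    result = subprocess.run(CMD)
    if result.returncode == 0:
        break  # all files registered; previously uploaded files were skipped
    print(f"run {attempt} failed; retrying after the network settles")
    time.sleep(60)
```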