-
Notifications
You must be signed in to change notification settings - Fork 501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
replace ZipInputStream with ZipFile #10899
Conversation
src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateNewDataFilesCommand.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. I'd suggest adding a one-line release note, e.g. Unzip during upload now supports more variations of the zip format, including the zip files generated by ownCloud.
Test is failing: execute_rezips_sets_of_shape_files_from_uploaded_zip – edu.harvard.iq.dataverse.engine.command.impl.CreateNewDataFilesTest |
*/ | ||
var nrOfZipFiles = 20; | ||
var avgNrOfFilesPerZip = 300; | ||
var avgFileLength = 5000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Results of this relatively small stress test executed with the old an new implementation:
ZipFile :: Total time: 150,909ms; nr of zips 20 total nr of files 6,404; total file size 32,030,650
ZipInputStream:: Total time: 148,432ms; nr of zips 20 total nr of files 6,211; total file size 31,570,383
# Conflicts: # src/main/java/edu/harvard/iq/dataverse/util/ShapefileHandler.java
Intellij shows directory labels like ewDataFilesTest/tmp/temp/shp_2024-10-22-01-57-21-833/dataDir/extra possibly different environments have different values
Fix looks good. |
What this PR does / why we need it:
It will no longer fail to upload individual files from a zip downloaded from an ownCloud service.
See issue for further details.
Which issue(s) this PR closes:
Special notes for your reviewer:
You will get less differences when ignoring white space changes:
https://github.com/IQSS/dataverse/compare/develop...DANS-KNAW-jp:dataverse:10898-own-cloud-zips?w=1
Replaced ZipInputStream with ZipFile in:
Additional
CreateNewDataFilesCommand
: extracted methodsfilteredZipEntries
,openZipFile
,getShortName
andisFileToSkip
The extracted code is slightly different from the ShapeFileHandler.isFileToSkip but changing behavior is beyond the scope of the issue.CreateNewDataFilesCommand
or should I call it an integragtion test for the changed classes.FileUtil.determineFileType
catches Bag exceptions The method will now returnapplication/zip
rather than throw. Without catching I would need complex mocking to get a BagitFileHandler via CDI just to test the rest.Suggestions on how to test this:
see issue
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
No
Is there a release notes update needed for this change?:
updated from comment:
Unzip during upload now supports more variations of the zip format, including the zip files generated by ownCloud.
Additional documentation: