zip 2.2.2 scans for large parts of the file while opening a ZipArchive #280

Open
jrudolph opened this issue Jan 17, 2025 · 1 comment · May be fixed by #281
Labels
bug Something isn't working

Comments

@jrudolph

Describe the bug

(I know about recently closed #231)

I tried 2.2.2 on a 100+ GB zip file on network storage, and get_metadata still scans large parts of the file because of this validation in central_header_to_zip_file:

zip2/src/read.rs

Lines 1071 to 1077 in 7c20fa3

    let data_start = find_data_start(&file, reader)?;
    if data_start > central_directory.directory_start {
        return Err(InvalidArchive(
            "File data can't start after the central directory",
        ));
    }

There might be a reason (an invariant) for running this check when the archive is opened, but for random access to big files (especially over the network), it would be good if those checks were deferred until the file is accessed.
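To make the suggested deferral concrete, here is a minimal sketch of computing and validating the data start lazily, on first access to an entry, and caching the result. It is not the zip crate's code: `LazyEntry`, its fields, and its `data_start` method are hypothetical names, and the offset arithmetic assumes only the standard ZIP local file header layout (30 fixed bytes, with the file name length at offset 26 and the extra field length at offset 28).

```rust
use std::io::{self, Read, Seek, SeekFrom};
use std::sync::OnceLock;

/// Size of the fixed part of a ZIP local file header, in bytes.
const LOCAL_HEADER_FIXED_LEN: u64 = 30;
/// Local file header signature ("PK\x03\x04") read as a little-endian u32.
const LOCAL_HEADER_SIGNATURE: u32 = 0x0403_4b50;

/// Hypothetical per-entry record; NOT the zip crate's actual data structure.
pub struct LazyEntry {
    /// Offset of the local file header, as recorded in the central directory.
    pub header_start: u64,
    /// Offset of the entry's data, filled in on first access.
    data_start: OnceLock<u64>,
}

impl LazyEntry {
    /// Creating an entry does no I/O, so opening the archive stays cheap.
    pub fn new(header_start: u64) -> Self {
        Self { header_start, data_start: OnceLock::new() }
    }

    /// Reads the local header once, validates the resulting offset against
    /// the start of the central directory, and caches it for later calls.
    pub fn data_start<R: Read + Seek>(
        &self,
        reader: &mut R,
        directory_start: u64,
    ) -> io::Result<u64> {
        if let Some(&cached) = self.data_start.get() {
            return Ok(cached);
        }

        reader.seek(SeekFrom::Start(self.header_start))?;
        let mut header = [0u8; LOCAL_HEADER_FIXED_LEN as usize];
        reader.read_exact(&mut header)?;

        let signature = u32::from_le_bytes([header[0], header[1], header[2], header[3]]);
        if signature != LOCAL_HEADER_SIGNATURE {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "Invalid local file header",
            ));
        }

        // Variable-length fields that sit between the fixed header and the data.
        let name_len = u64::from(u16::from_le_bytes([header[26], header[27]]));
        let extra_len = u64::from(u16::from_le_bytes([header[28], header[29]]));
        let start = self.header_start + LOCAL_HEADER_FIXED_LEN + name_len + extra_len;

        // The same check as in the issue, but performed per entry on access.
        if start > directory_start {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "File data can't start after the central directory",
            ));
        }

        // A failed `set` only means another caller cached the value first.
        let _ = self.data_start.set(start);
        Ok(start)
    }
}
```

With this shape, opening the archive would touch only the central directory, and each entry would cost one extra seek and a 30-byte read the first time it is accessed. The trade-off is that a malformed entry is reported when it is read rather than when the archive is opened, which may be the invariant the failing tests rely on.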

Full stack trace:

   3: zip::read::find_data_start
             at /home/johannes/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zip-2.2.2/src/read.rs:352:5
   4: zip::read::central_header_to_zip_file
             at /home/johannes/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zip-2.2.2/src/read.rs:1071:22
   5: zip::read::<impl zip::read::zip_archive::ZipArchive<R>>::read_central_header
             at /home/johannes/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zip-2.2.2/src/read.rs:654:24
   6: zip::read::<impl zip::read::zip_archive::ZipArchive<R>>::get_metadata::{{closure}}
             at /home/johannes/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zip-2.2.2/src/read.rs:616:34
   7: core::result::Result<T,E>::and_then
             at /home/johannes/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:1348:22
   8: zip::read::<impl zip::read::zip_archive::ZipArchive<R>>::get_metadata
             at /home/johannes/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zip-2.2.2/src/read.rs:615:30
   9: zip::read::<impl zip::read::zip_archive::ZipArchive<R>>::with_config
             at /home/johannes/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zip-2.2.2/src/read.rs:714:22
  10: zip::read::<impl zip::read::zip_archive::ZipArchive<R>>::new
             at /home/johannes/.cargo/registry/src/index.crates.io-6f17d22bba15001f/zip-2.2.2/src/read.rs:707:9
@jrudolph jrudolph added the bug Something isn't working label Jan 17, 2025
@jrudolph
Author

Commenting out those lines fixes my use case but leads to (at least) these tests failing:

failures:
    write::test::fuzz_crash_2024_06_21
    write::test::fuzz_crash_2024_07_19
    write::test::test_deep_copy
    write::test::test_fuzz_crash_2024_06_14
    write::test::test_fuzz_crash_2024_06_14c
    write::test::test_fuzz_crash_2024_06_17a
    write::test::test_fuzz_crash_2024_06_18b
    write::test::test_fuzz_failure_2024_06_08
