-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Zipfile lib overwrites the extra field during closing when the archive size is more then ZIP64_LIMIT #88233
Comments
The current zipFile implementation supports the allowZip64,which can make large zip files. To reproduce it:
When I open the zip again, the files processed after the ZIP64_LIMIT is reached will have their extra fields overwritten. I have attached the zip that contained the python used to add the new file and the images of zip archive before adding new files and after. |
The issue stems from the following code inside the if zinfo.header_offset > ZIP64_LIMIT:
extra.append(zinfo.header_offset)
header_offset = 0xffffffff
else:
header_offset = zinfo.header_offset
extra_data = zinfo.extra
min_version = 0
if extra:
# Append a ZIP64 field to the extra's
extra_data = _strip_extra(extra_data, (1,))
extra_data = struct.pack(
'<HH' + 'Q'*len(extra),
1, 8*len(extra), *extra) + extra_data
min_version = ZIP64_VERSION |
I was looking at Here's a fixed version: def _strip_extra(extra, xids):
# Remove Extra Fields with specified IDs.
unpack = _EXTRA_FIELD_STRUCT.unpack
modified = False
buffer = []
start = i = 0
while i + 4 <= len(extra):
xid, xlen = unpack(extra[i : i + 4])
j = i + 4 + xlen
if xid in xids:
if i != start:
buffer.append(extra[start : i])
start = j
modified = True
i = j
if i != start:
buffer.append(extra[start : i])
if not modified:
return extra
return b''.join(buffer) Or this one, easier to understand: def _strip_extra(extra, xids):
# Remove Extra Fields with specified IDs.
unpack = _EXTRA_FIELD_STRUCT.unpack
modified = False
buffer = []
i = 0
while i + 4 <= len(extra):
xid, xlen = unpack(extra[i : i + 4])
j = i + 4 + xlen
if xid in xids:
modified = True
else:
buffer.append(extra[i : j])
i = j
if not modified:
return extra
return b''.join(buffer) Not sure which one is better. |
I found the same bug while working on another issue. I prefer the easier-to-understand one. |
Previously, any data _after_ the zip64 extra would be removed. With many new tests. Fixes python#88233
I ended up not going with the easier-to-understand version, because it failed to pass through 1-3 bytes of garbage after the last extra, which we pass through in the "not modified" case already. |
Previously, any data _after_ the zip64 extra would be removed. With many new tests. Fixes python#88233
Previously, any data _after_ the zip64 extra would be removed. With many new tests. Fixes python#88233
In an attempt to make the code easier to comprehend and less prone to error, I drafted #102084. |
…96161) Previously, any data _after_ the zip64 extra would be removed. With many new tests. Fixes pythonGH-88233 (cherry picked from commit 59e86ca) Co-authored-by: Tim Hatch <[email protected]> Automerge-Triggered-By: GH:jaraco
…96161) Previously, any data _after_ the zip64 extra would be removed. With many new tests. Fixes pythonGH-88233 (cherry picked from commit 59e86ca) Co-authored-by: Tim Hatch <[email protected]> Automerge-Triggered-By: GH:jaraco
Previously, any data _after_ the zip64 extra would be removed. With many new tests. Fixes GH-88233 (cherry picked from commit 59e86ca) Co-authored-by: Tim Hatch <[email protected]> Automerge-Triggered-By: GH:jaraco
* main: (60 commits) pythongh-102056: Fix a few bugs in error handling of exception printing code (python#102078) pythongh-102011: use sys.exception() instead of sys.exc_info() in docs where possible (python#102012) pythongh-101566: Sync with zipp 3.14. (pythonGH-102018) pythonGH-99818: improve the documentation for zipfile.Path and Traversable (pythonGH-101589) pythongh-88233: zipfile: handle extras after a zip64 extra (pythonGH-96161) pythongh-101981: Apply HOMEBREW related environment variables (pythongh-102074) pythongh-101907: Stop using `_Py_OPCODE` and `_Py_OPARG` macros (pythonGH-101912) pythongh-101819: Adapt _io types to heap types, batch 1 (pythonGH-101949) pythongh-101981: Build macOS as recommended by the devguide (pythonGH-102070) pythongh-97786: Fix compiler warnings in pytime.c (python#101826) pythongh-101578: Amend PyErr_{Set,Get}RaisedException docs (python#101962) Misc improvements to the float tutorial (pythonGH-102052) pythongh-85417: Clarify behaviour on branch cuts in cmath module (python#102046) pythongh-100425: Update tutorial docs related to sum() accuracy (FH-101854) Add missing 'is' to `cmath.log()` docstring (python#102049) pythongh-100210: Correct the comment link for unescaping HTML (python#100212) pythongh-97930: Also include subdirectory in makefile. (python#102030) pythongh-99735: Use required=True in argparse subparsers example (python#100927) Fix incorrectly documented attribute in csv docs (python#101250) pythonGH-84783: Make the slice object hashable (pythonGH-101264) ...
* main: (225 commits) pythongh-102056: Fix a few bugs in error handling of exception printing code (python#102078) pythongh-102011: use sys.exception() instead of sys.exc_info() in docs where possible (python#102012) pythongh-101566: Sync with zipp 3.14. (pythonGH-102018) pythonGH-99818: improve the documentation for zipfile.Path and Traversable (pythonGH-101589) pythongh-88233: zipfile: handle extras after a zip64 extra (pythonGH-96161) pythongh-101981: Apply HOMEBREW related environment variables (pythongh-102074) pythongh-101907: Stop using `_Py_OPCODE` and `_Py_OPARG` macros (pythonGH-101912) pythongh-101819: Adapt _io types to heap types, batch 1 (pythonGH-101949) pythongh-101981: Build macOS as recommended by the devguide (pythonGH-102070) pythongh-97786: Fix compiler warnings in pytime.c (python#101826) pythongh-101578: Amend PyErr_{Set,Get}RaisedException docs (python#101962) Misc improvements to the float tutorial (pythonGH-102052) pythongh-85417: Clarify behaviour on branch cuts in cmath module (python#102046) pythongh-100425: Update tutorial docs related to sum() accuracy (FH-101854) Add missing 'is' to `cmath.log()` docstring (python#102049) pythongh-100210: Correct the comment link for unescaping HTML (python#100212) pythongh-97930: Also include subdirectory in makefile. (python#102030) pythongh-99735: Use required=True in argparse subparsers example (python#100927) Fix incorrectly documented attribute in csv docs (python#101250) pythonGH-84783: Make the slice object hashable (pythonGH-101264) ...
#102087) Previously, any data _after_ the zip64 extra would be removed. With many new tests. Fixes GH-88233 (cherry picked from commit 59e86ca) Co-authored-by: Tim Hatch <[email protected]>
* Refactor zipfile._strip_extra to use higher level abstractions for extras instead of a heavy-state loop. * Add blurb * Remove _strip_extra and use _Extra.strip directly. * Use memoryview to avoid unnecessary copies while splitting Extras.
* Refactor zipfile._strip_extra to use higher level abstractions for extras instead of a heavy-state loop. * Add blurb * Remove _strip_extra and use _Extra.strip directly. * Use memoryview to avoid unnecessary copies while splitting Extras.
…96161) Previously, any data _after_ the zip64 extra would be removed. With many new tests. Fixes python#88233 Automerge-Triggered-By: GH:jaraco
…96161) Previously, any data _after_ the zip64 extra would be removed. With many new tests. Fixes python#88233 Automerge-Triggered-By: GH:jaraco
* Refactor zipfile._strip_extra to use higher level abstractions for extras instead of a heavy-state loop. * Add blurb * Remove _strip_extra and use _Extra.strip directly. * Use memoryview to avoid unnecessary copies while splitting Extras.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
Linked PRs
The text was updated successfully, but these errors were encountered: