-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
"A local file header is corrupt" error after upgrading to 3.0/3.1 #1094
Comments
Thanks @Xantrul . Given the time of year, we likely can't look at this immediately. Would you consider trying 3.0 previews to get a better idea when the problem change went in? They can be installed side by side here |
@danmosemsft Totally understand re time of the year - I am on vacation myself 🎄 I'll see if someone from my team is in the office and can compare different previews, or I can do it myself (if still relevant) in January. |
@danmosemsft Tried few different versions of SDK and it looks like the problem was introduced in preview7 (preview6 works fine). |
Any news on this one? |
I've also hit this bug in my WPF app which handles large Zip files. I've tested targeting .NET framework and everything works as expected with the issue only happening with .NET Core. I'll be in Redmond next week (3rd of Feb) if you're around and need a few example files that cause problems. It'd also be great to get an understanding of when a fix might be pushed. |
@eerhardt any cycles to look at this one? Possible servicing candidate. |
Yep. I’ll take a look tomorrow. |
I was able to repro the original issue posted and debugged the problem. From what I can tell, the error is here: runtime/src/libraries/System.IO.Compression/src/System/IO/Compression/ZipBlocks.cs Lines 519 to 520 in 6a0dafb
In the originally posted issue, the uncompressed file is larger than I've pushed a branch with the fix applied: https://github.com/eerhardt/runtime/tree/Fix1094. This allows the original file to be read successfully. I'll add a test to that branch and then make a PR for the fix. We also should service /cc @buyaa-n @MikeCodesDotNET - I was also able to reproduce the same exception message with the file you sent me I'm unsure of what your error is - it may actually be a corrupt .zip file. Can you open a new issue for your scenario? For the |
Nice. The same error exists for 64 bit just above - although it's safe to say that it will never cause a problem.. compressedSize = reader.ReadInt64();
uncompressedSize = reader.ReadInt64();
|
Actually, maybe not -- it seems we use signed longs even in the public API - and there are checks elsewhere ( |
@eerhardt I've looked into the files a little more and it looks like you're correct that the Zip is invalid. I've pinged the company responsible for creating the files to see if they've any input on this. It looks like one of the images is the cause of the problem, with most zip tools will gracefully handle the errors (as .NET framework did before).
Because this seems to be expected behaviour with .NET Core, I wont log any further issues regarding this. Though, It'd be nice to have an option to extract without validation. Cheers, |
I was thinking something similar actually. Your zip file does load successfully in .NET Core 2.1, but with the extra validation we added in 3.1, it now fails. @buyaa-n @ericstj @carlossanlop @ahsonkhan @ViktorHofer - thoughts? It would probably require adding a new public API to disable the validation - unless we wanted to "quirks"-mode it, but that wouldn't be my vote. I can propose something in a new issue, if people think it would be valuable. |
It would 100% be valuable given the breaking change between 2.1 and 3.1. Though I'll leave it to experts to decide if its right to include and how :) As an FYI, given that my issue only appeared after migrating to 3.1 from Framework (I'm using WPF), should the API not keep backwards compatibility so that desktop applications which rely on this functionality continue to work? |
I ran into the same exception while opening some files contained in in large zip files from an external source. It wouldn't surprise me if they were generated by Java based tools. The files can be read without problems with the same code compiled for the .NET Framework. Converting the project to .NET Core 3.1 is now blocked by this issue. The issue in the zip files that I have is that some files are flagged as Zip64 (version needed to extract = 45 in the local file header), but the data descriptor contains 32 bit sizes (and are less than 4 GiB in size). So the check whether the size in the data descriptor matches the one in the central directory record fails. I implemented an ugly workaround for it, but then I ran into the issue that the check still fails for files with sizes >= 2 GiB (and less than 4 GiB) due to the sized incorrectly being interpreted as signed instead of unsigned. So the fix @eerhardt pushed would be very welcome, as it should make it possible to read the files again with the workaround. Anyway, 7-Zip, .NET Framework and other tools I have tried have no issues with the incorrect data descriptor. The information needed to read the data is in the central directory (and present in the ZipArchiveEntry), so not sure why the extra check has been implemented in the first place. It causes issues with files that might technically not conform to the specifications, but other software reads just fine. And I haven't found any software that can actually verify the correctness of a zip file and show the issue (to make it easier to report the problem to the organization that generated those files). So it would be nice to know which tool @MikeCodesDotNET used to check the zip file. For people who don't dive into the structure of zip files, it will seem to be a problem in .NET Core (3.0 and 3.1). If the validation is really needed, it could be an option to change the validation so that it can handle the case where a 32 bit data descriptor is present for files smaller than 4 GiB that are incorrectly tagged as Zip64. Not a nice solution either, but if there are a lot of zip files out there with similar issues, it could prevent a lot of frustration. |
We should fix the check to read as Uint32 in servicing for 3.1. You don't need API, typically we do this sort of thing with AppContext switches, which can be controlled via runtimeconfig IIRC. The benefit there is that you can do it in servicing without API. I'd prefer that we don't dirty the API here since this was clearly a bug in the validation. If we find cases where the validation is not buggy and we still flag legit zips (eg: a widespread bad behaving zip library) then we should consider adding an API to suppress the error. |
@d0tn3tc0d3r, I used the 'Advanced Diagnostics' in WinZip to validate the zip files. If you use the standard validation, it'll actually show the Zip files as valid! @ericstj As far as I can see, most zip tools do not validate, which is why WinZip and 7Zip will successfully unzip these invalide zips. Given that .NET Framework also gracefully handled the invalid zip fles, should we not try and keep backwards compatability? My preference would be add an optional boolean arguement to ExtractToDirectory method which enables or disables validation. Ideally with the default to disable validation to keep compability with .NET framework and 2.1. |
…d uint.MaxValue. This file reproduces the problem in dotnet/runtime#1094.
I think that issue should be fixed too, @eerhardt could you add fix for that with your Uint32 fix? or i can work on the fix CC @ericstj
We had issue ZipArchive extracting a tampered zip file which Uncompressed size in central directory is much bigger that real Uncompressed size #27741 |
…d uint.MaxValue. (#64) This file reproduces the problem in dotnet/runtime#1094.
@buyaa-n Thanks for the explanation of the reasons for the validation. In any case, I checked the zip file with WinZip Detailed Diagnostics, and it did not report any errors. These are the most important parts of the test results for a file that can't be read using .NET Core 3.0 or 3.1:
So, only the offset is in Zip64 format, that's why version required is 45 in the central directory. The uncompressed and compressed sizes are not specified in 64 bit format, as that is not needed for this file. That might be why it's not stored as 64 bits in the data descriptor. WinZip Detailed Diagnostics does not report that as a problem though. If needed, I can send someone a link to a file that can be downloaded. It might also be reproducible by creating a zip file that's bigger than 4 GiB and has a file that's smaller than 4 GiB and has a data descriptor, with the local file header located past the 4 GiB offset in the file. |
Yes, can you please send it to me? My email is public in my GH profile: https://github.com/eerhardt |
@eerhardt Do you think we'll get an API / the ability to gracefully handle invalid headers or should I focus my efforts on finding a workaround? Cheers, |
@d0tn3tc0d3r the zip you mentioned above and the zip @MikeCodesDotNET posted before are invalid zips. But as other tools creating them and accepting them we might update the API to allow the zip files having that minor issue, it would be helpful if you can share your file @MikeCodesDotNET for testing |
@MikeCodesDotNET never mind, I got the file from @eerhardt |
The current plan is to introduce an AppContext switch into both Along with that switch, we are also fixing the bug mentioned above - changing |
@eerhardt Wonderful! This is perfect. I'll keep badgering the original zip creators to fix it their end as well but I appreciate the ability to handle it on the .NET side as well. Thank you!! |
@buyaa-n based on which facts are you saying that the zip I'm reading is invalid? WinZip doesn't report it as invalid, and I haven't found any tool that does. Nobody else has reported having problems with these files as far as I know, and I have old files going back nearly 4 years that have the same issue. Right now, I don't think I can report the files as being invalid. The specifications regarding the data descriptor are a bit lacking in my opinion. It does not mention at all how a reader can know whether the data descriptor will have 64 bit or 32 bit sizes when reading the zip in streaming mode. And when writing in streaming mode, the writer doesn't need to know what the sizes will be at the time the local file header is written, so it can't be based on information in that header either. So I think that using the "version needed to extract" from the central directory to decide whether to read the sizes in the data descriptor as 32 bit or 64 bit values (and throw an exception if it's not the case) is not a good implementation. Anyway, I already sent @eerhardt a link, but I sent it to @MikeCodesDotNET as well. |
@d0tn3tc0d3r sorry i told that based on your previous comment:
Which sounded like invalid zip to me, but with debugging your file we see the issue same as you described above which is a bug, we are working on the fix, thanks for the file |
Looking at the implementation that used to be in the dotnet/wpf repo, they used a more forgiving algorithm for reading the data descriptor. Maybe using the same algorithm would work for us. I believe it would fix @d0tn3tc0d3r's scenario where some data descriptors have 4 byte lengths and some have 8 byte lengths. Looking at the spec https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT, the rule seems ambiguous to me:
The
An example of how these are different is that the I assume that is why the dotnet/wpf implementation tries reading it each way and comparing the values with the central directory entry to figure out which way it was written. |
While experimenting a bit with the .NET Core ZipArchive, I found a way to create a zip file containing a file with a data descriptor that has 32 bit sizes, but is at an offset of more than 4 GiB. .NET Core 3.1 (without fix for the signed int issue) does not read any of the entries in the zip generated by this code. using System;
using System.IO;
using System.IO.Compression;
namespace CreateZip
{
public class NoSeekFileStream : FileStream
{
public NoSeekFileStream(string path, FileMode mode) : base(path, mode) { }
public override bool CanSeek => false;
}
class Program
{
static void Main(string[] args)
{
using (var zipStream = new NoSeekFileStream("DotNETCoreTest.zip", FileMode.Create))
using (var zipArchive = new ZipArchive(zipStream, ZipArchiveMode.Create))
{
var buffer = new byte[0x10000];
using (var stream = zipArchive.CreateEntry("test1.bin", CompressionLevel.NoCompression).Open())
for (int i = 0; i < 0x8000; i++) stream.Write(buffer);
using (var stream = zipArchive.CreateEntry("test2.bin", CompressionLevel.NoCompression).Open())
for (int i = 0; i < 0x8000; i++) stream.Write(buffer);
using (var stream = zipArchive.CreateEntry("test3.bin", CompressionLevel.NoCompression).Open())
stream.Write(buffer, 0, 0x2A);
}
}
}
} So I guess the people that implemented the support for writing zip files to a stream can explain their interpretation of the zip specifications. SharpZipLib can also generate zip files that have a similar structure, so there are multiple implementations that do the same thing. |
We have removed local header validation vs central directory, therefore no need AppContext switch. Decompressed stream will still be restricted by uncompressed size. Fix merged to 5.0, reopening for porting to 3. 1 |
Hello, I just wanted to share an ugly workaround for those who needs to stay on 3.1 and can't wait for the release of this fix. Instead of calling : firstEntry.Open(); Call this private firstEntry.GetType()
.GetMethod("OpenInReadMode", BindingFlags.NonPublic | BindingFlags.Instance)
.Invoke(firstEntry, new object[] {false}) as Stream; |
We have an application that reads and process information from zip files. After we upgraded target framework to 3.0 (or 3.1) from 2.1 without upgrading the code, processing of larger zip files started to fail with:
System.IO.InvalidDataException: A local file header is corrupt.
This issue blocks us from upgrading to dotnet 3.1 and we need to upgrade asap to resolve the out of memory issues 2.1 has while running in container. I spent some time narrowing down the issue and here's what I found...
Sample code to repro the issue is:
While running this code on the same zip file, it succeeds when target framework is
netcoreapp2.1
, but fails when targetingnetcoreapp3.0
ornetcoreapp3.1
. All tests were performed on the same machine with .NET Core SDK 3.1.100The hardest part was to produce a brand new zip to repro the problem. Turned out that 2 things are required to get a repro zip:
zip
, but no luck with any of those.I am including the code I used to generate repro zip below, but I can also supply this zip if you tell me where to upload it (315MB).
C# code I used to generate repro file:
Java code to zip the file:
The text was updated successfully, but these errors were encountered: