Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVRO-3225 & AVRO-3226: Fix possible StackOverflowException and OutOfMemoryException on invalid input #1357

Conversation

PSanetra
Copy link
Contributor

@PSanetra PSanetra commented Oct 7, 2021

Jira

Tests

  • My PR adds the following unit tests:
    • BinaryCodecTests.TestStringReadIntoArrayPool (replaces BinaryCodecTests.TestLargeString)
    • BinaryCodecTests.TestStringReadByBinaryReader
    • BinaryCodecTests.TestInvalidInputWithNegativeStringLength
    • BinaryCodecTests.TestInvalidInputWithMaxIntAsStringLength
    • BinaryCodecTests.TestInvalidInputWithMaxArrayLengthAsStringLength

Commits

  • My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain Javadoc that explain what it does

@github-actions github-actions bot added the C# label Oct 7, 2021
@PSanetra PSanetra changed the title AVRO-3225: Fix possible StackOverflowException on invalid input AVRO-3225 & AVRO-3226: Fix possible StackOverflowException and OutOfMemoryException on invalid input Oct 7, 2021
@PSanetra PSanetra force-pushed the AVRO-3225-fix-possible-stack-overflow-exception branch from c38e58e to 2b8b64b Compare October 7, 2021 17:06
@PSanetra PSanetra force-pushed the AVRO-3225-fix-possible-stack-overflow-exception branch from 314d662 to 95c2c28 Compare October 8, 2021 14:10
@iemejia
Copy link
Member

iemejia commented Oct 9, 2021

R: @blachniet mind to take a look please
CC: @RyanSkraba this fix (when merged) should be cherry picked into 1.11.0

}

[Test]
public void TestInvalidInputWithMaxArrayLengthAsStringLength()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is failing when running on .NET Framework 4.6.1. It looks like we aren't running the Windows builds/tests in GitHub actions anymore, which would explain why that didn't catch it.

+ dotnet test --configuration Release --no-build --filter 'TestCategory!=Interop' Avro.sln
Test run for C:\Users\Brian.Lachniet\code\github.com\apache\avro\lang\csharp\src\apache\test\bin\Release\net461\Avro.test.dll (.NETFramework,Version=v4.6.1)
Microsoft (R) Test Execution Command Line Tool Version 16.11.0
Copyright (c) Microsoft Corporation.  All rights reserved.

Starting test execution, please wait...
A total of 1 test files matched the specified pattern.
  Failed TestInvalidInputWithMaxArrayLengthAsStringLength [75 ms]
  Error Message:
     Expected: <Avro.AvroException>
  But was:  <System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.IO.BinaryReader.ReadBytes(Int32 count)
   at Avro.IO.BinaryDecoder.ReadString() in C:\Users\Brian.Lachniet\code\github.com\apache\avro\lang\csharp\src\apache\main\IO\BinaryDecoder.netstandard2.0.cs:line 90
   at Avro.Test.BinaryCodecTests.<>c__DisplayClass11_0.<TestInvalidInputWithMaxArrayLengthAsStringLength>b__0() in C:\Users\Brian.Lachniet\code\github.com\apache\avro\lang\csharp\src\apache\test\IO\BinaryCodecTests.cs:line 304
   at NUnit.Framework.Assert.Throws(IResolveConstraint expression, TestDelegate code, String message, Object[] args)>

  Stack Trace:
     at Avro.Test.BinaryCodecTests.TestInvalidInputWithMaxArrayLengthAsStringLength() in C:\Users\Brian.Lachniet\code\github.com\apache\avro\lang\csharp\src\apache\test\IO\BinaryCodecTests.cs:line 304


Failed!  - Failed:     1, Passed:   626, Skipped:     0, Total:   627, Duration: 3 s - Avro.test.dll (net461)
Test run for C:\Users\Brian.Lachniet\code\github.com\apache\avro\lang\csharp\src\apache\test\bin\Release\netcoreapp2.1\Avro.test.dll (.NETCoreApp,Version=v2.1)

The Remarks on the Array Class indicate that the .NET Framework may have a smaller size limit. However, even after setting MaxDotNetArrayLength = 0x77359400;, I'm getting the same test failure (only on the net461 tests).

@RyanSkraba , how does the Java implementation handle large strings? Does it specify some arbitrary limit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Java implementation seems also just to use the MAX_ARRAY_SIZE as limit:

public Utf8 readString(Utf8 old) throws IOException {
long length = readLong();
if (length > MAX_ARRAY_SIZE) {
throw new UnsupportedOperationException("Cannot read strings longer than " + MAX_ARRAY_SIZE + " bytes");
}
if (length < 0L) {
throw new AvroRuntimeException("Malformed data. Length is negative: " + length);
}
Utf8 result = (old != null ? old : new Utf8());
result.setByteLength((int) length);
if (0L != length) {
doReadBytes(result.getBytes(), 0, (int) length);
}
return result;
}

@blachniet can you fix the test for .NET Framework 4.6.1? It is hard for me to test this on windows.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the policy for the supported .NET versions in Avro C# ?
According to https://dotnet.microsoft.com/platform/support/policy/dotnet-core only 3.1 (LTS) and 5.0 (Current) are supported by Microsoft.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, that's confusing. According to https://docs.microsoft.com/en-us/lifecycle/faq/dotnet-framework#what-is-the-lifecycle-policy-for-different-versions-of--net-framework- there's quite a few supported versions in the 4.x range...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blachniet @RyanSkraba I have changed the constant MaxDotNetArrayLength for .Net Framework 4.6.1 to 0x3FFFFFFF. This seems to work. I could not find any correct official documentation. Thanks @superkartoffel for helping debugging this on windows.

@RyanSkraba
Copy link
Contributor

Hello! These two JIRA seem pretty important to include in the 1.11.0 RC2 -- I tried giving the C# SDK a compile in an Azure Windows VM, but it's a pretty steep learning curve. Can anyone help out?

@PSanetra PSanetra force-pushed the AVRO-3225-fix-possible-stack-overflow-exception branch from 95c2c28 to 8f0cb1c Compare October 15, 2021 12:26
@RyanSkraba RyanSkraba mentioned this pull request Oct 19, 2021
4 tasks
Copy link
Contributor

@RyanSkraba RyanSkraba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM pragmatically -- since it builds and runs correctly both in Windows and linux when combined with PR#1375
@blachniet are you happy with the changes you've requested? If we can merge these two, I can cherry-pick them to branch-1.11 and create a new release candidate!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants