Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

const array_t::operator[] results in buffer overflow / segv on nullptr on out of bounds access #3973

Closed
2 tasks
andrey-golubev opened this issue Mar 10, 2023 · 9 comments
Labels
solution: wontfix the issue will not be fixed (either it is impossible or deemed out of scope) state: please discuss please discuss the issue or vote for your favorite option

Comments

@andrey-golubev
Copy link

andrey-golubev commented Mar 10, 2023

Description

operator[] called on a constant array_t object results in a heap-buffer-overflow (if non-empty array) or a segv on nullptr (if empty array), as reported by ASan, when accessing an element outside of the bounds.

Reproduction steps

    auto array = nlohmann::json::array();
    array.push_back(42);
    auto value = static_cast<const decltype(array)&>(array)[1];  // heap buffer overflow
    auto array = nlohmann::json::array();
    auto value = static_cast<const decltype(array)&>(array)[0];  // nullptr read

Expected vs. actual results

Expected: the code should likely assert in this situation. For instance, at() behaves neatly and reports an out-of-range exception. This might be too much to ask of operator[], but at least an assert is vital.

Actual: a heap buffer overflow: at best you have an address sanitizer which would immediately catch that, at worst it's a security vulnerability (?) -- and you would see an error later when e.g. attempting to convert a returned value into something (e.g. auto value = empty_const_array[0]; /* throws/asserts: */ value.get<int>();).

Bottomline: if you assert right away, this is a much safer behavior. Given JSON_ASSERT could be overwritten (e.g. to add some extra error information), the overall behavior is also much nicer in this case. And you don't get ASan to terminate your program immediately.

Minimal code example

auto array = nlohmann::json::array();
array.push_back(42);
auto value = static_cast<const decltype(array)&>(array)[1];  // heap buffer overflow

Error messages

==61129==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000340 at pc 0x55cd5083c37f bp 0x7ffc3732ff00 sp 0x7ffc3732fef0
READ of size 1 at 0x602000000340 thread T0
    #0 0x55cd5083c37e in nlohmann::json_abi_v3_11_2::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_2::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> > >::basic_json(nlohmann::json_abi_v3_11_2::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, unsigned long, double, std::allocator, nlohmann::json_abi_v3_11_2::adl_serializer, std::vector<unsigned char, std::allocator<unsigned char> > > const&) nlohmann_json/include/nlohmann/json.hpp:1135


> Note: providing a limited stack trace due to privacy concers

Compiler and operating system

Ubuntu 22.04.1 LTS, c++ (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0 (GCC)

Library version

v3.11.2

Validation

@andrey-golubev andrey-golubev changed the title const array_t::operator[] results in buffer overflow / segv on nullptr const array_t::operator[] results in buffer overflow / segv on nullptr on out of bounds access Mar 13, 2023
@nlohmann
Copy link
Owner

From the documentation:

If the element with key idx does not exist, the behavior is undefined.

Just as in the other STL containers, operator[] implements unchecked access. Why do you consider the current behavior as bug?

@nlohmann nlohmann added the state: please discuss please discuss the issue or vote for your favorite option label Mar 13, 2023
@andrey-golubev
Copy link
Author

andrey-golubev commented Mar 14, 2023

It is bug-prone to be honest, not a bug, since UB is not great. (I haven't figured whether there's a "suggestion" ticket so created a "bug report").
My main point is: this could be neatly wrapped in an assertion - which is a de-facto facility to do the "terminate if the contract is violated". In the case of assertion you have an explicit library-side failure instead of the UB.

(OTOH, one could argue that adding an assertion here would make people abuse it to streamline the validation)

The fact that STL doesn't assert is a horrible user experience to be fair (I mean, unlikely there's anyone in this world who would say "hey, it's okay, I want UB instead of an immediate failure").
Microsoft's standard library used to do all sorts of debug-build checks (e.g. sentinels, validating iterators, etc.) exactly to handle unintentional user errors, (it's a guess but) likely operator[] out-of-bounds access in vector/array are also assert-proofed because it's a rather popular user error.

Again, from my point of view, adding an explicit JSON_ASSERT() somewhere inside the operator[] makes the user experience much better. By default, it's an assert() call, so termination (yay) and if the user overwrites the macro, then it's a user error that they have to deal with.

By the way, I think heap-buffer-overflow could be considered a security vulnerability as well (though, it's tricky since you could consider it a user's fault), and being able to immediately abort is a decent middle-ground for all the parties.

@andrey-golubev
Copy link
Author

Note that in my mental model:

  • JSON_ASSERT() == enforce that the contract is satisfied (a.k.a. "class invariants are valid" from your documentation)
  • Checked access failure == exception thrown != contract validation

In this particular case, I propose to enforce that the invariants are valid by adding an assertion.

@nlohmann
Copy link
Owner

I am hesitating to add these checks as they would slow down debug builds even further.

@andrey-golubev
Copy link
Author

Fair enough. If the debug build speed is of concern, then it is somewhat troublesome to add this right away.

Generally speaking, debug build improvements is probably a separate topic.

P.S.: I'd argue whoever couldn't stand the slow down from the asserts, would likely disable them completely though.

@nlohmann
Copy link
Owner

I don't think the code should be changed. Is there anything that you would like seen added to the documentation?

@nlohmann nlohmann added solution: wontfix the issue will not be fixed (either it is impossible or deemed out of scope) and removed kind: bug labels Apr 22, 2023
@andrey-golubev
Copy link
Author

The docs were fine as far as I recall, the thing is/was really about an assertion in the operator[] code path to greatly improve the usability.

@kbarry-aurora
Copy link

I agree that the semantics are inconsistent, and that eliminating existence/bounds checks can greatly improve readability if the code is already handling exceptions.

  1. If I do object.get<double>() but it's an object, I get an exception.

    const nlohmann::json object = { { "foo", "bar" } };
    object.get<double>();  // exception
  2. If I do object["foo"] but it's a non-object (e.g., array, number), I get an exception.

    const nlohmann::json object = 1.0;
    object["foo"];  // exception
  3. If I do object["foo"] but "foo" isn't a valid field, I get a crash that can't be handled.

    const nlohmann::json object = { { "bar", "baz" } };
    object["foo"];  // abort

This led to a latent bug in a system that we rolled out because we assumed that invalid operations always resulted in exceptions and not UB.

I understand the argument regarding undefined behavior, but using the STL for the underlying representation is an implementation detail that the library users shouldn't need to account for.


In any case, thanks for a great library!

@nlohmann
Copy link
Owner

If you want exceptions, use at().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
solution: wontfix the issue will not be fixed (either it is impossible or deemed out of scope) state: please discuss please discuss the issue or vote for your favorite option
Projects
None yet
Development

No branches or pull requests

3 participants