Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SAX interface unexpectedly gets locale-altered float representation. #4084

Closed
2 tasks
jzakrzewski opened this issue Jul 25, 2023 · 5 comments · Fixed by #4505
Closed
2 tasks

SAX interface unexpectedly gets locale-altered float representation. #4084

jzakrzewski opened this issue Jul 25, 2023 · 5 comments · Fixed by #4505
Assignees
Labels
confirmed kind: bug solution: proposed fix a fix for the issue has been proposed and waits for confirmation

Comments

@jzakrzewski
Copy link

jzakrzewski commented Jul 25, 2023

Description

I didn't expect (neither from experience, nor from the documentation) that the float string representation would be altered according to the locale when feeded into the SAX parser.
I understand, why it has to be altered internally, but I'd expect the original string in the callback.

Reproduction steps

LC_NUMERIC=de_DE.UTF-8

{ "aFloat": 42.0 }

bool ParserImpl::number_float( number_float_t val, const string_t &s )
{
    std::cout << s << '\n';
}

Expected vs. actual results

Expected: 42.0
Actual: 42,0

Minimal code example

#include <nlohmann/json.hpp>
#include <iostream>
#include <string_view>
#include <clocale>

using Json = nlohmann::json;

struct ParserImpl : public nlohmann::json_sax<Json>
{
    bool null() override { return true; };
    bool boolean( bool val ) override { return true; };
    bool number_integer( number_integer_t val ) override { return true; };
    bool number_unsigned( number_unsigned_t val ) override { return true; };
    bool number_float( number_float_t val, const string_t &s ) override {
        std::cout << s << '\n';
        return true; 
    };
    bool string( string_t &val ) override { return true; };
    bool binary( binary_t &val ) override { return true; };
    bool start_object( std::size_t elements ) override { return true; };
    bool key( string_t &val ) override { return true; };
    bool end_object() override { return true; };
    bool start_array( std::size_t elements ) override { return true; };
    bool end_array() override { return true; };
    bool parse_error( std::size_t position, const std::string &last_token,
                      const nlohmann::detail::exception &ex ) override { return true; };
};

int main() {
    if(!setlocale(LC_NUMERIC, "de_DE.utf8"))
        return -1;
    std::string_view data { R"json({ "aFloat": 42.0 })json" };
    ParserImpl sax {};
    Json::sax_parse( data, &sax );
    return 0;
}

g++-12 -o jsontest -I /path/to/nlohmann_json-src/single_include/ --std=c++20 jsontest.cpp

Error messages

No response

Compiler and operating system

Linux/gcc-12, Windows/MSVC19.35

Library version

3.11.2

Validation

@nlohmann
Copy link
Owner

Can you please share a complete example?

@jzakrzewski
Copy link
Author

Done.
Note that you need the "de_DE.utf8" locale in your system. Alternatively use any other locale with something else than a "." as decimal separator.

@nlohmann
Copy link
Owner

Thanks!

@nlohmann
Copy link
Owner

The error is that we replace . in the input by the decimal separator from the locale to be able to use std::strtod under the hood to make sure we properly parse floating-point numbers when a different locale is set. We forgot, however, that the string we pass to the SAX number_float function still contains the replaced separator.

To fix this, I need a way to efficiently replace this to . once we're done with std::strtod.

@nlohmann nlohmann self-assigned this Nov 20, 2024
@nlohmann nlohmann added the solution: proposed fix a fix for the issue has been proposed and waits for confirmation label Nov 20, 2024
@nlohmann nlohmann added this to the Release 3.11.4 milestone Nov 20, 2024
@nlohmann
Copy link
Owner

It took me a while to adjust the CI to run the tests with different locales, but I think the fix is good to go.

nlohmann added a commit that referenced this issue Nov 24, 2024
* 🐛 make SAX output locale-independent #4084

* ✅ add test

* ✅ add test

* ✅ add test

* ✅ add test

* ✅ add test

* ✅ add test

* 🐛 make SAX output locale-independent #4084

* 🐛 make SAX output locale-independent #4084

* 🐛 make SAX output locale-independent #4084

* 🐛 make SAX output locale-independent #4084

* 🐛 make SAX output locale-independent #4084

* 🐛 make SAX output locale-independent #4084

* 🐛 make SAX output locale-independent #4084

* 🐛 make SAX output locale-independent #4084

* 🐛 make SAX output locale-independent #4084

* 🐛 make SAX output locale-independent #4084

* 🐛 make SAX output locale-independent #4084

* 🐛 make SAX output locale-independent #4084
slowriot pushed a commit to slowriot/json that referenced this issue Jan 10, 2025
* 🐛 make SAX output locale-independent nlohmann#4084

* ✅ add test

* ✅ add test

* ✅ add test

* ✅ add test

* ✅ add test

* ✅ add test

* 🐛 make SAX output locale-independent nlohmann#4084

* 🐛 make SAX output locale-independent nlohmann#4084

* 🐛 make SAX output locale-independent nlohmann#4084

* 🐛 make SAX output locale-independent nlohmann#4084

* 🐛 make SAX output locale-independent nlohmann#4084

* 🐛 make SAX output locale-independent nlohmann#4084

* 🐛 make SAX output locale-independent nlohmann#4084

* 🐛 make SAX output locale-independent nlohmann#4084

* 🐛 make SAX output locale-independent nlohmann#4084

* 🐛 make SAX output locale-independent nlohmann#4084

* 🐛 make SAX output locale-independent nlohmann#4084

* 🐛 make SAX output locale-independent nlohmann#4084
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed kind: bug solution: proposed fix a fix for the issue has been proposed and waits for confirmation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants