Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<chrono>: time_zone::get_info code is excessively slow #5304

Closed
pps83 opened this issue Feb 18, 2025 · 4 comments
Closed

<chrono>: time_zone::get_info code is excessively slow #5304

pps83 opened this issue Feb 18, 2025 · 4 comments
Labels
chrono C++20 chrono duplicate This issue or pull request already exists performance Must go faster

Comments

@pps83
Copy link
Contributor

pps83 commented Feb 18, 2025

Describe the bug

I used abseil for timezone code and wanted to switch to standard std::chrono that supports timezones now. Code works, but some time later I've noticed perf issues. After debugging, it appears that time_zone::get_into is more than 100x slower in MS chrono::timezone code compared to abseil or chrono with g++.

sample code bellow. Note, there is _putenv("TZDIR=...") for abseil, adjust it for your absl location. Also, if easier, just comment out abseil and use chrono code with ms and g++ compilers to compare timing.

#include <stdint.h>
#include <vector>
#include <iostream>
#include <absl/time/time.h>
void compare_absl_chrono()
{
    std::vector<int64_t> absl_result;
    std::vector<int64_t> chrono_result;

    int64_t ms_chrono, ms_absl;

    if (1) // measure chrono perf
    {
        using namespace std::chrono;
        const auto tz = locate_zone("America/New_York");

        auto t0 = std::chrono::high_resolution_clock::now();

        auto dateFrom = sys_days{year{1900} / January / 1};
        auto info = tz->get_info(dateFrom);
        auto offset = info.offset;
        auto save = info.save;

        for (int y = 0; y < 140; ++y)
        {
            for (int i = 0; i < 365; ++i)
            {
                dateFrom += days(1);
                info = tz->get_info(dateFrom);
                if (offset != info.offset || save != info.save)
                {
                    offset = info.offset;
                    save = info.save;
                    chrono_result.push_back(sys_seconds(dateFrom).time_since_epoch().count());
                }
            }
        }
        auto t1 = std::chrono::high_resolution_clock::now();
        ms_chrono = std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
    }

    if (1) // measure absl perf
    {
        _putenv("TZDIR=C:\\work\\abseil-cpp\\absl\\time\\internal\\cctz\\testdata\\zoneinfo");
        absl::TimeZone nyc;
        if (!absl::LoadTimeZone("America/New_York", &nyc))
            throw std::runtime_error("failed to load NY timezone info. Verify that TZDIR is set");

        auto t0 = std::chrono::high_resolution_clock::now();

        absl::CivilSecond ct(1900);
        absl::Time startTime1 = absl::FromCivil(ct, absl::UTCTimeZone());
        int offset = nyc.At(startTime1).offset;
        bool is_dst = nyc.At(startTime1).is_dst;

        for (int y = 0; y < 140; ++y)
        {
            for (int i = 0; i < 365; ++i)
            {
                startTime1 += absl::Seconds(60 * 60 * 24);
                absl::TimeZone::CivilInfo ci = nyc.At(startTime1);
                if (offset != ci.offset || is_dst != ci.is_dst)
                {
                    offset = ci.offset;
                    is_dst = ci.is_dst;
                    auto t = (startTime1 - absl::UnixEpoch()) / absl::Seconds(1);
                    absl_result.push_back(t);
                }
            }
        }

        auto t1 = std::chrono::high_resolution_clock::now();
        ms_absl = std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
    }

    printf("data same:%d\n", absl_result == chrono_result);
    printf("absl_result size:%zu, time:%lldms\n", absl_result.size(), ms_absl);
    printf("chrono_result size:%zu, time:%lldms\n", chrono_result.size(), ms_chrono);
}

This code simply walks 140 years day by day and asks for timezone info for each of these days.
When timezone info changes, it's stores day of the change as a unix timestamp. This part is simply to verify that the results from absl timezone and chono timezone are the same. When I run x64 release build on Win11 I get this output:

data same:1
absl_result size:238, time:6ms
chrono_result size:238, time:1904ms

basically absl and chrono produce the same result, but it takes 400x more time to run chrono timezone code. Not sure what the reason is, but it certainly shouldn't be this slow. It's not even a caching issue imo. IANA timezone db is only like 500KB (and I access only EST timezone, which is perhaps less than 10KB total).
When I run this code on my PC compiled with mingw64 (version 14.2.0) I for chrono I get this output:

Image

it's 1000x faster on mingw64

@pps83
Copy link
Contributor Author

pps83 commented Feb 18, 2025

what a shocking surprise, std c++ code relies on icu. Feels like the issue is never going to be fixed

@StephanTLavavej StephanTLavavej added performance Must go faster chrono C++20 chrono labels Feb 19, 2025
@StephanTLavavej
Copy link
Member

Resolving as a duplicate of #2842.

We chose to take a dependency on the OS's copy of ICU because the alternative would be to ship a copy of the IANA timezone database within the STL, which would be problematic for a couple of reasons: (1) it would significantly increase the size of the STL, and (2) it couldn't be automatically updated in response to timezone changes (due to legislation etc.). (Consider an application that's built with MSVC, deployed to an end-user's machine, and then runs unchanged for many years - if timezone info were baked into the STL, it would become outdated.) Having the OS be responsible for providing up-to-date timezone information avoids these issues - it's just that on Windows, we don't have direct access to the IANA timezone database.

We may be able to improve performance here (hence we're keeping the primary issue open), but we don't believe we have good alternatives to ICU at this time.

@StephanTLavavej StephanTLavavej added the duplicate This issue or pull request already exists label Feb 19, 2025
@pps83
Copy link
Contributor Author

pps83 commented Feb 19, 2025

We may be able to improve performance here (hence we're keeping the primary issue open), but we don't believe we have good alternatives to ICU at this time.

Yes, I understand that. If ICU made it into the OS it's the easiest route to make it work
cctz perhaps would be closest low level code to make it work with iana (afaik abseil uses it). Perhaps, building ICU locally would allow look what they do internally, and make cctz work with tz data that lives inside ICU.

First time I tried to use ICU like 20 years ago with boost::regex. After using it, I always avoided it, no particular reason, I don't know why. Very surprised it's now part of the OS itself :)

When I switched to chrono timezone I was expecting something like api-ms-win-crt-timezone-1-2-3.dll in the dependencies. Surprisingly, didn't see anything and was puzzled how it could have worked correctly when 1) WinAPI itself is really bad with timezone data (very limited support for historical info) and 2) binary didn't increase in size while timezone data apparently was present somehow.

Image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
chrono C++20 chrono duplicate This issue or pull request already exists performance Must go faster
Projects
None yet
Development

No branches or pull requests

3 participants