Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timestamps for logs #6374

Closed
nilsnolde opened this issue Sep 23, 2022 · 17 comments
Closed

Timestamps for logs #6374

nilsnolde opened this issue Sep 23, 2022 · 17 comments

Comments

@nilsnolde
Copy link
Contributor

I’d like to add timestamps to the processing logs of the various commands. I didn’t look into the code base yet, not sure if the logging is centralized, so it’d be quick to add, or it needs a logging abstraction.

We had some issues lately where custom PBFs take a whole longer to pass through osrm-contract. Timestamps would help to see in which phase of the pipeline it’s hanging. And it’s general good practice IMO. Shout out if any objections/thoughts.

@SiarheiFedartsou
Copy link
Member

SiarheiFedartsou commented Sep 23, 2022

At glance it looks like it will be as easy as modifying this Log class.

template <typename T> inline Log &operator<<(const T &data)

We can try https://github.com/fmtlib/fmt to implement timestamp formatting btw.

@nilsnolde
Copy link
Contributor Author

Thanks, that was what I hoping for:)

@SiarheiFedartsou
Copy link
Member

Here is draft #6375
How it looks now for osrm-extract:
Screenshot 2022-09-23 at 16 25 16

I still see some logs which have no timestamps though and also not sure we need one more dependency because of it, but fmtlib is nice library(and actually is the same as std::format from C++20)

@SiarheiFedartsou
Copy link
Member

I still see some logs which have no timestamps though

This ok, after is the "end" of Collecting traffic signal information on 2292 signals....

Screenshot 2022-09-23 at 16 35 53

We use UnbufferedLog to output it

util::UnbufferedLog log;

I.e. if some other logs are happening in the middle(like these warnings in this case) they are "interleave" logs coming from this UnbufferedLog, tbh I'd stop using this.

@SiarheiFedartsou
Copy link
Member

Current log system is prone to race conditions, i.e. if we are doing something like:

util::Log() << "Hello, " << "OSRM";

we have no guarantee that log will be outputted "atomically", instead we may have it interleaved by some another log happening in another thread.

What about refactoring it a bit to something like:

util::Log::info("{} {}", "Hello, ", "OSRM");

This way we will be able to guarantee that logs are "atomic".

@mjjbell @nilsnolde would appreciate your opinion here

@nilsnolde
Copy link
Contributor Author

nilsnolde commented Sep 23, 2022

I.e. if some other logs are happening in the middle(like these warnings in this case) they are "interleave" logs coming from this UnbufferedLog, tbh I'd stop using this.

Yeah, I agree, those could easily be refactored to use standard logging. No need for the "ok, after.. " bit if we have timestamps IMO (if it's not ok, it'll throw right). What doesn't take more than a second is not worth measuring time on anyways;)

What about refactoring it a bit to something like:
I'd put it in an issue, unless you really want to tackle it now:) It's true, but hasn't ever been reported or observed I think?

Thanks a ton for this @SiarheiFedartsou ! I'll immediately try your branch on our PBFs!

@nilsnolde
Copy link
Contributor Author

Or we report down to milliseconds? Could do "22-09-23" instead to save some space?

@SiarheiFedartsou
Copy link
Member

SiarheiFedartsou commented Sep 23, 2022

Or we report down to milliseconds? Could do "22-09-23" instead to save some space?

Not sure I got it, but timestamps I output in this PR are precise up to second.

@nilsnolde
Copy link
Contributor Author

nilsnolde commented Sep 26, 2022

Sorry that wasn't too clearly worded..

What I meant: we can get rid of UnbufferedLog in multi-threaded functions, I agree. There's some spots where it makes sense, e.g. when printing progress (in %) in single-threaded processes.

But all the instances where it's printing it's busy with some processing and then later it adds "ok, after .." on the same line and reports the duration down to nanoseconds, they can be removed IMO (esp if they're running in multiple threads). Then, if we add timestamps which go down to milliseconds, I think that's all the resolution one needs (and we can add another line saying
<whatever_process> done if we wanna be explicit).

What do you think @SiarheiFedartsou ? I'd like to help out, but got a weird error building tbb with ENABLE_CONAN=ON..

[cmake] [ 58%] Building CXX object src/tbb/CMakeFiles/tbb.dir/observer_proxy.cpp.o
[cmake] [ 60%] Linking CXX shared library ../../gnu_12.2_cxx11_64_release/libtbbbind_2_4.so
[cmake] /usr/bin/ld: error: /usr/lib: read: Is a directory
[cmake] collect2: error: ld returned 1 exit status

I'm on Arch Linux, gcc 12.2.0, ld 2.39. Any clue?

EDIT: with clang 14.0.6 it fails with a similar but slightly different way:

[cmake] [ 58%] Building CXX object src/tbb/CMakeFiles/tbb.dir/private_server.cpp.o
[cmake] [ 60%] Linking CXX shared library ../../clang_14.0_cxx11_64_release/libtbbbind_2_4.so
[cmake] [ 63%] Building CXX object src/tbb/CMakeFiles/tbb.dir/profiling.cpp.o
[cmake] /usr/bin/ld: read in flex scanner failed
[cmake] clang-14: error: linker command failed with exit code 1 (use -v to see invocation)

I also have the issue with ld not linking properly on Arch, I can try again with Ubuntu 22.04 later.

@SiarheiFedartsou
Copy link
Member

I also have #6219 (comment) with ld not linking properly on Arch, I can try again with Ubuntu 22.04 later.

yeah, I would try on Ubuntu. I’ll probably try to build it in some Arch-based docker image one day - may be will find something.

And, yeah, agree about your “nanoseconds” point :)

@nilsnolde
Copy link
Contributor Author

It's ok for me, there's now and then projects where I have to Ubuntu instead, I have it in dual-boot. But thanks for the offer. I'm really happy to see you pushing things again with @mjjbell !

There weren't many occurrences of UnbufferedLog where it's problematic. In fact, you found the only user-facing one:) Others will be mangled with Debug Logs in a very similar way, but IMO Debug is fine. Not that it's a huge deal anyways, I did a few changes here based on your branch: https://github.com/Project-OSRM/osrm-backend/compare/sf-log-timestamp...gis-ops:osrm-backend:nn-log-timestamp-cleanup?expa

Milliseconds is more trouble than it's worth. The C++20 spec says it should print milliseconds when using :%S format specifier, but fmt doesn't (yet). Since we're keeping most timers it's fine to only display seconds resolution IMO.

Regarding this:

we have no guarantee that log will be outputted "atomically", instead we may have it interleaved by some another log happening in another thread.

The Log() seems to be guarded by a std::lock_guard. Or is the problem the streaming? I might misunderstand smth basic here 😅

@SiarheiFedartsou
Copy link
Member

The Log() seems to be guarded by a std::lock_guard. Or is the problem the streaming? I might misunderstand smth basic here

Yep, I am a bit wrong here. In case of util::Log there is no problem, because we have mutex here

std::lock_guard<std::mutex> lock(get_mutex());

But UnbufferedLog writes directly to std::cout/std::cerr as I can see, that's why we may have this problem I mentioned.

@nilsnolde
Copy link
Contributor Author

I checked all usages of UnbufferedLog, it's not used in multi-threaded functions AFAICT. I only changed the one occurrence in traffic signals, where it was mangled with warnings. Others looked fine.

If you want you can implement my changes, for me this PR looks good either way:)

@SiarheiFedartsou
Copy link
Member

SiarheiFedartsou commented Sep 27, 2022

I updated PR to also output fractional part of second. Implementation is a bit ugly on my taste, but it works!

 const auto timestamp = std::chrono::system_clock::now();
            return fmt::format(
                "{}[{:%FT%H:%M:}{:%S}] [{}] ", is_terminal ? color : "", timestamp, timestamp.time_since_epoch(), level);

Screenshot 2022-09-27 at 20 32 41

@SiarheiFedartsou
Copy link
Member

Going to merge #6375 soon.

I only changed the one occurrence in traffic signals, where it was mangled with warnings.

btw IMO this is the main problem with UnbuferredLog - you never know if methods you are calling will log something or not. That's why I'd propose to get rid of it completely(at least I don't see any benefits of it)

@nilsnolde
Copy link
Contributor Author

The benefit AFAICT is to print stuff on the same line, eg when reporting progress in percent, which is arguably nice for UX. It just the logs fairly concise. I’d personally keep it, but have not a very strong opinion about it.

@SiarheiFedartsou
Copy link
Member

Closing as #6375 was merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants