Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace std::endl with the newline char #178

Merged
merged 1 commit into from
Dec 23, 2021

Conversation

dbasgoeze
Copy link
Contributor

This prevents std::flush that std::endl contains, which is in all cases I've checked unnecessary can lead to extreme slowdowns.
Especially there where many chains of std::endl which are dramatically slow on some systems.

This PR was not made with just a regex which is important as it does not remove the std::endl in the FileTest and hopefully does not break the JIT implementation as there was an escaped std::endl which was also replaced with an escaped '\n'.

This prevents std::flush that std::endl contains
@cdehnert
Copy link
Contributor

I‘m not sure about the context for the PR, but have some questions:

  • can you provide data to better understand what kind of slowdowns you are referring to?
  • is it necessary to replace all std::endl occurrences to amend the performance penalties you observed?
  • in a similar spirit, can the missing flush lead to unwanted behaviors such as not seeing some output right away?
  • should we make the line ending a symbol (maybe macro?) to avoid having to change alle files again if, say, the flush is wanted again?

@dbasgoeze
Copy link
Contributor Author

First of all it is known that std::endl can lead to extreme slowdowns especially when used with file io.
This was the case in some examples I've seen.

Example:

int main() {
    std::ofstream s{"Test.txt", std::ios_base::binary};
    for (size_t i{}; i < 100'000'000; ++i)
        s << "This is a test" << std::endl;
    return 0;
}

Took 57.09 seconds while

int main() {
    std::ofstream s{"Test.txt", std::ios_base::binary};
    for (size_t i{}; i < 100'000'000; ++i)
        s << "This is a test\n";
    return 0;
}

took only 1.48 s.

Is it necessary? For fileio yes, for normal cout it depends on the implementation afaik.
What is completely wasted are multiple flushes after another when chaining std::endl.

Yes a missing flush may in some implementation lead to seeing the output a bit later
but cppreference says that many implementations are line buffered and flush on '\n' anyway and therefore std::endl could only harm performance.

The policy to avoid std::endl is in use in many other open and closed source C++ projects.
Which can be seen on cppreference stance and C++ guidelines like this one.

The flush should never be wanted but if it somehow is than this would be an outlier. There are better alternatives like using cerr which never buffers anyway or manually flushing with std::flush which is more verbose and better conveys the intent.
With that and the fact we use '\n' all over the place already I would suggest not using a new symbol.

It is very unfortunate that std::endl is in almost all hello world programs for C++ out there. It really was a bad decision to include the std::flush.

@volkm
Copy link
Contributor

volkm commented Dec 21, 2021

One note: at least for STORM_PRINT, we explicitly flush, see here. So we might indeed not need std::cout here.
Maybe we should ensure flushing for all STORM_LOG_X and switch to \n in general?

@dbasgoeze
Copy link
Contributor Author

dbasgoeze commented Dec 22, 2021

That sounds like a sound plan to me. It would ensures that the LOG call is visible on all implementations and we would not flush on every line break in a call to the logging function.

And with a '\n' everywhere policy it would be easier to spot if we erroneously flush in file io, which is almost always a mistake.

@tquatmann
Copy link
Member

Personally, I find std::endl easier to read and the enforced flushing can sometimes help with debugging. Although the latter can also be achieved by making sure that the STORM_LOG_x will flush as Matthias suggests.

Performance-wise, it only really matters when we are outputting millions of lines (see the example above) and this is almost never the case unless we're doing some kind of explicit export (.drn files, scheduler export, ...).

We should have '\n' where it's needed (mostly that should be file I/O) but I would prefer not enforcing it for "normal" outputs to cout where flushing is not critical.

Regarding this PR: I don't object to merging but I'm also not particularly fond of it. Again, I find std::endl easier to read, so its a matter of personal preference.

@sjunges
Copy link
Contributor

sjunges commented Dec 22, 2021

First, thanks Daniel, for spotting this. In particular better performance on file-io is helpful. I think it is also for the smaller models this may very well shave of quite some percents of the performance and I know plenty of people misusing outputs :-).

Regarding readabilty, cant we just do storm::endl or something? I agree that sometimes a more prominent endl is helpful.

Tim, why is flushing on the console / on std::cout not critical?

@tquatmann
Copy link
Member

@sjunges as I've said, this only has an effect when there are millions of lines of output. If this is what we send to cout, we're doing something wrong, I guess.

Also, for file output, flushing is usually more costly (I guess because one has to access secondary storage). If you change the example of @dbasgoeze from a file stream to std::cout, you'll barely notice a difference between the two versions (although that might depend on the implementation).

The problem I see with storm::endl is that we will have to use it consistently and it is less clear what actually happens. something like `storm::endl_no_flush`` is a bit too long.

For me, the best solution would be to replace only the "critical" file I/O std::endl. However, I can see that those are not easy to find right now and when code readability is the only concern, it's probably not worth the effort.
@dbasgoeze does make a valid point that it's easier to just replace them all.

I'd suggest this compromise:

  • ensure that the STORM_LOG_X macros flush
  • merge this PR
  • but do not introduce a strict "no std::endl policy" for the future.

@dbasgoeze
Copy link
Contributor Author

I've dug into the STORM_LOG_* implementation based on the L3PP_LOG_* macros.
They internally use a ostringstream to buffer the message of a single call and after that they stream it to the configured ostream followed by a flush of the ostream. That means that it is already ensured that these macros flush the stream.

So a std::flush in a STORM_LOG_ERROR will not be passed to the output stream but will instead lead to a flush on the stringstream.
This has still a performance impact even though intuitively a flush of a stringstream should be a noop.
That may be because it needs to check some bookkeeping that the standard requires for flushing a stream.

@tquatmann
Copy link
Member

Thank you very much for looking into this @dbasgoeze

I'm fine with merging now. Any objections?

@sjunges
Copy link
Contributor

sjunges commented Dec 23, 2021

Thank you very much for looking into this @dbasgoeze

I'm fine with merging now. Any objections?

not from my side, thanks everyone

@tquatmann tquatmann merged commit c136f66 into moves-rwth:master Dec 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants