Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix lwt_ppx backtraces with reraise #554

Merged
merged 3 commits into from
Mar 5, 2018
Merged

Conversation

gabelevi
Copy link
Contributor

@gabelevi gabelevi commented Mar 3, 2018

As others have noted, backtraces with lwt_ppx seem to be broken. I've seen it mentioned in a few places, but https://caml.inria.fr/mantis/print_bug_page.php?bug_id=7375 is the place that inspired me to see if switching from raise exn to reraise exn would fix the issue.

There were a few small challenges:

  1. Putting external reraise : exn -> 'a = "%reraise" in lwt.ml and passing it into add_loc made the re-raise location always show up in lwt.ml. So I decided to make ppx_lwt.ml put it into the transformed file
  2. The external part of the grammar is a little tricky to stick inside of an expression, since it's basically a module item. Sticking it into a local module is kind of a hack, but seems to work.

The %reraise primitive seems to be new to 4.02 so I probably should add available: [ocaml-version >= "4.02.0"] to the lwt_ppx.opam, right? Would that be ok? I suppose I could also use cppo to decay to raise for older versions, if you feel strongly.

Fixes #498 #171

@gabelevi
Copy link
Contributor Author

gabelevi commented Mar 3, 2018

To test this out I created a test file and ran Ocaml 4.03/4.05 with and without this change. Threw together my results in https://github.com/gabelevi/lwt_ppx_backtrace_test

@aantron
Copy link
Collaborator

aantron commented Mar 3, 2018

Thanks! I will review this now, but to immediately answer the question about 4.02: as I understand it, PPX already requires 4.02, and we pick this constraint up by depending on ocaml-migrate-parsetree (previously by depending on ppx_tools). So, I think nothing needs to be done about this bit :)

@aantron
Copy link
Collaborator

aantron commented Mar 3, 2018

Actually, I guess one could argue that since reraise has nothing to do with PPX, we should add our own additional constraint. I will leave it up to you then.

@gabelevi
Copy link
Contributor Author

gabelevi commented Mar 3, 2018

@samwgoldman was concerned about the perf cost of those local modules. I'm not really sure how expensive they are. I suppose they probably could be moved inside of the add_loc callbacks. There might also be some more clever solution too 😁

@aantron
Copy link
Collaborator

aantron commented Mar 3, 2018

The local module thing should be fine, I've also (ab)used it a lot. We could always run a small benchmark to see if they get optimized away - I suspect the compiler handles them intelligently.

@aantron
Copy link
Collaborator

aantron commented Mar 3, 2018

Let me know if you'd like me to look into it :)

{pstr_desc = Pstr_value (Nonrecursive, vbs); _}]), _); _} ->
mapper.structure_item mapper (Str.value Nonrecursive (gen_top_binds vbs))
| x -> default_mapper.structure_item mapper x);
structure_item = (fun mapper stri ->
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't mind, I'll split this whitespace change out into a separate commit before merging. You don't have to do anything (unless you really want to), I'm happy to take care of it.

@aantron
Copy link
Collaborator

aantron commented Mar 3, 2018

(The AppVeyor failure is not related to this PR)

@aantron
Copy link
Collaborator

aantron commented Mar 3, 2018

Trying the tests, I get this output on native 4.05.0 with the improved version of the PPX:

Lwt stack trace with sleep:


Lwt stack trace without sleep:
Raised at file "pervasives.ml", line 32, characters 17-33
Called from file "foo.ml", line 1, characters 14-29
Called from file "foo.ml", line 27, characters 33-46
Called from file "foo.ml", line 31, characters 15-28
Called from file "foo.ml", line 35, characters 20-33
Called from file "foo.ml", line 37, characters 33-46
Called from file "src/core/lwt.ml", line 2222, characters 16-20

and replacing failwith by a direct raise Exit, I get no stack traces at all in either case. Do you know, by any chance, what circumstances produce this behavior?

I am running with OCAMLRUNPARAM=b and I added Printexc.record_backtraces true for good measure.

@gabelevi
Copy link
Contributor Author

gabelevi commented Mar 4, 2018

Hmm...I wonder what's different between our setups. Initially I only tested on my OSX 10.13 laptop, but I just tested on a CentOS 7 machine and it also works the same. I pushed an update to my test repo so that make test.native also runs the binary. So for example, on Linux, this was my output

I also tried replacing the failwith "Boom" with raise Exit and I didn't see a change. Adding Printexc.record_backtraces true and dropping OCAMLRUNPARAM=b also shows the same stack traces for me.

Here's what I have installed via opam


And I can totally revert the whitespace changes!

@aantron
Copy link
Collaborator

aantron commented Mar 4, 2018

I'm pretty sure it's something wrong with my setup, as I don't remember the backtraces ever being as bad as I have now when I was also using OSX (I tried small test programs outside of Lwt, and got some pretty weird behavior).

Everything else with the PR looks good :) I'll look into my problems a bit, then merge the PR. If my local problems take too long to figure out, I'll merge without waiting. Sorry to be a bit of a blocker here.

@gabelevi
Copy link
Contributor Author

gabelevi commented Mar 4, 2018

You could try installing another version of opam into a different directory.

$ export OPAMROOT="$(shell mktemp -d 2> /dev/null || mktemp -d -t opam)"
$ opam init --comp 4.05.0
$ opam install lwt lwt_ppx
$ eval `opam config env`
$ make test.native
$ opam pin add lwt_ppx <path/to/lwt>
$ make test.native
# To clean up

$ unset OPAMROOT
$ eval `opam config env`

Don't forget to close the terminal or unset $OPAMROOT & return eval, or you'll continue using the temp version of opam 😛

This is what I see if I run those steps.

@gabelevi gabelevi mentioned this pull request Mar 5, 2018
@aantron aantron merged commit 0bab939 into ocsigen:master Mar 5, 2018
@aantron
Copy link
Collaborator

aantron commented Mar 5, 2018

Ok, thanks a lot for this!

The problem with "my setup" turned out to be that I started by using ocamlfind directly rather than the Makefile you kindly provided, and forgot to add -g. Ridiculous error on my part, sorry about that.

I ran some performance tests on the local module. There is no noticeable difference between that and an "ordinary" module. I dumped the assembly and it looks like the local module would actually have a miniscule advantage. An "ordinary" module incurs two extra mov instructions (without Flambda).

@aantron
Copy link
Collaborator

aantron commented Mar 5, 2018

Also, speculating and summarizing for future reference, I understand that the code the PPX generated before this PR, using ordinary raise, used to trigger the re-raise mechanism in older versions of OCaml, but stopped doing so at some point due to changes in OCaml. From the linked https://caml.inria.fr/mantis/print_bug_page.php?bug_id=7375:

The logic to distinguish "raise" for "reraise" has changed in recent OCaml versions. There is no clear specification, so it is not easy to say what is simply unfortunate and what is an actual bug.

Fortunately, the explicit %reraise was added in 4.02, and this PR made the PPX use it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Do we still want the Lwt.backtrace_* functions if backtraces don't work?
2 participants