Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Julia exceptions wrapped in Python Exceptions more informative #608

Merged
merged 4 commits into from
Nov 6, 2018

Conversation

galenlynch
Copy link
Contributor

I have added a back-trace to the description of exceptions coming from Julia
that are subsequently wrapped in Python exceptions.

This makes it much easier to find the source of errors in Julia code being
called from Python functions.

Closes #607

@codecov-io
Copy link

codecov-io commented Oct 31, 2018

Codecov Report

Merging #608 into master will decrease coverage by 0.97%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #608      +/-   ##
==========================================
- Coverage   54.09%   53.12%   -0.98%     
==========================================
  Files          19       19              
  Lines        1525     1553      +28     
==========================================
  Hits          825      825              
- Misses        700      728      +28
Impacted Files Coverage Δ
src/exception.jl 86.76% <100%> (ø) ⬆️
src/gc.jl 66.66% <0%> (-6.07%) ⬇️
src/pydates.jl 58.82% <0%> (-3.68%) ⬇️
src/pyoperators.jl 69.56% <0%> (-3.17%) ⬇️
src/io.jl 36.76% <0%> (-2.92%) ⬇️
src/conversions.jl 51.19% <0%> (-1.59%) ⬇️
src/PyCall.jl 59.68% <0%> (-1.45%) ⬇️
src/pybuffer.jl 36.73% <0%> (-0.77%) ⬇️
src/gui.jl 0% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b9cca81...4169c63. Read the comment docs.

@tkf
Copy link
Member

tkf commented Nov 1, 2018

Julia's backtrace can be quite long sometimes so I'm not sure constructing a string all the time is a good idea. I think the right approach is to wrap exception lazily and then generate the exception message in Python side on demand:

class PyCallException(Exception):
    def __init__(self, exception, backtrace):
        self.exception = exception
        self.backtrace = backtrace

   def __str__(self):
       return f"Julia exception: {showerror_string(self.exception)}"

   def print_backtrace(self):
       print(sprint(showerror, self.exception, self.backtrace))


class PyCallKeyError(PyCallException, KeyError):
   ...

where print_backtrace is a non-standard method you need to call manually (e.g., via a debugger or logger).

One thing I'm not sure about my approach is the lifetime of the backtrace returned by catch_backtrace(). It's just a Vector{Union{Ptr{Nothing}, InterpreterIP}} and I wonder how long those pointers are valid.

Another much more simple-minded approach is to just put

@debug "Throwing error in Python" exception=(e, bt)

in pyraise(e). This way, you can just ENV["JULIA_DEBUG"] = "PyCall" to look at the stacktraces.

@galenlynch
Copy link
Contributor Author

I'm pretty naive when it comes to python internals, so I didn't know about error string length restrictions. What limits are there? If there is a hard limit, an alternative is limiting the backtrace length that is contained in the error string. If the concern is one of performance, I would think that generating a string would a small performance hit compared to exception handling.

As it stands, I find the error messages so terse that they're seldom helpful. Is there some other mechanism to include the Julia stack in the python exception?

I like your two suggestions. The first one would not be very useful in my use case, as I normally do not write the python code that is 'consuming' pycall objects, which would make it difficult to call print_backtrace. Your second suggestion seems pretty great, and admirably simple.

@tkf
Copy link
Member

tkf commented Nov 1, 2018

I didn't mean that there is a length limit for the string generated from Python exceptions. I'm not sure if such limit exists (other than RAM).

generating a string would a small performance hit compared to exception handling

When the exception includes objects with a lot of type parameters (like when using ForwardDiff.jl), I find that stacktrace to be huge and very slow to print everything. I don't think you'd want to pay that cost simply because a Julia function throws (say) a KeyError. Furthermore, many Python APIs use exception type as the part of the interface (at least more than Julia, I think). So, the exception message may be ignored and generating it can be totally wasteful. That's why I don't think converting stacktrace to a string all the time is optimal.

I find the error messages so terse that they're seldom helpful.

Yeah, I want to improve it, too.

Is there some other mechanism to include the Julia stack in the python exception?

AFIK Exception.__str__ is the only generic way to do this. Actually, including stacktrace in Exception.__str__ (like you do in this PR) is not so bad since if Exception.__str__ is called then it is likely that Python is printing its own stacktrace anyway.

@tkf
Copy link
Member

tkf commented Nov 1, 2018

It also makes sense to print all stacktraces in the exception stacks JuliaLang/julia#28878 to increase the information in the error message. If we do that we pay additional cost for composing the error message. I think this is another reason to go with lazy/on-demand Exception.__str__.

@galenlynch
Copy link
Contributor Author

Thanks for thinking through this in so much detail!

If I understand you correctly, your solution requires making a whole bunch of new exception classes in python and putting the Julia stacktrace in the __str__ attribute. While this may be the right way of solving the problem, I think it's beyond the scope of what I need and I won't implement it in this PR.

Since your suggestion of adding @debug "Throwing error in Python" exception=(e, bt) in pyraise(e) would satisfy my need of determining where exceptions in Julia are coming from when I am developing some code, then I would be happy to implement it and see how it works as part of this PR. If not I'm happy to close the PR.

@tkf
Copy link
Member

tkf commented Nov 1, 2018

Yeah, my suggestion needs to create a bunch of classes (or, do some abstract base class magic...). ATM I'm just brainstorming the idea. Let's wait for @stevengj since he may have some other opinions/ideas/solutions.

src/exception.jl Outdated
@@ -132,7 +132,7 @@ be called via Python C-API, it tries to not throw at all cost.
function showerror_string(e::T) where {T}
try
io = IOBuffer()
showerror(io, e)
showerror(io, e, catch_backtrace())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried that this isn't valid We should really only call catch_backtrace() in a catch block, otherwise its value may be unpredictable.

Can the caller to showerror_string pass in the backtrace if it is available?

@galenlynch galenlynch force-pushed the informative_error_messages branch from 2040dd9 to 1f3d870 Compare November 4, 2018 21:45
@galenlynch
Copy link
Contributor Author

@stevengj I tried to address your concern, though the solution I came up with isn't the prettiest.

I have changed pyraise to accept an optional backtrace, which will be passed to showerror_string. Since pyraise only calls showerror_string for non-PyError exceptions, the pyraise method accepting PyErrors has a dummy argument so backtraces can be passed to either method, though they will ultimately be ignored if they are paired with a PyError exception.

I also added catch_backtrace() to all calls of pyraise inside a try-catch block in PyCall.jl.

I think these changes result in more verbose and informative error messages from exceptions originating in Julia code, while not calling catch_backtrace in inappropriate contexts. Additionally, if users want to supply a backtrace to pyraise, they will now be able to do so.

I understand that these changes obscure the flow of logic, so I would totally understand if you don't like them. However, my goal has already been achieved with the original commit, which I agree is not a general solution as pyraise is exported and can be called outside of a try-catch block where there may not be a current exception. As such, I might just tweak PyCall's code locally to show a stack trace when I need one, instead of developing a more generally robust solution than the one I've added here.

I have added a backtrace to the description of exceptions coming from Julia
that are wrapped in Python exceptions.

This makes it much easier to find the source of errors in Julia code being
called from Python functions.
Instead of using `catch_backtrace` in `showerror_string`, which may not always
be called from a catch block with a current exception set, I have added an
optional argument to `pyraise` which can be used to pass backtraces to
`showerror_string`.

Backtraces are only used if exception is not a PyError exception. By default, no
backtrace is shown.

All try-catch blocks in PyCall attempt to pass the backtrace for the current
exception to `pyraise`.
@galenlynch galenlynch force-pushed the informative_error_messages branch from 1f3d870 to fdae2be Compare November 5, 2018 13:23
src/callback.jl Outdated
@@ -39,7 +39,7 @@ function _pyjlwrap_call(f, args_::PyPtr, kw_::PyPtr)

return pyreturn(ret)
catch e
pyraise(e)
pyraise(e, catch_backtrace())
Copy link
Member

@stevengj stevengj Nov 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be cleaner to have a @pyraise e macro that inserts the catch_backtrace() call:

"""
    @pyraise e

Throw the exception `e` to Python, attaching a backtrace.  This macro should only be
used in a `catch` block so that `catch_backtrace()` is valid.
"""
macro pyraise(e)
    :(pyraise($(esc(e)), catch_backtrace())
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Added a pyraise macro, as written by @stevengj, which catches backtraces and
calls the pyraise function. This macro should only be used in a catch block.
@stevengj
Copy link
Member

stevengj commented Nov 5, 2018

Overall, this looks good to me.

@tkf
Copy link
Member

tkf commented Nov 5, 2018

@stevengj I'm a bit worried about the overhead for this #608 (comment), since exceptions in Python are considered "lightweight" and rendering backtrace can be time-consuming in Julia sometimes. But we can test if that's the case in the wild by just releasing it... 😈

@galenlynch
Copy link
Contributor Author

galenlynch commented Nov 5, 2018

Given the choice between fast exceptions that are hard to locate, and slower exceptions that tell you exactly where the problem is, I would personally choose the latter. For my use case, hunting down the exception would have been nearly impossible without a backtrace. However, @tkf is right that converting the backtrace into a string takes time:

julia> bt = backtrace()
16-element Array{Union{Ptr{Nothing}, InterpreterIP},1}:
 Ptr{Nothing} @0x00007f0af9dcb176                                                                                                                                                                         
 Ptr{Nothing} @0x00007f0af9f34800                                                                                                                                                                         
 Ptr{Nothing} @0x00007f0af9f34529                                                                                                                                                                         
 Ptr{Nothing} @0x00007f0af9f34eac                                                                                                                                                                         
 Ptr{Nothing} @0x00007f0af9f3562f                                                                                                                                                                         
 Base.InterpreterIP(CodeInfo(
 1%1 = backtrace()                                                       │
 │        bt = %1                                                           │
 └──      return %1                                                         │
), 0x0000000000000000)
 Ptr{Nothing} @0x00007f0af9f360ad                                                                                                                                                                         
 Ptr{Nothing} @0x00007f0af9e012fc                                                                                                                                                                         
 Ptr{Nothing} @0x00007f0af9ddc590                                                                                                                                                                         
 Ptr{Nothing} @0x00007f0ae16ea992                                                                                                                                                                         
 Ptr{Nothing} @0x00007f0af9dcb176                                                                                                                                                                         
 Ptr{Nothing} @0x00007f0ae1d59ac3                                                                                                                                                                         
 Ptr{Nothing} @0x00007f0ae1d59d24                                                                                                                                                                         
 Ptr{Nothing} @0x00007f0af9dcb176                                                                                                                                                                         
 Ptr{Nothing} @0x00007f0af9de6afb                                                                                                                                                                         
 Ptr{Nothing} @0x0000000000000000

julia> @benchmark PyCall.showerror_string(ErrorException("foo"))
BenchmarkTools.Trial: 
  memory estimate:  336 bytes
  allocs estimate:  8
  --------------
  minimum time:     339.516 ns (0.00% GC)
  median time:      345.489 ns (0.00% GC)
  mean time:        399.394 ns (6.74% GC)
  maximum time:     11.166 μs (93.70% GC)
  --------------
  samples:          10000
  evals/sample:     219

julia> @benchmark PyCall.showerror_string(ErrorException("foo"), bt)
BenchmarkTools.Trial: 
  memory estimate:  23.47 KiB
  allocs estimate:  519
  --------------
  minimum time:     9.299 ms (0.00% GC)
  median time:      9.652 ms (0.00% GC)
  mean time:        9.734 ms (0.00% GC)
  maximum time:     14.666 ms (0.00% GC)
  --------------
  samples:          514
  evals/sample:     1

I don't have a sense for how common it is for Python code to rely on exception handling for non-exceptional situations, which would be necessary to determine the real world performance impact of this PR.

Merging this PR does not preclude a future lazy implementation, like @tkf suggested. Right now, passing a back trace to pyraise is optional, and calls to pyraise within PyCall.jl supply that back trace. If and when a better implementation is made, then it would be trivial to change the default behaviour within PyCall.jl.

@tkf
Copy link
Member

tkf commented Nov 6, 2018

I was rather talking about the case that printing backtrace takes a few tens of seconds. I remember it happens in case there are a lot of type parameters. I couldn't find the example now but here is a rather contrived example that takes about a half second:

using ForwardDiff

nthderiv(f, n, x = 0.0) =
    if n == 1
        ForwardDiff.derivative(f, x)
    else
        ForwardDiff.derivative(z -> nthderiv(f, n - 1, z), x)
    end

for n in 1:100
    f = Symbol("f$n")
    g = Symbol("f$(n-1)")
    @eval @noinline $f(x, bang) = $g(x, bang) + 1
end
function f0(n, bang)
    bang && error("bang!")
    return 0
end

ex = try
    nthderiv(x -> f100(x, true), 10)
catch e
    e, catch_backtrace()
end
julia> sizeof(@time sprint(showerror, ex...)) / 2^20
  0.465707 seconds (3.96 M allocations: 225.571 MiB, 6.76% gc time)
11.887667655944824

But this generates a 11 MiB string so maybe there is no wonder why it is slow. Actually, this is much faster than printing it to the terminal (via tmux) which takes more than a minute (at which time I terminated the process). So maybe the slowness I remembered was due to the bottleneck in the terminal. (Although I remember that sometimes it looked like Julia was JIT'ing something)

@tkf
Copy link
Member

tkf commented Nov 6, 2018

FYI Python manual says:

EAFP
Easier to ask for forgiveness than permission. This common Python coding style assumes the existence of valid keys or attributes and catches exceptions if the assumption proves false.
--- https://docs.python.org/3/glossary.html#term-eafp

@stevengj
Copy link
Member

stevengj commented Nov 6, 2018

It seems like showerror should be fixed in Base to truncate the backtrace if it is longer than a couple kB.

I’m not so concerned about performance here at the moment, though. I doubt that people will go to the trouble of calling Julia code from Python except for fairly expensive functions—not for cheap functions like dictionary lookups where the cost of constructing a backtrace is a concern.

@stevengj stevengj merged commit fe67ef2 into JuliaPy:master Nov 6, 2018
@galenlynch galenlynch deleted the informative_error_messages branch November 6, 2018 03:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants