Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support 3.12 #246

Merged
merged 13 commits into from
Nov 2, 2023
Merged

Support 3.12 #246

merged 13 commits into from
Nov 2, 2023

Conversation

Erotemic
Copy link
Member

@Erotemic Erotemic commented Oct 6, 2023

Updating CI to test Python 3.12

@Erotemic
Copy link
Member Author

Erotemic commented Oct 7, 2023

@Theelx Having a bit of trouble with this if you have some time to take a look. In 3.12 I'm getting an error when compiling:

      Error compiling Cython file:
      ------------------------------------------------------------
      ...
      
          def __exit__(self, exc_type, exc_val, exc_tb):
              self.disable_by_count()
      
          def enable(self):
              PyEval_SetTrace(python_trace_callback, self)
                              ^
      ------------------------------------------------------------
      
      line_profiler/_line_profiler.pyx:311:24: Cannot assign type 'int (object, PyFrameObject *, int, PyObject *) except? -1' to 'Py_tracefunc'. Exception values are incompatible. Suggest adding 'noexcept' to type 'int (object, PyFrameObject *, int, PyObject *) except? -1'.

Which looks like it is due to the types in the function:

@cython.boundscheck(False)
@cython.wraparound(False)
cdef int python_trace_callback(object self_, PyFrameObject *py_frame, int what,
                               PyObject *arg):

If I add noexcept at the end of the declaration it works on 3.12, but fails on 3.11 (also some of the tests fail), i.e.

@cython.boundscheck(False)
@cython.wraparound(False)
cdef int python_trace_callback(object self_, PyFrameObject *py_frame, int what,
                               PyObject *arg) noexcept:

On 3.11 if I use @cython.exceptval(check=False) (which IIUC) should be equivalent according to https://cython.readthedocs.io/en/latest/src/userguide/language_basics.html#error-return-values

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.exceptval(check=False)
cdef int python_trace_callback(object self_, PyFrameObject *py_frame, int what,
                               PyObject *arg):

But this seems to fail again on 3.12.

From what I understand, we need to define a function with the same signature as Py_tracefunc. I'm not fluent in C symantics, so I don't know if a naked function like that is implicity "no except" or "possible except", but I assume it's a noexcept because it's integer return code is indicating the error state. I'm not sure why this worked in previous versions, perhaps cython got more restrictive about function signatures matching exception types? I'm also wondering if there is a condition that could cause an exception in that function and then require us to return -1 (on a quick glance it looks like there are key errors that could happen).

@Theelx
Copy link
Collaborator

Theelx commented Oct 7, 2023

on my phone atm but for the sake of a quick response, this is the same issue we got when going from cython 3.0a12 to b1

1 similar comment
@Theelx
Copy link
Collaborator

Theelx commented Oct 7, 2023

on my phone atm but for the sake of a quick response, this is the same issue we got when going from cython 3.0a12 to b1

@Theelx
Copy link
Collaborator

Theelx commented Oct 15, 2023

So I'm looking into why at the moment, but #203 doesn't have these issues with 3.12. Hopefully I can isolate what's causing the errors with 3.12 from that fork.

@Theelx
Copy link
Collaborator

Theelx commented Oct 15, 2023

@Erotemic I believe I just found the issue. Add "legacy_implicit_noexcept": True, to the compiler_directives in setup.py, line 231. After this, you can remove the exceptval=False from the callback in the .pyx and you can also remove the other directives that were added at the top of the .pyx file.

@Erotemic
Copy link
Member Author

Keeping the directives at the top of the pyx file lets you use the standalone cythonize tool instead of needing to invoke the entire setup.py, which I think is nice for developing.

It probably also makes sense to indent to remove the legacy behavior and explicitly mark noexcept functions as such, so I' like to keep the directive as well unless it causes issues.

The fix does allow me to compile on 3.12, but I'm still getting test failures I'll need to look into. For reference the failing tests are:

FAILED tests/test_explicit_profile.py::test_explicit_profile_with_in_code_enable - AssertionError: assert 'Function: func2' in 'Timer unit: 1e-09 s\n\n  0.00 seconds - /tmp/tmpk326uqo9/script.py:9 - func2\n  0.00 seconds - /tmp/tmpk326uqo9/script.py:21 - fun...
FAILED tests/test_explicit_profile.py::test_explicit_profile_with_duplicate_functions - AssertionError: assert 'Function: func1' in 'Timer unit: 1e-09 s\n\n  0.00 seconds - /tmp/tmpij5_1ajc/script.py:3 - func1\n  0.00 seconds - /tmp/tmpij5_1ajc/scrip...unc2\n  0....
FAILED tests/test_ipython.py::TestIPython::test_init - AssertionError: 0 != 1

@Theelx
Copy link
Collaborator

Theelx commented Oct 16, 2023

@Erotemic For getting rid of the Cython deprecation warnings which happen with the legacy implicit noexcept option, you can try putting extern after cdef in the function definition, and taking out the decorator:

@cython.boundscheck(False)
@cython.wraparound(False)
cdef extern int python_trace_callback(object self_, PyFrameObject *py_frame, int what,
                               PyObject *arg):

It works for me on 3.12, and my guess is it'll work on 3.11 though I haven't tried it yet.

EDIT: This does not seem to fix the py.test tests, but I'm unsure why. It works fine for some personal scripts that I commonly use it to test for potential perf improvements.

@Erotemic
Copy link
Member Author

Using extern and no-decorator seem to compile just as well.

I think the test failures must have to do with something in 3.12 itself, and not any Cython changes. Most other tests run fine. It's two tests for the new explicit profiler, and 1 test for the IPython magic that are the problem.

Running pytest -k with_in_code_enable -s on 3.11 gives:

tests/test_explicit_profile.py Timer unit: 1e-09 s

  0.00 seconds - /tmp/tmp9sqjysht/script.py:21 - func4
  0.00 seconds - /tmp/tmp9sqjysht/script.py:9 - func2
Wrote profile results to custom_output.txt
Wrote profile results to custom_output_2023-10-16T110436.txt
Wrote profile results to custom_output.lprof
To view details run:
python -m line_profiler -rtmz custom_output.lprof

Finished running script
Contents of /tmp/tmpitjiehix/custom_output.txt
Timer unit: 1e-09 s

Total time: 1.47e-07 s
File: /tmp/tmp9sqjysht/script.py
Function: func4 at line 21

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    21                                           @profile
    22                                           def func4(a):
    23         1        147.0    147.0    100.0      return a + 1

And on 3.12 it fails:

tests/test_explicit_profile.py Timer unit: 1e-09 s

  0.00 seconds - /tmp/tmpjo8nqdo6/script.py:9 - func2
  0.00 seconds - /tmp/tmpjo8nqdo6/script.py:21 - func4
Wrote profile results to custom_output.txt
Wrote profile results to custom_output_2023-10-16T110444.txt
Wrote profile results to custom_output.lprof
To view details run:
python -m line_profiler -rtmz custom_output.lprof

Finished running script
Contents of /tmp/tmpxxdv1a41/custom_output.txt
Timer unit: 1e-09 s

  0.00 seconds - /tmp/tmpjo8nqdo6/script.py:9 - func2
  0.00 seconds - /tmp/tmpjo8nqdo6/script.py:21 - func4

F

For the IPython test running pytest -k test_init -s -v the difference is in 3.11:

tests/test_ipython.py::TestIPython::test_init Timer unit: 1e-09 s

Total time: 3.51e-07 s
File: <ipython-input-1-ae24564ca1b8>
Function: func at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def func():
     2         1        351.0    351.0    100.0      return 2**20
func_data=('<ipython-input-1-ae24564ca1b8>', 1, 'func')
lines_data=[(2, 1, 351)]

In 3.12:

tests/test_ipython.py::TestIPython::test_init Timer unit: 1e-09 s

Total time: 0 s
File: <ipython-input-1-ae24564ca1b8>
Function: func at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def func():
     2                                               return 2**20
func_data=('<ipython-input-1-ae24564ca1b8>', 1, 'func')
lines_data=[]

@Theelx
Copy link
Collaborator

Theelx commented Oct 17, 2023

Using extern and no-decorator seem to compile just as well.

That's true, but the extern method is not deprecated, whereas the legacy compiler directive is, and iirc the decorator fails in pre-3.0 versions of Cython. Regardless, I agree that the test failures have something to do with 3.12,so I'll look into that.

@Theelx
Copy link
Collaborator

Theelx commented Oct 17, 2023

@Erotemic I wonder if maybe the testing harness is the issue? I just ran with_code_enable in its own file and it works fine.
Screenshot from 2023-10-17 13-21-11

@Erotemic
Copy link
Member Author

While pytest has caused issues for me in the past, I don't think it is the culprit here:

Running outside of pytest via:

python -m unittest tests/test_ipython.py
xdoctest tests/test_explicit_profile.py test_explicit_profile_with_in_code_enable
xdoctest tests/test_explicit_profile.py test_explicit_profile_with_duplicate_functions 

still results in the same errors.

I also started up an IPython shell and ran the tests directly, and they also failed.

@Erotemic
Copy link
Member Author

@Theelx is there anything in this minimal working example that looks odd to you:

def test_simple_explicit_nonglobal_usage():
    from line_profiler import LineProfiler
    import ubelt as ub
    profiler = LineProfiler()

    def func(a):
        return a + 1

    profiled_func = profiler(func)

    # Run Zero Times
    lstats = profiler.get_stats()
    print('lstats.timings = {}'.format(ub.urepr(lstats.timings, nl=1)))
    print('lstats.unit = {}'.format(ub.urepr(lstats.unit, nl=1)))
    print(f'profiler.code_hash_map={profiler.code_hash_map}')
    profiler.print_stats()

    # Run Once
    profiled_func(1)

    lstats = profiler.get_stats()
    print('lstats.timings = {}'.format(ub.urepr(lstats.timings, nl=1)))
    print('lstats.unit = {}'.format(ub.urepr(lstats.unit, nl=1)))
    print(f'profiler.code_hash_map={profiler.code_hash_map}')
    profiler.print_stats()

    # Run Multiple Times
    for _ in range(10):
        profiled_func(1)

    lstats = profiler.get_stats()
    print('lstats.timings = {}'.format(ub.urepr(lstats.timings, nl=1)))
    print('lstats.unit = {}'.format(ub.urepr(lstats.unit, nl=1)))
    print(f'profiler.code_hash_map={profiler.code_hash_map}')
    profiler.print_stats()

test_simple_explicit_nonglobal_usage()

In 3.11 it reports time correctly, but in 3.12 it is like the profiled function was never called. I'm not sure how anything can work if this simple case is failing.

On 3.12:

lstats.timings = {
    ('/home/joncrall/code/line_profiler/tests/test_explicit_profile.py', 13, 'func'): [],
}
lstats.unit = 1e-09
profiler.code_hash_map={<code object func at 0x7f002870e590, file "/home/joncrall/code/line_profiler/tests/test_explicit_profile.py", line 13>: [1917141407524353302, 1917141407524353301]}
Timer unit: 1e-09 s

Total time: 0 s
File: /home/joncrall/code/line_profiler/tests/test_explicit_profile.py
Function: func at line 13

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    13                                               def func(a):
    14                                                   return a + 1

On 3.11:

lstats.timings = {
    ('/home/joncrall/code/line_profiler/tests/test_explicit_profile.py', 13, 'func'): [(14, 11, 2856)],
}
lstats.unit = 1e-09
profiler.code_hash_map={<code object func at 0x7f896e352a70, file "/home/joncrall/code/line_profiler/tests/test_explicit_profile.py", line 13>: [-4000377494778487241, -4000377494778487244]}
Timer unit: 1e-09 s

Total time: 2.856e-06 s
File: /home/joncrall/code/line_profiler/tests/test_explicit_profile.py
Function: func at line 13

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    13                                               def func(a):
    14        11       2856.0    259.6    100.0          return a + 1

@Theelx
Copy link
Collaborator

Theelx commented Oct 29, 2023

One thing that looks odd is that 3.11 shows the function as taking ~260us per hit, where it should take ~1us. I assume this is ubelt overhead, so my question is can this be replicated without using ubelt?

@Erotemic
Copy link
Member Author

Erotemic commented Oct 29, 2023

Yes, ubelt is not necessary. It's also not the test harness. Here is a shell script that reproduces the example explicitly:

echo "
def test_simple_explicit_nonglobal_usage():
    from line_profiler import LineProfiler
    profiler = LineProfiler()

    def func(a):
        return a + 1

    profiled_func = profiler(func)

    # Run Zero Times
    lstats = profiler.get_stats()
    print(f'lstats.timings={lstats.timings}')
    print(f'lstats.unit={lstats.unit}')
    print(f'profiler.code_hash_map={profiler.code_hash_map}')
    profiler.print_stats()

    # Run Once
    profiled_func(1)

    lstats = profiler.get_stats()
    print(f'lstats.timings={lstats.timings}')
    print(f'lstats.unit={lstats.unit}')
    print(f'profiler.code_hash_map={profiler.code_hash_map}')
    profiler.print_stats()

    # Run Multiple Times
    for _ in range(10):
        profiled_func(1)

    lstats = profiler.get_stats()
    print(f'lstats.timings={lstats.timings}')
    print(f'lstats.unit={lstats.unit}')
    print(f'profiler.code_hash_map={profiler.code_hash_map}')
    profiler.print_stats()

test_simple_explicit_nonglobal_usage()
" > mwe.py

python mwe.py

In 3.12 the last print block is:

lstats.timings={('/home/joncrall/tmp/mwe.py', 6, 'func'): []}
lstats.unit=1e-09
profiler.code_hash_map={<code object func at 0x7f76c730e320, file "/home/joncrall/tmp/mwe.py", line 6>: [4641089316629750693, 4641089316629750692]}
Timer unit: 1e-09 s

Total time: 0 s
File: /home/joncrall/tmp/mwe.py
Function: func at line 6

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     6                                               def func(a):
     7                                                   return a + 1

In 3.11 the last print block is:

lstats.timings={('/home/joncrall/tmp/mwe.py', 6, 'func'): [(7, 11, 2328)]}
lstats.unit=1e-09
profiler.code_hash_map={<code object func at 0x7fb0961f56f0, file "/home/joncrall/tmp/mwe.py", line 6>: [6762933698279728574, 6762933698279728575]}
Timer unit: 1e-09 s

Total time: 2.328e-06 s
File: /home/joncrall/tmp/mwe.py
Function: func at line 6

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     6                                               def func(a):
     7        11       2328.0    211.6    100.0          return a + 1

The time seems to be about the same (although if I run for many more iterations, it goes to about 105), but isn't the unit in nanoseconds not microseconds?. I find myself easilly confused when dealing with units, so I typically like to write it out explicitly in pint, and then check that all conversions make sense.

import pint
import ubelt as ub
ureg = pint.UnitRegistry()

s = ureg.seconds
ns = ureg.parse_expression('1 nanosecond')
μs = ureg.parse_expression('1 microsecond')


quantities = {}
quantities['prog_total'] = 2.3e-6 * ureg.seconds
quantities['line_total'] = 2328 * ns
quantities['perhit'] = 211 * ns

nhits = 11
quantities['line_total_derived'] = quantities['perhit'] * nhits

quantities = {k: v.to(μs) for k, v in quantities.items()}

print('quantities = {}'.format(ub.urepr(quantities, nl=1, align=':')))

This prints

quantities = {
    'prog_total'        : <Quantity(2.3, 'microsecond')>,
    'line_total'        : <Quantity(2.328, 'microsecond')>,
    'perhit'            : <Quantity(0.211, 'microsecond')>,
    'line_total_derived': <Quantity(2.321, 'microsecond')>,
}

which seems consistent to me when the numbers are interpreted as nanoseconds (which is 1e-9 as the timer unit reports).

@notEvil
Copy link

notEvil commented Oct 30, 2023

Maybe related: on 3.12.0 I see PyCode_Addr2Line and PyFrame_GetLineNumber returning -1 for some offsets/frames (lasti). Couldn't find any official information if this is intended.

edit: only in async functions, so doesn't explain the mwe which I'm unable to reproduce. The example creating -1:

import asyncio


@profile
async def function_1(depth):
    if (0 < depth):
        ([_ async for _ in function_2(depth - 1)])


async def function_2(depth):
    if (0 < depth):
        yield ()

    if (False):
        yield


async def async_main():
    async with asyncio.TaskGroup() as task_group:
        task_group.create_task(function_1(2))


asyncio.run(async_main())

@Theelx
Copy link
Collaborator

Theelx commented Nov 2, 2023

which seems consistent to me when the numbers are interpreted as nanoseconds (which is 1e-9 as the timer unit reports).

Ah yeah that's true. I forgot that Linux/BSD platforms use units of 1e-8 seconds instead of 1e-6 which used to be the default for all platforms.

I am genuinely confused as to how this is happening, as it works fine for me when I install it and run it locally in the terminal on Ubuntu 22.04, with Python 3.12 behind pyenv.

@Erotemic
Copy link
Member Author

Erotemic commented Nov 2, 2023

@notEvil You are unable to reproduce the MWE in this post? So when you run that in 3.12 you get times? Can you paste your output? What environment are you running in?

In my execution it's a fresh pyenv 3.12.0 build on Ubuntu 22.04.

@notEvil
Copy link

notEvil commented Nov 2, 2023

Just tried again and I'm able to reproduce now. Replacing .f_lineno with PyFrame_GetLineNumber seems to fix it. I probably tested with those changes and paid no attention.

diff --git a/line_profiler/_line_profiler.pyx b/line_profiler/_line_profiler.pyx
index f11b6b5..bfd5fbb 100644
--- a/line_profiler/_line_profiler.pyx
+++ b/line_profiler/_line_profiler.pyx
@@ -89,6 +89,8 @@ cdef extern from "Python.h":
     cdef int PyTrace_C_EXCEPTION
     cdef int PyTrace_C_RETURN
 
+    cdef int PyFrame_GetLineNumber(PyFrameObject *frame)
+
 cdef extern from "timers.c":
     PY_LONG_LONG hpTimer()
     double hpTimerUnit()
@@ -425,7 +427,7 @@ cdef extern int python_trace_callback(object self_, PyFrameObject *py_frame,
     if what == PyTrace_LINE or what == PyTrace_RETURN:
         # Normally we'd need to DECREF the return from get_frame_code, but Cython does that for us
         block_hash = hash(get_frame_code(py_frame))
-        code_hash = compute_line_hash(block_hash, py_frame.f_lineno)
+        code_hash = compute_line_hash(block_hash, PyFrame_GetLineNumber(py_frame))
         if self._c_code_map.count(code_hash):
             time = hpTimer()
             ident = threading.get_ident()
@@ -440,7 +442,7 @@ cdef extern int python_trace_callback(object self_, PyFrameObject *py_frame,
             if what == PyTrace_LINE:
                 # Get the time again. This way, we don't record much time wasted
                 # in this function.
-                self._c_last_time[ident][block_hash] = LastTime(py_frame.f_lineno, hpTimer())
+                self._c_last_time[ident][block_hash] = LastTime(PyFrame_GetLineNumber(py_frame), hpTimer())
             elif self._c_last_time[ident].count(block_hash):
                 # We are returning from a function, not executing a line. Delete
                 # the last_time record. It may have already been deleted if we

Copy link

codecov bot commented Nov 2, 2023

Codecov Report

Merging #246 (dd29898) into main (075b5c4) will not change coverage.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #246   +/-   ##
=======================================
  Coverage   53.36%   53.36%           
=======================================
  Files          11       11           
  Lines         832      832           
  Branches      168      168           
=======================================
  Hits          444      444           
  Misses        329      329           
  Partials       59       59           
Files Coverage Δ
line_profiler/__init__.py 100.00% <100.00%> (ø)
line_profiler/line_profiler.py 50.00% <100.00%> (ø)

... and 1 file with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 075b5c4...dd29898. Read the comment docs.

@Erotemic
Copy link
Member Author

Erotemic commented Nov 2, 2023

@notEvil looks like that fixed it!

I was wondering why this is the case and I found this: https://bugs.python.org/issue42823#msg384502 where Mark Shannon states:

It looks like you are accessing the c field "f_lineno" directly. That was never guaranteed to work, it just happened to. You should use PyFrame_GetLineNumber()

So using the official C-API to get the line number is probably the correct thing to be doing.

@Erotemic Erotemic merged commit 365feb1 into main Nov 2, 2023
74 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants