Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-103533: Use pep669 APIs for cprofile #103534

Merged
merged 6 commits into from
May 5, 2023

Conversation

gaogaotiantian
Copy link
Member

@gaogaotiantian gaogaotiantian commented Apr 14, 2023

@gaogaotiantian gaogaotiantian marked this pull request as ready for review April 14, 2023 06:32
@arhadthedev arhadthedev added stdlib Python modules in the Lib dir performance Performance or resource usage labels Apr 14, 2023
@gaogaotiantian
Copy link
Member Author

@ericsnowcurrently how can I declare a global constant table in the module?

@ericsnowcurrently
Copy link
Member

@ericsnowcurrently how can I declare a global constant table in the module?

Is it a new table or is it an existing one you're trying to convert? Does it have objects in it? Is the data actually const?

@gaogaotiantian
Copy link
Member Author

Is it a new table or is it an existing one you're trying to convert? Does it have objects in it? Is the data actually const?

It's a new, pure C, true const table.

static const CallbackTableEntry callback_table[] = {
    {PY_MONITORING_EVENT_PY_START, "_pystart_callback"},
    {PY_MONITORING_EVENT_PY_RESUME, "_pystart_callback"},
    {PY_MONITORING_EVENT_PY_RETURN, "_pyreturn_callback"},
    {PY_MONITORING_EVENT_PY_YIELD, "_pyreturn_callback"},
    {PY_MONITORING_EVENT_PY_UNWIND, "_pyreturn_callback"},
    {PY_MONITORING_EVENT_CALL, "_ccall_callback"},
    {PY_MONITORING_EVENT_C_RETURN, "_creturn_callback"},
    {PY_MONITORING_EVENT_C_RAISE, "_creturn_callback"},
    {0, NULL}
};

@gaogaotiantian
Copy link
Member Author

The situation is almost identical to error_codes[] in Modules/_sqlite/module.c, so I did the same thing - add the variable to ignored.tsv.

@gaogaotiantian
Copy link
Member Author

Hi @markshannon , do you think this is a good candidate for 3.12? We pushed out PEP 699 but none of the standard library actually uses it. The profiling tool change is simpler than the debugging tool, so maybe this could be an example/try out for implementing tools in PEP 669. Potentially we can have more feedbacks for the monitoring mechanism.

@brandtbucher brandtbucher self-requested a review May 2, 2023 20:59
Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can skip creating builtin functions when profiling method descriptors.
Other than that, looks good.

Did you measure performance at all?

Py_INCREF(callable);
return (PyObject*)((PyCFunctionObject *)callable);
}
if (Py_TYPE(callable) == &PyMethodDescr_Type) {
Copy link
Member

@markshannon markshannon May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary?
Doesn't the profiler extract the same data from the builtin function that it could from the method descriptor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This piece is copied from the new setprofile I believe. The idea behind it is to make sure get_cfunc_from_callable only returns a PyCFunctionObject. If it's not, then ((PyCFunctionObject *)cfunc)->m_ml won't work.

What do you propose here? Simply return callable? We need to check that anyway because CALL event can be triggered before calling a Python function and we don't want to add profiler entry on that(it should be dealt with later). I did realize that Py_RETURN_NONE was incorrect - NULL should be returned.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All that happens to the builtin function objects is it gets passed down to normalizeUserObj() which then does some elaborate lookup to get the method descriptor back again. Both the method descriptor and builtin function contain a pointer to the same PyMethodDef struct.

So, leave the method descriptor alone here, and in normalizeUserObj() create the same string that as would be created for the builtin function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mo object on line 175 is the method descriptor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are three purposes get_cfunc_from_callable needs the serve:

  1. To get the object(callable) to print later, which could be resolved in normalizeUserObj() like you said.
  2. To get a unique key for the hash table(actually a binary tree). In this case, ((PyCFunctionObject *)cfunc)->m_ml was used, maybe the callable itself could be used as well? Not sure why m_ml was chosen at the first place, the builtin functions probably have dinstinct addresses anyway? Even for the builtin methods(like list.append), using (void*)callable directly might work?
  3. To filter out unwanted data. This is why this part is necessary. Without trying to get the C function from the method, how could we know if this is a call to a builtin function that we would like to log? It could be a simple Python call(f()), which would trigger CALL event. It could be a method call, but not to a C function(self.some_method()). We need to filter these entry out before we even reach normalizeUserObj()(that's where we are about to log the function call).

The current implementation is trying to swap out the setprofile layer without touching the internal profiling system. It's true that the profiling system could be optimized, but it's also risky and probably need extra care. If we want to land this in 3.12, maybe we should avoid changing the profiling logic for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your point about risk.

I think it is worth cleaning up the internals. We couldn't do it before, as the conversion from method descriptor to builtin function occurred before cprofile got to see it. Maybe we can get it done for 3.12, maybe not.

So let's get this change in for 3.12, and we can streamline things later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can clean up the internals, but we still need to address point 3. We need to filter out entries that we don't want, and that probably requires resolving the descriptors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the information in the fake builtin function is also in the original method descriptor, so whatever the filter was doing should still work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you suggest that we can get rid of Py_TYPE(callable)->tp_descr_get(callable, self_arg, (PyObject*)Py_TYPE(self_arg)) because we don't need the actual PyCFunctionObject in it? We still need to check against Py_TYPE(callable) == &PyMethodDescr_Type for the actual "builtin methods" right? To filter out Python defined methods?

@gaogaotiantian
Copy link
Member Author

Did you measure performance at all?

The performance measurement was in the original issue #103533, it might not be the most obvious format...

@gaogaotiantian
Copy link
Member Author

I guess we don't have time now to make C level API changes if we want to land this in 3.12, but it would be nice to have MISSING and DISABLE available for C code.

Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last question

@@ -25,7 +25,6 @@ def test_bad_counter_during_dealloc(self):
with support.catch_unraisable_exception() as cm:
obj = _lsprof.Profiler(lambda: int)
obj.enable()
obj = _lsprof.Profiler(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why has this line been removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because what the test was partially testing won't work anymore. The line relies on the fact that the original obj was deallocated when replace - that won't happen because it's still being referenced by the monitors.

Having that line would also cause an infinite exception loop, probably due to the new mechanism.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry not infinite, but very frequent warning. The issue is we did not disable the profiling - that's what I mentioned in the description - enabling the profiling with one object and disable it with another won't work.

@markshannon markshannon merged commit b979741 into python:main May 5, 2023
@gaogaotiantian gaogaotiantian deleted the pep669-cprofile branch May 5, 2023 17:56
carljm added a commit to carljm/cpython that referenced this pull request May 5, 2023
* main:
  pythongh-99113: Add PyInterpreterConfig.own_gil (pythongh-104204)
  pythongh-104146: Remove unused var 'parser_body_declarations' from clinic.py (python#104214)
  pythongh-99113: Add Py_MOD_PER_INTERPRETER_GIL_SUPPORTED (pythongh-104205)
  pythongh-104108: Add the Py_mod_multiple_interpreters Module Def Slot (pythongh-104148)
  pythongh-99113: Share the GIL via PyInterpreterState.ceval.gil (pythongh-104203)
  pythonGH-100479: Add `pathlib.PurePath.with_segments()` (pythonGH-103975)
  pythongh-69152: Add _proxy_response_headers attribute to HTTPConnection (python#26152)
  pythongh-103533: Use PEP 669 APIs for cprofile (pythonGH-103534)
  pythonGH-96803: Add three C-API functions to make _PyInterpreterFrame less opaque for users of PEP 523. (pythonGH-96849)
jbower-fb pushed a commit to jbower-fb/cpython-jbowerfb that referenced this pull request May 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage stdlib Python modules in the Lib dir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants