Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

White space causes inconsistent frame traversal in With blocks #121865

Closed
dmadisetti opened this issue Jul 16, 2024 · 19 comments
Closed

White space causes inconsistent frame traversal in With blocks #121865

dmadisetti opened this issue Jul 16, 2024 · 19 comments
Labels
type-bug An unexpected behavior, bug, or error

Comments

@dmadisetti
Copy link

dmadisetti commented Jul 16, 2024

Bug report

Bug description:

White space causes inconsistent frame traversal in With blocks

Consider

with open( # 0
   a       # 1
) as b:    # 2
    pass   # 3

frame order goes:
2, 3, 0

versus

with open(a) as b: # 0
    pass           # 1

frame order goes 1, 0

This doesn't seem right, many formatters will make this adjustment as an expected NoOp.

Code to reproduce the behaviour:

import traceback
import sys

def trace(*args, **kwargs):
        stack = traceback.extract_stack()
        for frame in stack[::-1]:
            _filename, lineno, function_name, _code = frame
            if function_name in ("bar", "foo"):
                print("trace", frame.lineno)


class Context():
    def __enter__(self):
        frame = sys._getframe(1)
        self.old = frame.f_trace
        sys.settrace(lambda *_args, **_keys: None)
        frame.f_trace = trace

    def __exit__(self, *kwargs):
        sys.settrace(self.old)
def foo():
    with Context() as f:
        stack = traceback.extract_stack()
        print("---")
        for frame in stack[::-1]:
            _filename, lineno, function_name, _code = frame
            if function_name == "foo":
                print(frame.lineno, lineno)
                break
print("Foo")
foo()
Foo
trace 3
trace 4
---
trace 5
trace 6
trace 7
trace 8
3 3
trace 9
trace 2
def bar():
    with Context(
    ) as f: # NOTE: line break here !
        stack = traceback.extract_stack()
        print("---")
        for frame in stack[::-1]:
            _filename, lineno, function_name, _code = frame
            if function_name == "bar":
                print(frame.lineno, lineno)
                break

print("Bar")
bar()
Bar
trace 3
trace 4
trace 5
---
trace 6
trace 7
trace 8
trace 9
4 4
trace 10
trace 2

Expected behaviour

Foo
trace 3
trace 4
---
trace 5
trace 6
trace 7
trace 8
3 3
trace 9
trace 2
Bar
trace 4
trace 5
---
trace 6
trace 7
trace 8
trace 9
4 4
trace 10
trace 2 (or trace 3)

or maybe in the context of debuggers setting break points

Foo
trace 2
trace 3
trace 4
---
trace 5
trace 6
trace 7
trace 8
3 3
trace 9
trace 2
Bar
trace 3
trace 4
trace 5
---
trace 6
trace 7
trace 8
trace 9
4 4
trace 10
trace 2

Additional context

Please point me to where in the documentation if this is expected behavior,
The inconsistency seems like a bug to me.

Here's a wasm notebook, but I did test it locally as well: https://marimo.app/l/0eld5u

CPython versions tested on:

3.10, 3.11, 3.12

Operating systems tested on:

Linux, Other

@dmadisetti dmadisetti added the type-bug An unexpected behavior, bug, or error label Jul 16, 2024
@gaogaotiantian
Copy link
Member

Why do you think it's not right? The "format" will change the debugging(tracing) behavior because by default tracing is line based. If you have extra lines, you could have extra line events. In your second example, the as f part generates some bytecode to store the return value to f, which makes that line "relevant". If you remove the as f, that line won't be hit as there are no bytecode from that line. I don't think this is a bug.

@dmadisetti
Copy link
Author

So the reported behavior is consistent when there are multiple statements on a line:

fun1();fun2() # 0

The trace only lists 0 once. It is inconsistent in broken up statements:

fun( # 0
0)  # 1

Does 0, 1, 0

But with falls in the later camp. Back to

with open( # 0
   a       # 1
) as b:    # 2
    pass   # 3

It's one statement, the line breaks don't change that.
I agree with the relevant line statement, so 2, 3, 0 is correct.

but shouldn't it be 0,1,0 here?

with open(a) as b: # 0
    pass           # 1

@gaogaotiantian
Copy link
Member

with open(a) as b: # 0
    pass           # 1

I believe this is 0, 1, 0.

Could you be specific about which example you believe is wrong? I'm a bit confused. You gave plenty of examples and then you said some of them are good. The very last example that you did not like is not doing what you thought.

@gaogaotiantian
Copy link
Member

Again, the behavior of sys.settrace is simple - if something happens on a new line (a bytecode is executed), it reports an event. If you follow that rule, you'll realize that the examples you gave kind of make sense. sys.settrace does not care about the concept of statements or Python equivalents - it reports when a new line is executed.

@dmadisetti
Copy link
Author

I realize this comes across as a little pedantic and annoying, so let me relate this to the actual bug it created.

Here's some more tracing:

import traceback
import sys

def build_trace(context):
    def trace(*args, **kwargs):
        stack = traceback.extract_stack()
        for frame in stack[::-1]:
            _filename, lineno, function_name, _code = frame
            if function_name in ("bar", "foo", "__init__"):
                print(context, function_name, "trace", frame.lineno)
                if args:
                    parent_frame = args[0]
                    if "f" not in parent_frame.f_locals:
                        print("f not in locals, because the trace is inconsistent")
    return trace

class Context():
    def __init__(self):
        build_trace("init")()

    def __enter__(self):
        build_trace("enter")()
        frame = sys._getframe(1)
        self.old = frame.f_trace
        sys.settrace(lambda *_args, **_keys: None)
        frame.f_trace = build_trace("frame")
        print("enter has finished but f might not be available to the trace")

    def __exit__(self, *kwargs):
        sys.settrace(self.old)
def foo():
    with Context() as f:
        stack = traceback.extract_stack()
        print("---")
        for frame in stack[::-1][:3]:
            _filename, lineno, function_name, _code = frame
            if function_name == "foo":
                print(frame.lineno, lineno)
                break
print("Foo")
foo()
Foo
init __init__ trace 19
init foo trace 2
enter foo trace 2
enter has finished but f might not be available to the trace
frame foo trace 3
frame foo trace 4
---
frame foo trace 5
frame foo trace 6
frame foo trace 7
frame foo trace 8
3 3
frame foo trace 9
frame foo trace 2
def bar():
    with Context(
    ) as f:
        stack = traceback.extract_stack()
        print("---")
        for frame in stack[::-1][:3]:
            _filename, lineno, function_name, _code = frame
            if function_name == "bar":
                print(frame.lineno, lineno)
                break
            
print("Bar")
bar()
Bar
init __init__ trace 19
init bar trace 2
enter bar trace 2
enter has finished but f might not be available to the trace
frame bar trace 3
f not in locals, because the trace is inconsistent
frame bar trace 4
frame bar trace 5
---
frame bar trace 6
frame bar trace 7
frame bar trace 8
frame bar trace 9
4 4
frame bar trace 10
frame bar trace 2

WASM notebook: https://marimo.app/l/0aiv3o

@dmadisetti
Copy link
Author

dmadisetti commented Jul 16, 2024

Just looking for consistency. So expected behavior with the new output
To me:

Foo
init __init__ trace 19
init foo trace 2
enter foo trace 2
enter has finished but f is not available to the trace yet until after line 2
frame foo trace 2 <--- Added ! f is set here
f is not in locals and this is expected and consistent with the line break case
frame foo trace 3
frame foo trace 4
---
frame foo trace 5
frame foo trace 6
frame foo trace 7
frame foo trace 8
3 3
frame foo trace 9
frame foo trace 2

Makes sense, or

Bar
init __init__ trace 19
init bar trace 2
enter bar trace 3 <-- Changed from 2 to 3 since the `as` is on 3
enter has finished and f will be available to the trace
frame bar trace 4
frame bar trace 5
---
frame bar trace 6
frame bar trace 7
frame bar trace 8
frame bar trace 9
4 4
frame bar trace 10
frame bar trace 2

But the current behavior is unexpected

@gaogaotiantian
Copy link
Member

frame foo trace 2 <--- Added ! f is set here

No, f is set on line 3 in as f - like I mentioned above.

The reason

f not in locals, because the trace is inconsistent

Is because trace function runs before the line is executed, not after. Line 3 is about to execute, where f will be set to something, and you can't access it before it's set.

@dmadisetti
Copy link
Author

No, f is set on line 3 in as f - like I mentioned above.

line 3 for foo is not the as statement

def foo():                                # 1
    with Context() as f:                  # 2
        stack = traceback.extract_stack() # 3

Using foo and bar might be muddying the conversation- so might my quickly thrown together code.

trace function runs before the line is executed, not after. Line 3 is about to execute, where f will be set to something, and you can't access it before it's set.

Yes, I get that and why I suggested that __enter__ be associated with line 3 instead of line 2

@gaogaotiantian
Copy link
Member

Oh, my bad. Your bar suggestion is not preferred either.

__enter__ is called on line 2, with Context(). as f is something extra. Consider the code as

tmp = Context()  # __enter__
f = tmp

This is obviously not the same (tracing perspective) as

tmp = Context(); f = tmp

@gaogaotiantian
Copy link
Member

Again, sys.settrace() does not care about the Python concept of "statement". It does not treat with Context() as f as a whole. It does not consider other multi-line statement as a whole either. For example:

lst = [
1,
2,
3,
4,
5]

The code above will generate a line event each line. That's just how it is designed.

@dmadisetti
Copy link
Author

I get the multiline behavior, I agree.

Confused about your first case, doesn't:

tmp = Context()  # Construction not enter
f = tmp.__enter__() # f would be associated with line 2...

make more sense. If your example is what's currently happening, why is __enter__ implicitly done on 1? Did you mean

tmp = Context().__enter__()
f = tmp # in which case this frame is never useful, since line 1 is also visited, and tmp has no accessible interface at this point.

and

tmp = Context().__enter__(); f = tmp

? Sure, this is consistent with current behavior, but I guess that's the crux of the bug report; it's not obvious that the assignment is disassociated from the __enter__ call, and creates a meaningless frame. I agree with

tmp = Context(); f = tmp.__enter__() # f is associated with line 2, and trace only visits this once and skips, make sense

or with whitespace analogous to

tmp = Context(); (
f := tmp.__enter__()) # useful since before __enter__

or honestly any variation of a broken line (including separate lines like above).

It doesn't have anything to do with statements or shared lines; just the implementation detail which leaves an ambiguous state.

@gaogaotiantian
Copy link
Member

Oh sorry I made a mistake on my code.

with Context(
) as f

is equivalent to

ctx = Context(); tmp = ctx.__enter__()
f = tmp

That has nothing to do with tracing, that's how the compiler sees it.

with open("input.txt"
          ) as f:
    pass

compiles to

   0           RESUME                   0

   1           LOAD_NAME                0 (open)
               PUSH_NULL
               LOAD_CONST               0 ('input.txt')
               CALL                     1
               COPY                     1
               LOAD_SPECIAL             1 (__exit__)
               SWAP                     2
               SWAP                     3
               LOAD_SPECIAL             0 (__enter__)
               CALL                     0

   2   L1:     STORE_NAME               1 (f)

   3   L2:     NOP

   1           LOAD_CONST               1 (None)
               LOAD_CONST               1 (None)
               LOAD_CONST               1 (None)
               CALL                     3
               POP_TOP
               RETURN_CONST             1 (None)
       L3:     PUSH_EXC_INFO
               WITH_EXCEPT_START
               TO_BOOL
               POP_JUMP_IF_TRUE         1 (to L4)
               RERAISE                  2
       L4:     POP_TOP
       L5:     POP_EXCEPT
               POP_TOP
               POP_TOP
               POP_TOP
               RETURN_CONST             1 (None)

  --   L6:     COPY                     3
               POP_EXCEPT
               RERAISE                  1
ExceptionTable:
  L1 to L2 -> L3 [2] lasti
  L3 to L5 -> L6 [4] lasti

as f is an optional addition to with statement and __enter__() is obviously called before assigning to f. That's how Python sees it for a long time and I don't see a reason to change that.

The current implementation

  • does not contradict any documentation
  • has been like this for a long time

So if you want to change it, you need a really good reason to justify it, because it would be a breaking change. Up until now, I did not see anything concrete other than the fact you feel that this is weird. I'm not saying you are wrong, anyone can have their own opinions about how Python works, but your idea has to be good to be adopted.

For now, because there's no evidence that this is a bug (crashing/does not follow documentation/is different than before), I would suggest you ask about this in disclosure. This looks like a feature request to me.

@dmadisetti
Copy link
Author

Ok, I found the implementation and read into the pep, the behavior is as expected: https://peps.python.org/pep-0343/#specification-the-with-statement

Sorry about that

@terryjreedy terryjreedy closed this as not planned Won't fix, can't repro, duplicate, stale Jul 17, 2024
@dmadisetti
Copy link
Author

FWIW, 3.14 alpha does change the behavior to trace on the expression and not the with. This is good enough for me

@gaogaotiantian
Copy link
Member

FWIW, 3.14 alpha does change the behavior to trace on the expression and not the with. This is good enough for me

Could you elaborate on that? Not sure if that's intended.

@dmadisetti
Copy link
Author

Definitely intended, as it provides a better stack trace:

https://github.com/python/cpython/blame/main/Python/compile.c#L6008
#120125

In testing:
enter bar trace <line number of expression>
over
enter bar trace <line number of with>

You can get the behavior I proposed with:

diff --git a/Python/compile.c b/Python/compile.c
index ca64b5cd37..be17f68176 100644
--- a/Python/compile.c
+++ b/Python/compile.c
@@ -6011,8 +6011,13 @@ compiler_with(struct compiler *c, stmt_ty s, int pos)
     ADDOP_I(c, loc, SWAP, 2);
     ADDOP_I(c, loc, SWAP, 3);
     ADDOP_I(c, loc, LOAD_SPECIAL, SPECIAL___ENTER__);
-    ADDOP_I(c, loc, CALL, 0);
-    ADDOP_JUMP(c, loc, SETUP_WITH, final);
+
+    location as_loc = loc;
+    if (item->optional_vars) {
+        as_loc = LOC(item->optional_vars);
+    }
+    ADDOP_I(c, as_loc, CALL, 0);
+    ADDOP_JUMP(c, as_loc, SETUP_WITH, final);
 
     /* SETUP_WITH pushes a finally block. */
     USE_LABEL(c, block);

But it's marginal, and fine as is

@gaogaotiantian
Copy link
Member

Ah, I see what you meant. That's for the exception stack right? I don't think this has anything to do with tracing. The tracing examples you gave above still behave the same I think?

@dmadisetti
Copy link
Author

Tracing also leverages the same methods:

def bar():
    with (
        Context() as f):pass

print("Bar")
bar()

is

Bar
init __init__ trace 19
init bar trace 3
enter bar trace 2

and different with the newest version HEAD with

Bar
init __init__ trace 19
init bar trace 3
enter bar trace 3

Which I am happy with.

@gaogaotiantian
Copy link
Member

Okay, but this is because the line number is on Context() now instead of with. This still has nothing to do with as f.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

3 participants