-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#31 resurfacing (?) #170
Comments
Your test is triggering quite an edge case, where the background frame reader receives an error and returns, before |
Related to #170. There is an edge case where the background frame handler receives an error before `Open` completes. In this case, there is a data race to assign the value of the allocator. The destructor in shutdown is already a critical section. Adding a tiny critical section in openComplete to protect the allocator. Signed-off-by: Aitor Perez Cedres <[email protected]>
I've pushed a fix in commit c503330, also in branch |
I think you will have to also add locking around https://github.com/rabbitmq/amqp091-go/blob/main/connection.go#L555 I don't know the lifetime of the |
The mutex Lines 522 to 524 in c503330
|
I see, my bad. Will test it later (Y) |
@Zerpet looks good. I re-ran my test 20 times and could not reproduce. Edit: |
In order to reproduce:
|
Related to #170. There is an edge case where the TCP connection between client-server is setup, and AMQP handshake starts, up to the point right after sending Tune frame. At this point, the background frame reader receives an error from the TCP socket and starts the shutdown sequence. At the same time, the Tune function continues (as it has not completed) and attempts to set the ChannelMax field in the connection struct. At the same time, the shutdown sequence initiated by the error in the frame handler reads the ChannelMax field. This creates a race. A potential solution is to add a critical section in tune to protect access to ChannelMax field. The destructor in the shutdown sequence is already a critical section, protected by the struct mutex. Signed-off-by: Aitor Perez Cedres <[email protected]>
I've run the tests 3 times with |
I also cannot reproduce it with the new version. |
* Lock allocator assignment in openComplete Related to #170. There is an edge case where the background frame handler receives an error before `Open` completes. In this case, there is a data race to assign the value of the allocator. The destructor in shutdown is already a critical section. Adding a tiny critical section in openComplete to protect the allocator. Signed-off-by: Aitor Perez Cedres <[email protected]> * Add a critical section in openTune Related to #170. There is an edge case where the TCP connection between client-server is setup, and AMQP handshake starts, up to the point right after sending Tune frame. At this point, the background frame reader receives an error from the TCP socket and starts the shutdown sequence. At the same time, the Tune function continues (as it has not completed) and attempts to set the ChannelMax field in the connection struct. At the same time, the shutdown sequence initiated by the error in the frame handler reads the ChannelMax field. This creates a race. A potential solution is to add a critical section in tune to protect access to ChannelMax field. The destructor in the shutdown sequence is already a critical section, protected by the struct mutex. Signed-off-by: Aitor Perez Cedres <[email protected]> --------- Signed-off-by: Aitor Perez Cedres <[email protected]>
Hi,
I'm currently building a derived library with reconnect handling and my tests fail due to the already closed issue:
#31
In that issue tests were fixed with
t.Cleanup
but the actual data race was, as far as I can see, never tackled (?)Where the assignment to allocator does not seem to be threadsafe (goroutinesafe) resulting in data race detections.
Code lines:
https://github.com/rabbitmq/amqp091-go/blob/main/connection.go#L555
https://github.com/rabbitmq/amqp091-go/blob/main/connection.go#L977
entrypoint for asynchronous memory access (both routines access
c.allocator
:https://github.com/rabbitmq/amqp091-go/blob/main/connection.go#L268-L269
Do I do anything incorrectly?
(test case, which is highly dependent on the go runtime scheduler thus not always triggering the data race: https://github.com/jxsl13/amqpx/blob/batch-handler/pool/connection_test.go#L82)
Any feedback is appreciated.
Thanks for your time.
The text was updated successfully, but these errors were encountered: