branch 2.0.0: Deadlocks can occur when the configuration of a subscription is updated very frequently #402

Closed
believening opened this issue Feb 11, 2022 · 1 comment
Assignees
Labels
bug Something isn't working

Comments

@believening

Here is a simple reproduction; it may take a few runs to trigger. Listen for the config with dataId "deadLock" in the "DEFAULT" group, then update it very frequently: the listener will hang.

# build and run
➜  bug git:(master) ✗ go build -o bug  main.go
➜  bug git:(master) ✗ ls
bug  go.mod  go.sum  main.go
➜  bug git:(master) ✗ ./bug
2022/02/11 17:24:01 [INFO] logDir:</tmp/nacos/log>   cacheDir:</tmp/nacos/cache>
2022/02/11 17:24:06 [INFO] logDir:</tmp/nacos/log>   cacheDir:</tmp/nacos/cache>
2022-02-11T17:24:06.745+0800    INFO    rpc/rpc_client.go:276   config-0-22b283de-0d6a-4538-bcbd-2cfe673bd0f9 register server push request:ConfigChangeNotifyRequest handler%!(EXTRA *config_client.ConfigChangeNotifyRequestHandler=&{0xc000292000})
2022-02-11T17:24:06.745+0800    INFO    rpc/rpc_client.go:276   config-0-22b283de-0d6a-4538-bcbd-2cfe673bd0f9 register server push request:ConnectResetRequest handler%!(EXTRA *rpc.ConnectResetRequestHandler=&{})
2022-02-11T17:24:06.747+0800    INFO    rpc/rpc_client.go:276   config-0-83ac470d-f080-489e-842a-c6b771fe69d3 register server push request:ConfigChangeNotifyRequest handler%!(EXTRA *config_client.ConfigChangeNotifyRequestHandler=&{0xc00023a500})
2022-02-11T17:24:06.747+0800    INFO    rpc/rpc_client.go:276   config-0-83ac470d-f080-489e-842a-c6b771fe69d3 register server push request:ConnectResetRequest handler%!(EXTRA *rpc.ConnectResetRequestHandler=&{})
2022-02-11T17:24:06.747+0800    INFO    rpc/rpc_client.go:276   config-0-83ac470d-f080-489e-842a-c6b771fe69d3 register server push request:ClientDetectionRequest handler%!(EXTRA *rpc.ClientDetectionRequestHandler=&{})
2022-02-11T17:24:06.747+0800    INFO    rpc/rpc_client.go:276   config-0-22b283de-0d6a-4538-bcbd-2cfe673bd0f9 register server push request:ClientDetectionRequest handler%!(EXTRA *rpc.ClientDetectionRequestHandler=&{})
2022-02-11T17:24:06.747+0800    INFO    rpc/rpc_client.go:214   [RpcClient.Start] config-0-83ac470d-f080-489e-842a-c6b771fe69d3 try to connect to server on start up, server: {serverIp:127.0.0.1 serverPort:18848}
2022-02-11T17:24:06.748+0800    INFO    rpc/rpc_client.go:214   [RpcClient.Start] config-0-22b283de-0d6a-4538-bcbd-2cfe673bd0f9 try to connect to server on start up, server: {serverIp:127.0.0.1 serverPort:18848}
2022-02-11T17:24:06.748+0800    INFO    util/common.go:95       Local IP:172.17.0.1
2022-02-11T17:24:06.748+0800    INFO    util/common.go:95       Local IP:172.17.0.1
2022-02-11T17:24:06.855+0800    INFO    rpc/rpc_client.go:224   config-0-83ac470d-f080-489e-842a-c6b771fe69d3 success to connect to server {serverIp:127.0.0.1 serverPort:18848} on start up, connectionId=1644571446753_172.17.0.1_50468
2022-02-11T17:24:06.855+0800    INFO    rpc/rpc_client.go:224   config-0-22b283de-0d6a-4538-bcbd-2cfe673bd0f9 success to connect to server {serverIp:127.0.0.1 serverPort:18848} on start up, connectionId=1644571446753_172.17.0.1_50470
||DEFAULT|deadLock|0d4731b4-156c-4042-9821-a3f781605b56-0|
......
||DEFAULT|deadLock|e6782095-c3c8-4055-a54b-e56dd5fbcc8f-97|
2022-02-11T17:24:07.293+0800    INFO    config_client/config_proxy.go:172       config-0-83ac470d-f080-489e-842a-c6b771fe69d3 [server-push] config changed. dataId=deadLock, group=DEFAULT,tenant=
||DEFAULT|deadLock|f08f4695-4fab-4e90-a23b-bb59e99f5589-98|
last one:  bdeb1567-0c79-467f-894e-737be3b4f531
2022-02-11T17:24:07.395+0800    INFO    config_client/config_proxy.go:172       config-0-83ac470d-f080-489e-842a-c6b771fe69d3 [server-push] config changed. dataId=deadLock, group=DEFAULT,tenant=

As you can see, the last value published is bdeb1567-0c79-467f-894e-737be3b4f531, but the listener never calls the callback function to print it.

# debug
➜  bug git:(master) ✗ ps -ef | grep "bug" | grep -v "grep"
root      7394 17775  1 17:24 pts/18   00:00:00 ./bug
➜  bug git:(master) ✗ dlv attach 7394
Type 'help' for list of commands.
(dlv) grs
   ......
  Goroutine 6 - User: /root/go/pkg/mod/github.com/nacos-group/nacos-sdk-go/v2@v2.0.0/clients/config_client/config_client.go:474
  ......
  Goroutine 119 - User: /root/go/pkg/mod/github.com/nacos-group/nacos-sdk-go/v2@v2.0.0/clients/config_client/config_client.go:474 github.com/nacos-group/nacos-sdk-go/v2/clients/config_client.(*ConfigChangeNotifyRequestHandler).RequestReply (0x9da745) [chan send]
  ......
[34 goroutines]
(dlv) gr 6 bt
0  0x0000000000437af6 in runtime.gopark
   at /usr/local/go/src/runtime/proc.go:367
1  0x0000000000406f85 in runtime.chansend
   at /usr/local/go/src/runtime/chan.go:257
2  0x0000000000406b3d in runtime.chansend1
   at /usr/local/go/src/runtime/chan.go:143
3  0x00000000009d80f7 in github.com/nacos-group/nacos-sdk-go/v2/clients/config_client.(*ConfigClient).notifyListenConfig
   at /root/go/pkg/mod/github.com/nacos-group/nacos-sdk-go/v2@v2.0.0/clients/config_client/config_client.go:474
4  0x00000000009d80f7 in github.com/nacos-group/nacos-sdk-go/v2/clients/config_client.(*ConfigClient).executeConfigListen
   at /root/go/pkg/mod/github.com/nacos-group/nacos-sdk-go/v2@v2.0.0/clients/config_client/config_client.go:436
5  0x00000000009d7b10 in github.com/nacos-group/nacos-sdk-go/v2/clients/config_client.(*ConfigClient).startInternal.func1
   at /root/go/pkg/mod/github.com/nacos-group/nacos-sdk-go/v2@v2.0.0/clients/config_client/config_client.go:356
6  0x00000000004646a1 in runtime.goexit
   at /usr/local/go/src/runtime/asm_amd64.s:1581
(dlv) gr 119 bt
0  0x0000000000437af6 in runtime.gopark
   at /usr/local/go/src/runtime/proc.go:367
1  0x0000000000406f85 in runtime.chansend
   at /usr/local/go/src/runtime/chan.go:257
2  0x0000000000406b3d in runtime.chansend1
   at /usr/local/go/src/runtime/chan.go:143
3  0x00000000009da745 in github.com/nacos-group/nacos-sdk-go/v2/clients/config_client.(*ConfigClient).notifyListenConfig
   at /root/go/pkg/mod/github.com/nacos-group/nacos-sdk-go/v2@v2.0.0/clients/config_client/config_client.go:474
4  0x00000000009da745 in github.com/nacos-group/nacos-sdk-go/v2/clients/config_client.(*ConfigChangeNotifyRequestHandler).RequestReply
   at /root/go/pkg/mod/github.com/nacos-group/nacos-sdk-go/v2@v2.0.0/clients/config_client/config_proxy.go:183
5  0x000000000085e1d5 in github.com/nacos-group/nacos-sdk-go/v2/common/remote/rpc.(*GrpcClient).handleServerRequest
   at /root/go/pkg/mod/github.com/nacos-group/nacos-sdk-go/v2@v2.0.0/common/remote/rpc/grpc_client.go:244
6  0x000000000085dc25 in github.com/nacos-group/nacos-sdk-go/v2/common/remote/rpc.(*GrpcClient).bindBiRequestStream.func1
   at /root/go/pkg/mod/github.com/nacos-group/nacos-sdk-go/v2@v2.0.0/common/remote/rpc/grpc_client.go:191
7  0x00000000004646a1 in runtime.goexit
   at /usr/local/go/src/runtime/asm_amd64.s:1581
(dlv)

As you can see, both gr 6 and gr 119 are blocked sending to the channel. gr 6 was triggered by data-98: while executing the callback it finds the data-99 update and tries to write to chan listenExecute. However, the server has also pushed the data-99 update, and gr 119 is already writing to chan listenExecute. gr 6 cannot exit the current update logic, so chan listenExecute is never read and both goroutines stay blocked.
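
The blocking pattern described above can be reproduced in isolation. The following is only a minimal sketch, not the SDK's code: `trySend` and `workerSelfNotify` are hypothetical names, and the unbuffered channel is an assumption made for illustration. A timeout stands in for the permanent hang so the program can report what happened.

```go
package main

import (
	"fmt"
	"time"
)

// trySend reports whether a value could be sent on ch before the timeout.
// A false result stands in for a send that would block forever.
func trySend(ch chan<- struct{}, timeout time.Duration) bool {
	select {
	case ch <- struct{}{}:
		return true
	case <-time.After(timeout):
		return false
	}
}

// workerSelfNotify models a worker that is the only receiver of listenExecute
// and that, while handling one signal, tries to signal the same channel again.
// It returns whether the worker's own send and a concurrent push send succeeded.
func workerSelfNotify(timeout time.Duration) (workerSent, pushSent bool) {
	listenExecute := make(chan struct{}) // unbuffered: an assumption for this sketch
	workerDone := make(chan bool)
	pushDone := make(chan bool)

	// Worker (plays the role of gr 6): receives one signal, then tries to
	// notify itself while it is still inside the handling logic.
	go func() {
		<-listenExecute
		workerDone <- trySend(listenExecute, timeout)
	}()

	// The first update wakes the worker.
	listenExecute <- struct{}{}

	// Push handler (plays the role of gr 119): signals while the worker is busy.
	go func() {
		pushDone <- trySend(listenExecute, timeout)
	}()

	return <-workerDone, <-pushDone
}

func main() {
	workerSent, pushSent := workerSelfNotify(100 * time.Millisecond)
	fmt.Println("worker send succeeded:", workerSent) // false
	fmt.Println("push send succeeded:", pushSent)     // false
}
```

Neither send completes, because the only goroutine that could receive is itself stuck in a send.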

@believening
Author

The current code:

	if hasChangedKeys {
		client.notifyListenConfig()
	}

A simple fix would be:


	if hasChangedKeys {
		go client.notifyListenConfig()
	}
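
Why the `go` keyword helps can be sketched with a small stand-in model (again not the SDK's code: `runWorker` and `notify` are hypothetical, and the unbuffered channel is an assumption). The worker hands each follow-up notification to a separate goroutine and goes straight back to receiving, so the send always finds a reader.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorker models a listen loop that re-notifies itself asynchronously.
// It handles `rounds` signals in total (rounds >= 1) and returns how many
// signals were actually handled.
func runWorker(rounds int) int {
	listenExecute := make(chan struct{}) // unbuffered: an assumption for this sketch
	done := make(chan struct{})
	var pending sync.WaitGroup
	handled := 0

	// notify is a stand-in for notifyListenConfig: a plain blocking send.
	notify := func() {
		defer pending.Done()
		listenExecute <- struct{}{}
	}

	// Worker: the only receiver of listenExecute.
	go func() {
		for range listenExecute {
			handled++
			if handled >= rounds {
				close(done)
				return
			}
			// Equivalent of `go client.notifyListenConfig()`: the send runs
			// elsewhere, so the loop returns to receiving immediately.
			pending.Add(1)
			go notify()
		}
	}()

	// The first update wakes the worker.
	pending.Add(1)
	go notify()

	<-done
	pending.Wait() // every notification goroutine has finished
	return handled
}

func main() {
	fmt.Println("signals handled:", runWorker(3)) // signals handled: 3
}
```

One thing to keep in mind with this approach is that each notification occupies a goroutine until the worker reads it, so a very high update rate could leave several of them waiting at once.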

@binbin0325 binbin0325 added the bug Something isn't working label Feb 11, 2022
@binbin0325 binbin0325 self-assigned this Feb 11, 2022