Crash when Nimbus is paired with a web3 provider over HTTP #3521

Closed
zah opened this issue Mar 18, 2022 · 6 comments

@zah
Contributor

zah commented Mar 18, 2022

The user supplied an HTTP `--web3-url` endpoint served by a fully synced, non-archive Erigon instance. The following crash was observed:

Traceback (most recent call last, using override)
nimbus-eth2-amd64-latest | /home/user/nimbus-eth2/vendor/nim-json-rpc/json_rpc/client.nim(381) main
nimbus-eth2-amd64-latest | /home/user/nimbus-eth2/vendor/nim-json-rpc/json_rpc/client.nim(374) NimMain
nimbus-eth2-amd64-latest | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1930) main
nimbus-eth2-amd64-latest | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1799) handleStartUpCmd
nimbus-eth2-amd64-latest | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1644) doRunBeaconNode
nimbus-eth2-amd64-latest | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1456) start
nimbus-eth2-amd64-latest | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1400) run
nimbus-eth2-amd64-latest | /home/user/nimbus-eth2/vendor/nim-chronos/chronos/asyncloop.nim(279) poll
nimbus-eth2-amd64-latest | /home/user/nimbus-eth2/vendor/nim-chronos/chronos/transports/stream.nim(1369) readStreamLoop
nimbus-eth2-amd64-latest | /home/user/nimbus-eth2/vendor/nimbus-build-system/vendor/Nim/lib/system/excpt.nim(610) signalHandler
nimbus-eth2-amd64-latest | SIGSEGV: Illegal storage access. (Attempt to read from nil?)

Using a WebSocket URL with the same Erigon instance worked fine.

@zah zah changed the title Crash when Nimbus is paired with a wev3 provider over HTTP Crash when Nimbus is paired with a web3 provider over HTTP Mar 18, 2022
@zah
Contributor Author

zah commented Mar 18, 2022

A similar crash with a slightly different call stack:

/opt/nimbus/vendor/nim-libp2p/libp2p/stream/bufferstream.nim(438) main
/opt/nimbus/vendor/nim-libp2p/libp2p/stream/bufferstream.nim(431) NimMain
/opt/nimbus/beacon_chain/nimbus_beacon_node.nim(1842) main
/opt/nimbus/beacon_chain/nimbus_beacon_node.nim(1637) doRunBeaconNode
/opt/nimbus/beacon_chain/nimbus_beacon_node.nim(1449) start
/opt/nimbus/beacon_chain/nimbus_beacon_node.nim(1393) run
/opt/nimbus/vendor/nim-chronos/chronos/asyncloop.nim(279) poll
/opt/nimbus/vendor/nim-chronos/chronos/transports/datagram.nim(436) readDatagramLoop
/opt/nimbus/vendor/nimbus-build-system/vendor/Nim/lib/system/excpt.nim(610) signalHandler
SIGSEGV: Illegal storage access. (Attempt to read from nil?)

@fkbenjamin

I believe I'm seeing the same:

Mar 23 14:06:11 nim-01 nimbus_beacon_node[83229]: INF 2022-03-23 14:06:11.014+01:00 Slot end                                   topics="beacnde" slot=3434729 nextActionWait=n/a nextAttestationSlot=-1 nextProposalSlot=-1 head=d38358f3:3434728
Mar 23 14:06:23 nim-01 nimbus_beacon_node[83229]: INF 2022-03-23 14:06:23.000+01:00 Slot start                                 topics="beacnde" slot=3434730 epoch=107335 sync=synced peers=14 head=d6a846c0:3434729 finalized=107333:0c97af0d delay=959us872ns
Mar 23 14:06:23 nim-01 nimbus_beacon_node[83229]: INF 2022-03-23 14:06:23.015+01:00 Slot end                                   topics="beacnde" slot=3434730 nextActionWait=n/a nextAttestationSlot=-1 nextProposalSlot=-1 head=d6a846c0:3434729
Mar 23 14:06:35 nim-01 nimbus_beacon_node[83229]: INF 2022-03-23 14:06:35.000+01:00 Slot start                                 topics="beacnde" slot=3434731 epoch=107335 sync=synced peers=10 head=cec242ee:3434730 finalized=107333:0c97af0d delay=359us305ns
Mar 23 14:06:35 nim-01 nimbus_beacon_node[83229]: INF 2022-03-23 14:06:35.014+01:00 Slot end                                   topics="beacnde" slot=3434731 nextActionWait=n/a nextAttestationSlot=-1 nextProposalSlot=-1 head=cec242ee:3434730
Mar 23 14:06:47 nim-01 nimbus_beacon_node[83229]: INF 2022-03-23 14:06:47.000+01:00 Slot start                                 topics="beacnde" slot=3434732 epoch=107335 sync=synced peers=9 head=08284a76:3434731 finalized=107333:0c97af0d delay=696us937ns
Mar 23 14:06:47 nim-01 nimbus_beacon_node[83229]: INF 2022-03-23 14:06:47.016+01:00 Slot end                                   topics="beacnde" slot=3434732 nextActionWait=n/a nextAttestationSlot=-1 nextProposalSlot=-1 head=08284a76:3434731
Mar 23 14:06:54 nim-01 nimbus_beacon_node[83229]: Traceback (most recent call last, using override)
Mar 23 14:06:54 nim-01 nimbus_beacon_node[83229]: /home/sfadmin/nimbus-eth2/vendor/nim-json-rpc/json_rpc/client.nim(381) main
Mar 23 14:06:54 nim-01 nimbus_beacon_node[83229]: /home/sfadmin/nimbus-eth2/vendor/nim-json-rpc/json_rpc/client.nim(374) NimMain
Mar 23 14:06:54 nim-01 nimbus_beacon_node[83229]: /home/sfadmin/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1930) main
Mar 23 14:06:54 nim-01 nimbus_beacon_node[83229]: /home/sfadmin/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1799) handleStartUpCmd
Mar 23 14:06:54 nim-01 nimbus_beacon_node[83229]: /home/sfadmin/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1644) doRunBeaconNode
Mar 23 14:06:54 nim-01 nimbus_beacon_node[83229]: /home/sfadmin/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1456) start
Mar 23 14:06:54 nim-01 nimbus_beacon_node[83229]: /home/sfadmin/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1400) run
Mar 23 14:06:54 nim-01 nimbus_beacon_node[83229]: /home/sfadmin/nimbus-eth2/vendor/nim-chronos/chronos/asyncloop.nim(279) poll
Mar 23 14:06:54 nim-01 nimbus_beacon_node[83229]: /home/sfadmin/nimbus-eth2/vendor/nim-chronos/chronos/transports/stream.nim(1369) readStreamLoop
Mar 23 14:06:54 nim-01 nimbus_beacon_node[83229]: /home/sfadmin/nimbus-eth2/vendor/nimbus-build-system/vendor/Nim/lib/system/excpt.nim(610) signalHandler
Mar 23 14:06:54 nim-01 nimbus_beacon_node[83229]: SIGSEGV: Illegal storage access. (Attempt to read from nil?)
Mar 23 14:06:54 nim-01 systemd[1]: nimbus.service: Main process exited, code=exited, status=1/FAILURE

arnetheduck added a commit to status-im/nim-chronos that referenced this issue Apr 9, 2022
The socket selector holds a `seq` of per-descriptor data. When a reader
is registered, a pointer to a seq item is stored - when the `seq` grows,
this pointer becomes dangling and causes crashes like
status-im/nimbus-eth2#3521.

It turns out that there already exist two mechanisms for passing user
data around - this PR simply removes one of them, saving on memory usage
and removing the need to store pointers to the `seq` data that become
dangling on resize.
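
The failure mode described in this commit message can be illustrated outside of Nim. The sketch below is not the chronos code; it uses a C++ `std::vector` as a stand-in for the `seq`, and the `DescriptorData` record, its fields, and both function names are hypothetical. It shows that an element address taken before the container grows no longer matches the element's location afterwards, and that holding an index instead avoids the problem.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-descriptor record; the field names are illustrative only.
struct DescriptorData {
    int fd;
    int user_data;
};

// Returns true if growing the vector relocated its storage, meaning an
// address taken before the growth would now be dangling.
// Precondition: `table` is non-empty.
bool storage_moved_after_growth(std::vector<DescriptorData>& table) {
    // Keep the old address as an integer so a stale pointer is never dereferenced.
    const auto before = reinterpret_cast<std::uintptr_t>(&table[0]);

    // Grow past the current capacity to force a reallocation.
    const std::size_t target = table.capacity() + 1;
    while (table.size() < target) {
        table.push_back({static_cast<int>(table.size()), 0});
    }

    const auto after = reinterpret_cast<std::uintptr_t>(&table[0]);
    return before != after;
}

// Safer pattern: hold an index and resolve it against the container on
// every access, so growth cannot invalidate the reference.
int read_user_data(const std::vector<DescriptorData>& table, std::size_t index) {
    return table.at(index).user_data;
}
```

Calling `storage_moved_after_growth` on a one-element table and then `read_user_data(table, 0)` returns the value that was stored before the growth, whereas a raw pointer captured beforehand would point into freed memory. The actual fix referenced here took a different route and removed the redundant storage altogether.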
arnetheduck added a commit to status-im/nim-chronos that referenced this issue Apr 11, 2022
arnetheduck added a commit that referenced this issue Apr 11, 2022
* fixes the crash part of #3521, which in turn is a result of the leaks
fixed in #3582
@tersec
Contributor

tersec commented Apr 11, 2022

#3582

@tersec tersec closed this as completed Apr 11, 2022
tersec added a commit that referenced this issue Apr 13, 2022
tersec added a commit that referenced this issue Apr 14, 2022
tersec added a commit that referenced this issue Apr 15, 2022
@BenedettiLucca

I got the same problem on mainnet with Besu as the execution client. Switching to WebSocket solved the crashes.

@etan-status
Contributor

@BenedettiLucca: Did you still have the problems on mainnet with Nimbus v22.9, or was this using an older version of Nimbus?

@BenedettiLucca

> @BenedettiLucca: Did you still have the problems on mainnet with Nimbus v22.9, or was this using an older version of Nimbus?

I'm using Nimbus 22.9 and Besu 22.7.2.
