
Long tx broadcasting delays on geth 1.8.6 #16617

Closed
vogelito opened this issue Apr 30, 2018 · 11 comments
@vogelito

System information

Geth version:

Geth
Version: 1.8.6-stable
Git Commit: 12683feca7483f0b0bf425c3c520e2724f69f2aa
Architecture: amd64
Protocol Versions: [63 62]
Network Id: 1
Go Version: go1.10
Operating System: linux
GOPATH=
GOROOT=/usr/local/go

OS & Version: OSX

Expected behaviour

After calling eth.sendTransaction, the local node should immediately broadcast the transaction to other nodes in the network.

Actual behaviour

After upgrading to 1.8.6 (from 1.8.3) we see that our transactions (sent by calling eth.sendTransaction) are not seen by other nodes for 15-30 minutes. Our node has peers and it is fully synced.

Steps to reproduce the behaviour

Unsure. We upgraded to 1.8.6 and noticed the issue about 32 hours afterwards. We don't have active monitoring for this, so we're not sure whether the condition degraded over time or started happening right after the upgrade.

Backtrace

Attaching debug.stack() output: 2018.04.30_geth_1.8.6_debug_stacks.log

@mtbitcoin

I can confirm that I've observed the same, especially with pending transactions. It takes a while for them to propagate to the other nodes.

@djken2006

djken2006 commented May 3, 2018

When subscribing to pending transactions, I receive transactions that were already mined several minutes ago.
Version: 1.8.6-stable and Version: 1.8.7-stable
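
A rough way to check this, sketched with go-ethereum's Go RPC client (the websocket endpoint below is a placeholder and this is only an illustrative check, not my exact setup): subscribe to newPendingTransactions and look up a receipt for each hash; a receipt only exists once the transaction has been mined, so a hit means the "pending" notification was stale.

```go
package main

import (
	"context"
	"log"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/ethclient"
	"github.com/ethereum/go-ethereum/rpc"
)

func main() {
	ctx := context.Background()

	// Placeholder websocket endpoint; subscriptions need ws:// or IPC.
	rpcClient, err := rpc.Dial("ws://127.0.0.1:8546")
	if err != nil {
		log.Fatal(err)
	}
	eth := ethclient.NewClient(rpcClient)

	hashes := make(chan common.Hash, 128)
	sub, err := rpcClient.EthSubscribe(ctx, hashes, "newPendingTransactions")
	if err != nil {
		log.Fatal(err)
	}
	defer sub.Unsubscribe()

	for h := range hashes {
		// A receipt only exists for mined transactions, so a non-error
		// result here means the "pending" notification arrived late.
		if _, err := eth.TransactionReceipt(ctx, h); err == nil {
			log.Printf("got %s as pending, but it is already mined", h.Hex())
		}
	}
}
```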

@ryanschneider
Contributor

Related to #14669.

I have a code change here (ryanschneider@7b4f6c0) that we've been running, and it seems to help.

Notice in the debug.stack output the large number of promoteTx calls blocked on a chan receive for very long times. We were seeing the same issue until we started running with the above commit (and the other changes I mentioned in #14669).

However, of the 3 changes, the one above is definitely the safest, so I'll go ahead and send it as a PR and reference this issue.

@vogelito
Author

vogelito commented May 4, 2018

Thanks @ryanschneider

@holiman
Contributor

holiman commented May 18, 2018

An update on this: we've identified a couple of quirks that affect both memory usage and transaction propagation. Fixes are in progress.

@vogelito
Author

vogelito commented Jun 5, 2018

Thanks @holiman. We have secondary nodes and scripts that rebroadcast transactions to work around this issue. Should we try turning them off to see whether geth is handling things correctly now? We're running geth v1.8.10.
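
For context, a rebroadcast script along these lines can be sketched with go-ethereum's ethclient package. This is only an illustrative sketch, not our actual script; the node URLs and the transaction hash are placeholders:

```go
package main

import (
	"context"
	"log"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/ethclient"
)

func main() {
	ctx := context.Background()

	// Placeholder endpoints: the node that accepted the transaction and a
	// secondary node used purely for rebroadcasting.
	primary, err := ethclient.Dial("http://primary-node:8545")
	if err != nil {
		log.Fatal(err)
	}
	secondary, err := ethclient.Dial("http://secondary-node:8545")
	if err != nil {
		log.Fatal(err)
	}

	// Placeholder hash of a transaction originally sent via eth.sendTransaction.
	hash := common.HexToHash("0x0000000000000000000000000000000000000000000000000000000000000000")

	tx, isPending, err := primary.TransactionByHash(ctx, hash)
	if err != nil {
		log.Fatal(err)
	}

	// If the primary node still reports the transaction as pending, push the
	// already-signed transaction to the secondary node so it is broadcast
	// from there as well.
	if isPending {
		if err := secondary.SendTransaction(ctx, tx); err != nil {
			log.Printf("rebroadcast failed: %v", err)
		} else {
			log.Printf("rebroadcast %s via secondary node", hash.Hex())
		}
	}
}
```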

@holiman
Contributor

holiman commented Jun 5, 2018

The memory problems were fixed by changing the broadcast process. Previously, we had a single queue that was broadcast serially to all peers. A problematic peer could jam the broadcast, causing two problems:

  1. Hanging/blocked go-routines, eating memory
  2. Failure to broadcast the transactions to the other peers

We changed that to use N channels for N peers, so a problematic peer does not affect the broadcast to the other peers. Additionally, the channel to that peer starts dropping messages once it becomes full, so messages do not pile up in memory (see the sketch at the end of this comment).

So I believe it is fixed, and it would be great to get confirmation of that: if you disable your custom broadcaster, please check whether propagation works as expected.
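
For illustration, here is a minimal sketch of the per-peer pattern described above. The type and field names are illustrative, not go-ethereum's actual broadcast code:

```go
package main

import "fmt"

type tx struct {
	hash string
}

type peer struct {
	id  string
	txs chan tx // buffered queue, one per peer
}

type broadcaster struct {
	peers []*peer
}

// broadcast fans the transaction out to every peer without ever blocking.
func (b *broadcaster) broadcast(t tx) {
	for _, p := range b.peers {
		select {
		case p.txs <- t: // queued for this peer
		default: // this peer's queue is full: drop instead of blocking
			fmt.Printf("dropping tx %s for slow peer %s\n", t.hash, p.id)
		}
	}
}

func main() {
	p1 := &peer{id: "peer-1", txs: make(chan tx, 2)}
	p2 := &peer{id: "peer-2", txs: make(chan tx, 8)}
	b := &broadcaster{peers: []*peer{p1, p2}}

	// Neither queue is drained here; in a real node a per-peer send loop would
	// read from p.txs and write to the network. peer-1's small queue fills up
	// and starts dropping, while peer-2 keeps receiving and nothing ever blocks.
	for i := 0; i < 4; i++ {
		b.broadcast(tx{hash: fmt.Sprintf("0x%02x", i)})
	}
}
```

The key point is the select with a default case: a full per-peer queue drops the message for that peer only, instead of blocking the sender or accumulating unbounded memory.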

@vogelito
Author

vogelito commented Jun 5, 2018

Sounds good, we'll do that and report back

@stale

stale bot commented Jun 6, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the status:inactive label Jun 6, 2019
@adamschmideg
Contributor

I assume this issue is already resolved in the current version. If you experience a similar problem, please open a new issue.

@vogelito
Author

Sorry for never reporting back. I can confirm that we no longer see this issue. Thanks, as always, for your help.
