Geth stop syncing with network for 10 hours, followed by too many open files error #16883

Closed
calvinaco opened this issue Jun 4, 2018 · 4 comments

Comments

@calvinaco

calvinaco commented Jun 4, 2018

System information

Geth version: Geth/v1.8.9-stable-ff9b1461/linux-amd64/go1.10
OS & Version: Ubuntu 16.04
Commit hash :

Expected behaviour

Geth should stay in sync with the network

Actual behaviour

  • geth keeps running but suddenly stops syncing with the network for about 10 hours
    • The JSON-RPC server keeps running and returns an outdated block number during this period
    • net.peerCount consistently shows 25 connected peers during this period
  • After ~10 hours, geth starts logging http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 5ms and never stops
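
A stall like this is easy to miss because the RPC server keeps answering. Below is a minimal watchdog sketch, assuming the node's HTTP RPC is reachable at a hypothetical `RPC_URL` and that "stalled" means the value returned by `eth_blockNumber` has not increased for a chosen number of seconds. The class name, function names, and thresholds are illustrative assumptions, not geth settings.

```python
import json
import time
import urllib.request

RPC_URL = "http://127.0.0.1:8545"  # assumption: local geth HTTP RPC endpoint


def fetch_block_number(url=RPC_URL, timeout=5.0):
    """Ask the node for its current head via the eth_blockNumber JSON-RPC call."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "method": "eth_blockNumber", "params": [], "id": 1}
    ).encode()
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # The result field is a hex string, so parse it in base 16.
        return int(json.load(response)["result"], 16)


class StallDetector:
    """Tracks when the highest observed block number last advanced."""

    def __init__(self, max_idle_seconds):
        self.max_idle_seconds = max_idle_seconds
        self.best_block = None
        self.last_advance = None

    def observe(self, block_number, now):
        """Record one sample; return True if the head has been stuck too long."""
        if self.best_block is None or block_number > self.best_block:
            self.best_block = block_number
            self.last_advance = now
            return False
        return now - self.last_advance > self.max_idle_seconds


def watch(poll_seconds=30, max_idle_seconds=600):
    """Poll forever and print a warning on every poll while the head is stuck."""
    detector = StallDetector(max_idle_seconds)
    while True:
        if detector.observe(fetch_block_number(), time.monotonic()):
            print("block number stuck at", detector.best_block)
        time.sleep(poll_seconds)
```

The detector compares against the highest block seen so far rather than the previous sample, so a sample that briefly reports a lower number does not reset the timer.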

Steps to reproduce the behaviour

It happens seemingly at random, but usually after the node has been running for about a week.

Backtrace

2018-06-04_10:30:29.51348 INFO [06-04|10:30:29] Imported new chain segment               blocks=1  txs=133  mgas=7.972   elapsed=261.803ms  mgasps=30.452  number=5730266 hash=1e6af4…34b7a8 cache=250.74mB
2018-06-04_10:30:54.74518 INFO [06-04|10:30:54] Imported new chain segment               blocks=1  txs=72   mgas=7.962   elapsed=425.775ms  mgasps=18.701  number=5730264 hash=615c90…406f96 cache=251.10mB
2018-06-04_10:30:59.70970 INFO [06-04|10:30:59] Imported new chain segment               blocks=1  txs=195  mgas=7.996   elapsed=299.451ms  mgasps=26.702  number=5730267 hash=232e98…915a2b cache=251.58mB
2018-06-04_10:31:08.82194 INFO [06-04|10:31:08] Imported new chain segment               blocks=1  txs=221  mgas=7.987   elapsed=302.543ms  mgasps=26.399  number=5730268 hash=f2c702…8b6f57 cache=251.73mB
2018-06-04_10:31:09.78174 INFO [06-04|10:31:09] Imported new chain segment               blocks=1  txs=168  mgas=6.782   elapsed=136.483ms  mgasps=49.694  number=5730267 hash=46c08e…6c1699 cache=251.87mB
2018-06-04_10:31:32.31166 INFO [06-04|10:31:32] Imported new chain segment               blocks=1  txs=71   mgas=7.910   elapsed=86.345ms   mgasps=91.614  number=5730267 hash=586b8f…d4586c cache=251.93mB
2018-06-04_10:31:36.68506 INFO [06-04|10:31:36] Imported new chain segment               blocks=1  txs=97   mgas=7.965   elapsed=218.409ms  mgasps=36.470  number=5730269 hash=96ed21…09f5a3 cache=251.66mB
2018-06-04_10:31:41.04692 INFO [06-04|10:31:41] Imported new chain segment               blocks=1  txs=144  mgas=7.996   elapsed=265.208ms  mgasps=30.150  number=5730270 hash=da0e68…593e30 cache=251.56mB
2018-06-04_10:31:58.49037 INFO [06-04|10:31:58] Imported new chain segment               blocks=1  txs=86   mgas=7.985   elapsed=189.417ms  mgasps=42.154  number=5730271 hash=934fe6…597ac3 cache=251.92mB
2018-06-04_10:32:02.94437 INFO [06-04|10:32:02] Imported new chain segment               blocks=1  txs=100  mgas=7.992   elapsed=181.368ms  mgasps=44.063  number=5730272 hash=5d3417…08d039 cache=251.86mB
2018-06-04_10:32:15.92040 INFO [06-04|10:32:15] Imported new chain segment               blocks=1  txs=65   mgas=7.993   elapsed=188.916ms  mgasps=42.309  number=5730273 hash=9c1994…e7faaa cache=251.49mB
2018-06-04_10:32:17.50921 INFO [06-04|10:32:17] Imported new chain segment               blocks=1  txs=285  mgas=8.000   elapsed=412.995ms  mgasps=19.370  number=5730273 hash=c397e8…6568f1 cache=252.19mB
2018-06-04_10:32:28.17922 INFO [06-04|10:32:28] Imported new chain segment               blocks=1  txs=179  mgas=7.985   elapsed=417.219ms  mgasps=19.138  number=5730274 hash=df4fa7…e02efd cache=252.42mB
2018-06-04_10:32:28.55545 INFO [06-04|10:32:28] Imported new chain segment               blocks=1  txs=169  mgas=7.981   elapsed=242.306ms  mgasps=32.939  number=5730274 hash=541f16…3144ad cache=252.60mB
2018-06-04_10:32:36.66976 INFO [06-04|10:32:36] Imported new chain segment               blocks=1  txs=326  mgas=8.000   elapsed=253.852ms  mgasps=31.513  number=5730275 hash=4eb8c8…317ff7 cache=252.40mB
2018-06-04_10:33:44.79329 INFO [06-04|10:33:44] Imported new chain segment               blocks=2  txs=2389 mgas=15.965  elapsed=475.924ms  mgasps=33.546  number=5730276 hash=a94b31…bb6bbb cache=253.01mB ignored=15
2018-06-05_00:59:26.04264 2018/06/05 00:59:26 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 5ms
2018-06-05_00:59:26.04775 2018/06/05 00:59:26 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 10ms
2018-06-05_00:59:26.05786 2018/06/05 00:59:26 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 20ms
2018-06-05_00:59:26.07795 2018/06/05 00:59:26 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 40ms
2018-06-05_00:59:26.11809 2018/06/05 00:59:26 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 80ms
2018-06-05_00:59:32.34530 2018/06/05 00:59:32 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 5ms
2018-06-05_00:59:32.35039 2018/06/05 00:59:32 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 10ms
2018-06-05_00:59:32.36056 2018/06/05 00:59:32 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 5ms
2018-06-05_00:59:32.36568 2018/06/05 00:59:32 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 10ms
2018-06-05_00:59:32.37582 2018/06/05 00:59:32 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 20ms
2018-06-05_00:59:32.39594 2018/06/05 00:59:32 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 40ms
2018-06-05_00:59:32.43605 2018/06/05 00:59:32 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 80ms
2018-06-05_00:59:32.51616 2018/06/05 00:59:32 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 160ms
2018-06-05_00:59:32.67632 2018/06/05 00:59:32 http: Accept error: accept tcp [::]:8545: accept4: too many open files; retrying in 320ms
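
The quiet period is visible in the timestamps alone. Here is a small sketch that finds the largest silence in a log, assuming every line starts with the `YYYY-MM-DD_HH:MM:SS.fffff` prefix shown above. The helper names are illustrative, and the sample lines are abbreviated copies of three lines from the excerpt.

```python
from datetime import datetime

# Format of the leading timestamp added in front of each log line.
PREFIX_FORMAT = "%Y-%m-%d_%H:%M:%S.%f"


def line_time(line):
    """Parse the leading timestamp of one log line."""
    return datetime.strptime(line.split(" ", 1)[0], PREFIX_FORMAT)


def longest_gap(lines):
    """Return (gap, line_before, line_after) for the largest silence, or None."""
    best = None
    for before, after in zip(lines, lines[1:]):
        gap = line_time(after) - line_time(before)
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best


# Abbreviated copies of lines from the excerpt above.
sample = [
    "2018-06-04_10:32:36.66976 INFO [06-04|10:32:36] Imported new chain segment",
    "2018-06-04_10:33:44.79329 INFO [06-04|10:33:44] Imported new chain segment",
    "2018-06-05_00:59:26.04264 2018/06/05 00:59:26 http: Accept error",
]
gap, before, after = longest_gap(sample)
print(gap)
```

For this excerpt the largest silence falls between the last import at 10:33:44 and the first accept error at 00:59:26 the next day.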

debug.stacks()
debug_stacks.log

@calvinaco
Author

calvinaco commented Jun 4, 2018

Attaching CPU and network utilization during the "stop syncing" period for reference

[Screenshot: CPU and network utilization, taken 2018-06-04 at 7:08:12 pm]

@calvinaco changed the title from "Geth stop syncing with network for an hour while still running" to "Geth stop syncing with network for for 10 hours, followed by too many open files" Jun 5, 2018
@calvinaco changed the title from "Geth stop syncing with network for for 10 hours, followed by too many open files" to "Geth stop syncing with network for for 10 hours, followed by too many open files error" Jun 5, 2018
@calvinaco changed the title from "Geth stop syncing with network for for 10 hours, followed by too many open files error" to "Geth stop syncing with network for 10 hours, followed by too many open files error" Jun 5, 2018
@karalabe
Member

karalabe commented Jun 5, 2018

Are you running a publicly open RPC server? That is probably a bad idea as anyone from the internet can grief your node. We've added some limits on master that forcefully close idle connections, which should definitely help, but please make sure you indeed want the entire internet to access your machine.
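
One way to check whether open connections are what exhausts the descriptor limit is to compare the process's open descriptors with its soft limit. This is a minimal sketch, assuming a Linux host with `/proc` mounted and enough permissions to read the target process's entries. The function names are illustrative, and the PID of the geth process has to be supplied by the operator.

```python
import os


def count_open_fds(pid):
    """Count entries in /proc/<pid>/fd, one per open descriptor."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def parse_soft_nofile_limit(limits_text):
    """Extract the soft 'Max open files' value from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns after the label are: soft limit, hard limit, units.
            soft = line[len("Max open files"):].split()[0]
            return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid):
    """Return (open descriptors, soft limit) for the given process."""
    with open(f"/proc/{pid}/limits") as handle:
        limit = parse_soft_nofile_limit(handle.read())
    return count_open_fds(pid), limit
```

Calling `fd_usage(pid)` periodically with the geth PID would show whether the count climbs toward the limit before the accept errors begin.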

@calvinaco
Author

Thanks @karalabe. I am running the node on a trial basis, but I will definitely restrict access later. For the limits you mention, are you referring to #16880?

@stale

stale bot commented Jul 1, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the status:inactive label Jul 1, 2019
@stale stale bot closed this as completed Aug 12, 2019