Memory used increasing slowly #17450
Opening up the HTTP/WebSocket interfaces to outside traffic is dangerous because people are actively trying to break into nodes. Are you sure you need remote access to your node via HTTP? That should only be used if you're behind a firewall and can control access. Couldn't you use SSH + IPC to attach to a remote node? With regard to memory use and RPC, what requests are you making? I can imagine that there might be some leak in our code, but providing some details about your usage could be invaluable to track it down.
Hi, yes, but I need to hit the smart contract from any origin. Is there any way to accomplish that without opening up the HTTP interface? It is a private network that must be accessible from any origin (MetaMask, MyEtherWallet, any Ethereum wallet...).

About memory: usage increases with every transaction submitted. I thought it might be related to the garbage collector, maybe it isn't executing... I sent 100 transactions and that increased my RAM usage by about 50 MB. Is there any information I can provide in order to identify the possible leak? I think the problem might be related to the fact that the RPC port is open and anyone could be doing something that reserves memory.

The node was started with:

```
geth --datadir e1/ --syncmode 'full' --port 30357 --rpc --rpcport 8545 --rpccorsdomain '*' --rpcaddr 'server_ip' --ws --wsaddr "server_ip" --wsorigins "some_ip" --wsport 9583 --wsapi 'db,eth,net,web3,txpool,miner' --networkid 21 --gasprice '1'
```

So there is no exposed rpcapi, only ws, and only from one specific IP. Any ideas on how I can troubleshoot this? The memory only increases when I send transactions to the blockchain using MetaMask.
Recently I attached a console and sent 400 transactions (hitting a smart contract) in a for loop. The memory used increased by 400 MB. Any idea of what I can look into to check what is causing this? After 20-30 minutes it goes back to the previous RAM usage, or a few MB more. Is this normal? The tendency is that, as time passes, the memory used increases whenever transactions are submitted.
I have tried on another chain that doesn't expose an RPC endpoint, sending 5000 transactions from the geth console. Memory started at 1140 MB and after 5000 transactions grew to 1550 MB, so geth takes about 400 MB to process those transactions. Since the block time is 15 seconds, it will take a while to confirm them, so is it normal to stay at 1550 MB for a while? Also, the cache memory is still increasing. Is there anything I can share with you to check whether this behavior is ok? It seems like after 10k transactions the cache used by geth grew by 400 MB and the used memory also increased. The values are not constant after sending a batch of transactions; maybe it is standard for geth to consume more memory the heavier the chain is. Also, without RPC interactions the cache used still increases.
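A side note that may explain part of what is observed here (this is an assumption about the cause, not a confirmed diagnosis): geth is written in Go, and the Go runtime hands freed heap memory back to the operating system lazily, so the RSS shown by `top` can stay elevated for a while after a burst of transactions even though the live heap has already shrunk. The sketch below shows the counters that separate live heap from memory merely reserved from the OS; the geth console's `debug.memStats()` reports the same `runtime.MemStats` fields.

```go
package main

import (
	"fmt"
	"runtime"
)

// Illustrative only: print the counters that distinguish live heap (HeapAlloc)
// from memory reserved from the OS (HeapSys) and memory already handed back
// (HeapReleased). RSS in `top` roughly tracks HeapSys minus HeapReleased,
// which shrinks much more slowly than HeapAlloc after a garbage collection.
func printMemStats() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("HeapAlloc:    %4d MiB (live objects)\n", m.HeapAlloc>>20)
	fmt.Printf("HeapSys:      %4d MiB (reserved from OS)\n", m.HeapSys>>20)
	fmt.Printf("HeapReleased: %4d MiB (returned to OS)\n", m.HeapReleased>>20)
	fmt.Printf("NumGC:        %4d\n", m.NumGC)
}

func main() {
	printMemStats()
}
```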
The problem gets worse in 1.8.20; I have to restart the geth node every half day due to the high memory usage.
Same problem here. Version: 1.8.16-stable

```
top - 00:58:06 up 21:28, 3 users, load average: 2.39, 2.18, 1.28
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
```
I have the same problem. It seems there is a memory leak somewhere. It also seems that if the node is mining and opens the RPC endpoint, there is no problem. If the node is not mining, then memory increases steadily, as @marcosmartinez7 reports, when sending thousands of transactions continuously via the RPC endpoint.
On version 1.8.18 I haven't experienced this problem anymore; I mean, the memory is increasing but it reaches a stable maximum value. Take into account that geth keeps some information in memory that is written to disk on each epoch, and an epoch is about 20-30k blocks. So if you're using less than 4 GB of RAM I think this can happen.
Thank you for your information @marcosmartinez7. By the way, I can still reproduce the issue on 1.8.20 & 1.8.21 with 16 GB of RAM by sending ten thousand transactions (2000 txs per sealed block, and the txpool always fills up with ~10,000 txs). And one thing is very strange: a mining node with a public RPC endpoint doesn't have the issue on the same computing configuration. So I think there must be a memory leak somewhere.

[Attached chart: non-mining node (memory leak)]
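For anyone else trying to reproduce this, here is a minimal sketch of the kind of load described above, using go-ethereum's ethclient to push a batch of simple transfers at the node's RPC endpoint. The endpoint URL, chain ID, private key, and recipient address are placeholders, not values taken from this issue.

```go
package main

import (
	"context"
	"log"
	"math/big"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/core/types"
	"github.com/ethereum/go-ethereum/crypto"
	"github.com/ethereum/go-ethereum/ethclient"
)

func main() {
	// Hypothetical endpoint and credentials; replace with your own.
	client, err := ethclient.Dial("http://127.0.0.1:8545")
	if err != nil {
		log.Fatal(err)
	}
	key, err := crypto.HexToECDSA("ac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80")
	if err != nil {
		log.Fatal(err)
	}
	from := crypto.PubkeyToAddress(key.PublicKey)
	to := common.HexToAddress("0x0000000000000000000000000000000000000001")

	nonce, err := client.PendingNonceAt(context.Background(), from)
	if err != nil {
		log.Fatal(err)
	}
	signer := types.NewEIP155Signer(big.NewInt(21)) // chain ID assumed, may differ from --networkid

	// Fire 5000 minimal value transfers so they pile up in the txpool.
	for i := 0; i < 5000; i++ {
		tx := types.NewTransaction(nonce+uint64(i), to, big.NewInt(1), 21000, big.NewInt(1), nil)
		signed, err := types.SignTx(tx, signer, key)
		if err != nil {
			log.Fatal(err)
		}
		if err := client.SendTransaction(context.Background(), signed); err != nil {
			log.Fatal(err)
		}
	}
	log.Println("submitted 5000 transactions; watch the node's memory while they sit in the txpool")
}
```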
Most of the leaked goroutines are blocked in feed.Send(), as shown in the attached stack trace.
The goroutine leak might be the feed.Send() blocking issue reported in #18021.
The issue is not that Feed.Send() is blocking, it's that the send to the feed happens in a background goroutine. Please provide a longer stack trace so we can see which part of the system is trying to send on the feed.
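To make that distinction concrete, here is a small self-contained sketch (not code from geth itself) of the pattern under discussion: event.Feed.Send only returns once every subscriber's channel has accepted the value, so if each announcement is fired from its own background goroutine and one subscriber stops draining, those goroutines pile up instead of any single call visibly blocking.

```go
package main

import (
	"fmt"
	"runtime"
	"time"

	"github.com/ethereum/go-ethereum/event"
)

func main() {
	var feed event.Feed

	// One subscriber on an unbuffered channel that nobody ever reads from,
	// standing in for a slow or stuck event consumer.
	stuck := make(chan int)
	sub := feed.Subscribe(stuck)
	defer sub.Unsubscribe()

	// Each send happens in its own background goroutine, mirroring the
	// "go feed.Send(...)" pattern. None of them can finish.
	for i := 0; i < 1000; i++ {
		go feed.Send(i)
	}

	time.Sleep(time.Second)
	fmt.Println("goroutines:", runtime.NumGoroutine()) // ~1000 of them blocked inside Send
}
```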
I think that the issue happens as @liuzhijun23 figured out in #18021: when the for loop is blocked in feed.Send(), another call will be stuck at line 133. It is almost always from the txpool.
@fjl a clue: on the non-mining node, the [...]

```go
if cases[i].Chan.TrySend(rvalue) {
	nsent++
	cases = cases.deactivate(i)
	i--
}
```
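For readers not familiar with that snippet (it is from the delivery loop in Feed.Send): TrySend is reflect's non-blocking channel send, so if no subscriber is currently ready to receive, it simply returns false, the case is never deactivated, and Send stays blocked waiting for a receiver. A tiny standalone illustration, assuming an unbuffered channel with no receiver:

```go
package main

import (
	"fmt"
	"reflect"
)

func main() {
	// An unbuffered channel with no goroutine receiving on it.
	ch := make(chan int)

	c := reflect.ValueOf(ch)
	ok := c.TrySend(reflect.ValueOf(42)) // non-blocking send attempt

	fmt.Println("delivered:", ok) // false: no ready receiver, nothing was sent
}
```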
@hadv can you produce a full trace and upload it somewhere? There's a [...] That should show what particular receiver is bottlenecking the events. OBS: if you do [...]
@holiman hope this will help https://drive.google.com/open?id=1xK6F95bmLsleuvM9qMQaH2Y7K9Gk2ZFD
You've got [...]

If you are batch-adding thousands of transactions, and doing an [...]
@holiman Can you please explain to us why only the non-mining node needs to run the code below? That might be the reason why only the non-mining node faces the goroutine leak issue, right? Thank you!

```go
case ev := <-w.txsCh:
	// Apply transactions to the pending state if we're not mining.
	//
	// Note all transactions received may not be continuous with transactions
	// already included in the current mining block. These transactions will
	// be automatically eliminated.
	if !w.isRunning() && w.current != nil {
		w.mu.RLock()
		coinbase := w.coinbase
		w.mu.RUnlock()

		txs := make(map[common.Address]types.Transactions)
		for _, tx := range ev.Txs {
			acc, _ := types.Sender(w.current.signer, tx)
			txs[acc] = append(txs[acc], tx)
		}
		txset := types.NewTransactionsByPriceAndNonce(w.current.signer, txs)
		w.commitTransactions(txset, coinbase, nil)
		w.updateSnapshot()
	} else {
```
I don't know yet... However, there appears to be ~10K routines spawned by [...]

I think an underlying problem is that a better model for the transaction handling would be to use active objects (one thread/routine) which receives data, instead of each sender spawning its own goroutine (https://github.com/ethereum/go-ethereum/blob/master/core/tx_pool.go#L1000).

Btw, @hadv, I don't know what code you're running, but the line numbers from your stack do not match up with what's on master now. I'm not sure if there are any simple solutions to this ticket, since IMO it would probably require a non-trivial rewrite of tx pool internals.
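A rough sketch of the "active object" shape being suggested, not a patch against geth (the type and method names below are made up for illustration): callers hand their transactions to a single long-lived loop over a bounded channel, and only that loop ever calls feed.Send, so a slow subscriber creates backpressure in one place instead of an unbounded number of blocked goroutines.

```go
package main

import (
	"github.com/ethereum/go-ethereum/core"
	"github.com/ethereum/go-ethereum/core/types"
	"github.com/ethereum/go-ethereum/event"
)

// txAnnouncer is a hypothetical "active object": one goroutine owns the feed.
type txAnnouncer struct {
	feed  event.Feed
	queue chan []*types.Transaction
}

func newTxAnnouncer() *txAnnouncer {
	a := &txAnnouncer{queue: make(chan []*types.Transaction, 4096)}
	go a.loop()
	return a
}

// loop is the only place feed.Send is ever called from.
func (a *txAnnouncer) loop() {
	for txs := range a.queue {
		a.feed.Send(core.NewTxsEvent{Txs: txs})
	}
}

// Announce replaces the per-caller "go feed.Send(...)" pattern: callers block
// only once the bounded queue is full, instead of leaking a goroutine each.
func (a *txAnnouncer) Announce(txs []*types.Transaction) {
	a.queue <- txs
}

func main() {
	ann := newTxAnnouncer()
	events := make(chan core.NewTxsEvent, 16)
	ann.feed.Subscribe(events)

	ann.Announce([]*types.Transaction{})
	<-events // the single loop goroutine delivered the announcement
}
```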
Okay, thank you for your information. Regarding the code: I'm adding some logging to figure out the issue, so the line numbers might differ from master, but the logic is the same.
It might want to refresh the pending state, so that the RPC client can get the latest pending information, for example the nonce of an account, and users don't have to maintain that information themselves.
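For context on why that pending state matters to clients, a small sketch (endpoint and address are placeholders) of the typical call a wallet or script makes: it asks the node for the pending nonce so it can queue the next transaction without tracking nonces locally, which only works if the node keeps applying incoming transactions to a pending state.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/ethclient"
)

func main() {
	// Hypothetical endpoint and account; substitute your own values.
	client, err := ethclient.Dial("http://127.0.0.1:8545")
	if err != nil {
		log.Fatal(err)
	}
	account := common.HexToAddress("0x0000000000000000000000000000000000000001")

	// Asks for the transaction count at the "pending" block tag, i.e. the
	// nonce as seen after applying transactions to the node's pending state.
	nonce, err := client.PendingNonceAt(context.Background(), account)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("next nonce to use:", nonce)
}
```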
That we can understand, obviously, but the mining and non-mining nodes use different ways to apply the pending transactions, and the way the non-mining node does it leaks goroutines.
I think a simple fix to try would be removing the [...]
Yeah, I'm afraid that if Feed.Send() is blocked, then the whole txpool is blocked as well.
By the way, I think we have enough information for the issue now, so could you remove the label [...]?
Here's some data from a [...] The first drop was when the node was upgraded from [...]
@fjl is there any more information you need for this issue?
This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have more relevant information or answers to our questions so that we can investigate further.
@karalabe @fjl @holiman We already provided the detailed information. Please remove the inappropriate label and re-open this issue. Thank you!
Opened a new issue: #19192
System information
Geth version: 1.8.12-stable
OS & Version: Linux 16.04
Expected behaviour
Memory usage stays constant
Actual behaviour
Hi @karalabe,
I'm running the node without the rpcapis. The node started 3 days ago using 1.9% of my RAM (8 GB). Now it is consuming 2.3% and it keeps increasing slowly (about 10 MB/h).
I ran the node without specifying the --cache flag, so I assume it is using 1 GB.
Is this something that I should worry about, or might it be related to garbage collection?
Steps to reproduce the behaviour
I ran the node with this command:
```
geth --datadir e1/ --syncmode 'full' --port 30357 --rpc --rpcport 8545 --rpccorsdomain '*' --rpcaddr 'server_ip' --ws --wsaddr "server_ip" --wsorigins "some_ip" --wsport 9583 --wsapi 'db,eth,net,web3,txpool,miner' --networkid 21 --gasprice '1'
```