
Occasionally got "Block submitted via getwork does not meet the required proof of work" when submitting work #1566

Closed
YihaoPeng opened this issue Jan 10, 2019 · 13 comments


@YihaoPeng

YihaoPeng commented Jan 10, 2019

In some cases, you may get an error when you submit a mined block via the getwork RPC:

Block submitted via getwork does not meet the required proof of work: block hash of xxxxxx is higher than expected max of xxxxxx

For example, my application called the getwork RPC with these params:

{"jsonrpc":"1.0","id":"1","method":"getwork","params":["0500000077d727e854739b8cdf84d763078ee7e15aa4757fc231ca3f00000000000000009ab4b84001f7ebc6ad9bc524b8ed00a8f16997463dcbf60f5693dfdfaf1f9ab3a392b0fe93d9d0d1b76c815e24c2f7724adee5921e5a22b2d373ceef6bf441690100176e5a3e0b36050000007ca0000072ce5218d9880487020000008eb30400ab1e00008092355c3259a782046d11000090959711faaa020000000000000000000000000000000000000000050000008000000100000000000005a0"]}

Then I got this response:

{"jsonrpc":"1.0","result":false,"error":null,"id":"1"}

Then dcrd recorded this in its log:

dcrd.log:2019-01-09 06:20:06.423 [ERR] RPCS: Block submitted via getwork does not meet the required proof of work: block hash of 73bc321a8b2c6a8bd20059d13e64a7d7b271c0e07441e2b4ad52bdb634877191 is higher than expected max of 000000000000000052ce72000000000000000000000000000000000000000000

But when I compute the hash of the submission myself, I get 00000000000000b0d1016a2bcf905d1149dbead9396ee37e44075eb7eb5cb1dc. I did this by adding a new test case to wire/blockheader_test.go:

func TestBlockHeaderHashing2(t *testing.T) {
	dummyHeader := "0500000077d727e854739b8cdf84d763078ee7e15aa4757fc231ca3f00000000000000009ab4b84001f7ebc6ad9bc524b8ed00a8f16997463dcbf60f5693dfdfaf1f9ab3a392b0fe93d9d0d1b76c815e24c2f7724adee5921e5a22b2d373ceef6bf441690100176e5a3e0b36050000007ca0000072ce5218d9880487020000008eb30400ab1e00008092355c3259a782046d11000090959711faaa02000000000000000000000000000000000000000005000000"
	// This hash has reversed endianness compared to what chainhash spits out.
	hashStr := "dcb15cebb75e07447ee36e39d9eadb49115d9cf2b6a01d1b0000000000000000"
	hashB, _ := hex.DecodeString(hashStr)
	hash, _ := chainhash.NewHash(hashB)

	vecH, _ := hex.DecodeString(dummyHeader)
	r := bytes.NewReader(vecH)
	var bh BlockHeader
	bh.Deserialize(r)
	hash2 := bh.BlockHash()

	if !hash2.IsEqual(hash) {
		t.Errorf("wrong block hash returned (want %v, got %v)", hash,
			hash2)
	}
}

It should be a valid, solved block header, but dcrd rejected it.

This problem does not always happen; in most cases everything is fine.
But it keeps coming back: it has occurred 8 times in two months (so I lost 8 mined blocks).

I did not notice this issue until last week, so last week I upgraded dcrd from 1.2 to 1.4.0-rc1. But just yesterday the problem reappeared.

I suspect that dcrd has some kind of race condition: the merkle root recalculated when checking the difficulty differs from the one used when the work was created, which makes the block hash mismatch completely.


About my deployment:
I have two dcrd instances running in Docker; they were v1.2.0 at first (Dockerfile). Last week I upgraded them to 1.4.0-rc1 (Dockerfile). Both use binaries from https://github.com/decred/decred-binaries.

About the issue:
My pool submits about 20 blocks per day to the two dcrd instances (about 10 blocks to each). So far there have been 4 such failures on each dcrd (8 blocks in total). The interval between occurrences is as short as three days and as long as 12 days; it looks completely random.

I use this pool code: https://github.com/btccom/btcpool/

@YihaoPeng
Author

YihaoPeng commented Jan 10, 2019

I think I need a temporary fix to reduce the loss of mining rewards. If anyone has any suggestions for mitigating or fixing the problem, please feel free to share them with me.

The most important thing for me is that the problem can be mitigated and my blocks can be broadcast to the network. I don't care much about security or code quality (for me, there is not much loss in broadcasting an invalid block, but losing a valid block is a big monetary loss).

@davecgh
Member

davecgh commented Jan 10, 2019

Thanks for the report. We'll dig in a bit to see if we can reproduce and/or spot the issue.

To clarify something though: I noticed you said you have two nodes. Are you sure you are submitting the solution back to the specific node that provided the work? The reason I ask is that getwork functions by having the mining code construct a block template and hand out the associated header for the miner to solve and submit. That means when the solution is presented, the node has to look up the associated template in order to reconstruct the full block. So, if you received work from node A and then submit that solved work to node B (which has also previously handed out work with the same merkle root pair, but whose template is not otherwise 100% identical to node A's), your solution for node A's work would indeed be incorrect for the associated template in node B.
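
For illustration only, here is a minimal, hypothetical Go sketch (the workRouter type and its method names are made up; this is not btcpool or dcrd code) of how a pool could remember which node handed out each piece of getwork data, keyed the same way dcrd keys its template pool (MerkleRoot followed by StakeRoot), so that each solution is submitted back to the node that produced the template:

package pool

import (
	"encoding/hex"
	"errors"
	"sync"
)

// rootPair is MerkleRoot (header bytes 36..67) followed by StakeRoot
// (bytes 68..99), the same 64-byte key dcrd uses for its template pool.
type rootPair [64]byte

type workRouter struct {
	mu     sync.Mutex
	origin map[rootPair]string // root pair -> RPC address of the node
}

func newWorkRouter() *workRouter {
	return &workRouter{origin: make(map[rootPair]string)}
}

// keyOf extracts the MerkleRoot+StakeRoot pair from hex-encoded getwork data.
func keyOf(workHex string) (rootPair, error) {
	var k rootPair
	raw, err := hex.DecodeString(workHex)
	if err != nil {
		return k, err
	}
	if len(raw) < 100 {
		return k, errors.New("work data too short")
	}
	copy(k[:], raw[36:100])
	return k, nil
}

// remember records which node handed out the given work.
func (r *workRouter) remember(workHex, node string) error {
	k, err := keyOf(workHex)
	if err != nil {
		return err
	}
	r.mu.Lock()
	r.origin[k] = node
	r.mu.Unlock()
	return nil
}

// lookup returns the node a solved submission should be sent back to.
func (r *workRouter) lookup(solvedHex string) (string, bool) {
	k, err := keyOf(solvedHex)
	if err != nil {
		return "", false
	}
	r.mu.Lock()
	node, ok := r.origin[k]
	r.mu.Unlock()
	return node, ok
}

A pool would call remember when it fetches work from each node and lookup just before submitting the solved data.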

@YihaoPeng
Author

In my implementation, I submit to both nodes at the same time.

And this should be the log from the node that does not have a matching template for the work:

dcrd.log:2019-01-09 06:20:06.417 [ERR] RPCS: Block submitted via getwork has no matching template for merkle root b39a1fafdfdf93560ff6cb3d469769f1a800edb824c59badc6ebf70140b8b49a

So the other one is the node with the matching template:

dcrd.log:2019-01-09 06:20:06.423 [ERR] RPCS: Block submitted via getwork does not meet the required proof of work: block hash of 73bc321a8b2c6a8bd20059d13e64a7d7b271c0e07441e2b4ad52bdb634877191 is higher than expected max of 000000000000000052ce72000000000000000000000000000000000000000000

@YihaoPeng
Author

YihaoPeng commented Jan 11, 2019

In rpcserver.go:

	// Create the new merkleRootPair key which is MerkleRoot + StakeRoot
	var merkleRootPair [merkleRootPairSize]byte
	copy(merkleRootPair[:chainhash.HashSize], submittedHeader.MerkleRoot[:])
	copy(merkleRootPair[chainhash.HashSize:], submittedHeader.StakeRoot[:])

	// Look up the full block for the provided data based on the merkle
	// root.  Return false to indicate the solve failed if it's not
	// available.
	blockInfo, ok := s.templatePool[merkleRootPair]
	if !ok {
		rpcsLog.Errorf("Block submitted via getwork has no matching "+
			"template for merkle root %s",
			submittedHeader.MerkleRoot)
		return false, nil
	}

The program did not return here, which proves that both the MerkleRoot and the StakeRoot in the submission match the node's work.

The code that follows:

	// Reconstruct the block using the submitted header stored block info.
	// A temporary block is used because we will be mutating the contents
	// for the construction of the correct regular merkle tree. You must
	// also deep copy the block itself because it could be accessed outside
	// of the GW workstate mutexes once it gets submitted to the
	// blockchain.
	tempBlock := dcrutil.NewBlockDeepCopy(blockInfo.msgBlock)
	msgBlock := tempBlock.MsgBlock()
	msgBlock.Header = submittedHeader
	if msgBlock.Header.Height > 1 {
		pkScriptCopy := make([]byte, len(blockInfo.pkScript))
		copy(pkScriptCopy, blockInfo.pkScript)
		msgBlock.Transactions[0].TxOut[1].PkScript = blockInfo.pkScript
		merkles := blockchain.BuildMerkleTreeStore(tempBlock.Transactions())
		msgBlock.Header.MerkleRoot = *merkles[len(merkles)-1]
	}

	// The real block to submit, with a proper nonce and extraNonce.
	block := dcrutil.NewBlockDeepCopyCoinbase(msgBlock)

	// Ensure the submitted block hash is less than the target difficulty.
	err = blockchain.CheckProofOfWork(&block.MsgBlock().Header,
		activeNetParams.PowLimit)
	if err != nil {
		// Anything other than a rule violation is an unexpected error,
		// so return that error as an internal error.
		if _, ok := err.(blockchain.RuleError); !ok {
			return false, rpcInternalError("Unexpected error "+
				"while checking proof of work: "+err.Error(),
				"")
		}

		rpcsLog.Errorf("Block submitted via getwork does not meet "+
			"the required proof of work: %v", err)
		return false, nil
	}

I guess that without the msgBlock.Header.MerkleRoot = *merkles[len(merkles)-1] assignment, the block hash would be right.
I will add a line to log the old and new merkle roots.
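
Roughly something like this (a sketch only; the variable names are illustrative, and the real change is just the extra log line), placed right after the merkle tree is rebuilt in the snippet above:

	// Sketch: log the merkle root from the submitted header, the root stored
	// in the matched template, and the root just recomputed from the
	// template's transactions.
	rpcMerkleRoot := submittedHeader.MerkleRoot
	oldMerkleRoot := blockInfo.msgBlock.Header.MerkleRoot
	newMerkleRoot := msgBlock.Header.MerkleRoot
	rpcsLog.Infof("rpcMerkleRoot: %v, oldMerkleRoot: %v, newMerkleRoot: %v",
		rpcMerkleRoot, oldMerkleRoot, newMerkleRoot)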

@YihaoPeng
Author

YihaoPeng commented Jan 11, 2019

Yesterday the problem appeared again:

The work submission request (via getwork):

{"jsonrpc":"1.0","id":"1","method":"getwork","params":["050000007a97064f8f4c0d0a9bae925482673562d593a4797719ac340000000000000000817374720fb2413085b8a0457ab32b86a91eb73bd57d188d6a7df7f81331b822bed67ee4b32b150c1822b60f62098744777859e667ab79144286a2c0c46faec501008f61701b18be05000600f49f00005961461876a3138e020000001bb50400ef390000268d375cbfc2844a443c050000f429ce989766030000000000000000000000000000000000000000050000008000000100000000000005a0"]}

One node's log (no matching template, as expected):

2019-01-10 18:21:43.077 [ERR] RPCS: Block submitted via getwork has no matching template for merkle root 22b83113f8f77d6a8d187dd53bb71ea9862bb37a45a0b8853041b20f72747381

The other node's log (matching template, abnormal):

2019-01-10 18:21:43.082 [ERR] RPCS: Block submitted via getwork does not meet the required proof of work: block hash of 34eb58651921d521fa12712c5f8e800c155d1a7ba293371c05ef897440b4c671 is higher than expected max of 0000000000000000466159000000000000000000000000000000000000000000

But the hash should be 000000000000000001d96a9e7c56bc8b95e90e5f964774e2f970d82b31f304b1.

@dajohi
Member

dajohi commented Jan 11, 2019

decred/gominer used to see that. Does your code have a similar check?

https://github.com/decred/gominer/blob/master/device.go#L283
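
The kind of check in question, roughly: recompute the block hash of the solved header on the miner/pool side and make sure it is not above the target before submitting. A hedged sketch using dcrd's wire and chainhash packages (the helper names are made up; this is not the actual gominer code):

package miner

import (
	"bytes"
	"encoding/hex"
	"fmt"
	"math/big"

	"github.com/decred/dcrd/chaincfg/chainhash"
	"github.com/decred/dcrd/wire"
)

// hashToBig interprets a block hash as a big-endian integer so it can be
// compared against a target (same idea as blockchain.HashToBig).
func hashToBig(hash *chainhash.Hash) *big.Int {
	buf := *hash
	for i, j := 0, len(buf)-1; i < j; i, j = i+1, j-1 {
		buf[i], buf[j] = buf[j], buf[i]
	}
	return new(big.Int).SetBytes(buf[:])
}

// verifySolvedHeader deserializes the solved header from the hex-encoded
// getwork data and returns an error if its hash exceeds the target.
func verifySolvedHeader(headerHex string, target *big.Int) error {
	raw, err := hex.DecodeString(headerHex)
	if err != nil {
		return err
	}
	var header wire.BlockHeader
	if err := header.Deserialize(bytes.NewReader(raw)); err != nil {
		return err
	}
	hash := header.BlockHash()
	if hashToBig(&hash).Cmp(target) > 0 {
		return fmt.Errorf("block hash %v is higher than target %064x",
			hash, target)
	}
	return nil
}

The target itself would be derived from the Bits field of the work via the usual compact representation.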

@YihaoPeng
Author

YihaoPeng commented Jan 14, 2019

decred/gominer used to see that. Does your code have a similar check?

https://github.com/decred/gominer/blob/master/device.go#L283

Yes, my code computes the hash by itself.

For this submission:

{"jsonrpc":"1.0","id":"1","method":"getwork","params":["050000007a97064f8f4c0d0a9bae925482673562d593a4797719ac340000000000000000817374720fb2413085b8a0457ab32b86a91eb73bd57d188d6a7df7f81331b822bed67ee4b32b150c1822b60f62098744777859e667ab79144286a2c0c46faec501008f61701b18be05000600f49f00005961461876a3138e020000001bb50400ef390000268d375cbfc2844a443c050000f429ce989766030000000000000000000000000000000000000000050000008000000100000000000005a0"]}

My code output its hash as 000000000000000001d96a9e7c56bc8b95e90e5f964774e2f970d82b31f304b1.

And with this dcrd test case:

func TestBlockHeaderHashing(t *testing.T) {
	dummyHeader := "050000007a97064f8f4c0d0a9bae925482673562d593a4797719ac340000000000000000817374720fb2413085b8a0457ab32b86a91eb73bd57d188d6a7df7f81331b822bed67ee4b32b150c1822b60f62098744777859e667ab79144286a2c0c46faec501008f61701b18be05000600f49f00005961461876a3138e020000001bb50400ef390000268d375cbfc2844a443c050000f429ce989766030000000000000000000000000000000000000000050000008000000100000000000005a0"
	// This hash has reversed endianness compared to what chainhash spits out.
	hashStr := "b104f3312bd870f9e27447965f0ee9958bbc567c9e6ad9010000000000000000"
	hashB, _ := hex.DecodeString(hashStr)
	hash, _ := chainhash.NewHash(hashB)

	vecH, _ := hex.DecodeString(dummyHeader)
	r := bytes.NewReader(vecH)
	var bh BlockHeader
	bh.Deserialize(r)
	hash2 := bh.BlockHash()

	if !hash2.IsEqual(hash) {
		t.Errorf("wrong block hash returned (want %v, got %v)", hash,
			hash2)
	}
}

It passed.

But I got this from dcrd:

2019-01-10 18:21:43.082 [ERR] RPCS: Block submitted via getwork does not meet the required proof of work: block hash of 34eb58651921d521fa12712c5f8e800c155d1a7ba293371c05ef897440b4c671 is higher than expected max of 0000000000000000466159000000000000000000000000000000000000000000

So this is obviously a problem in dcrd and not in my miner.

@dnldd
Member

dnldd commented Jan 15, 2019

@YihaoPeng looking into it, will keep you posted.

@dnldd
Member

dnldd commented Jan 16, 2019

@YihaoPeng build this PR from source (#1567) and run your pool with it; it should resolve the issue.

@davecgh
Member

davecgh commented Jan 18, 2019

Any updates on whether #1567 resolves the issue as expected?

@YihaoPeng
Author

Thank you very much. I will try the patch you provided.

And the following log comes from an unpatched dcrd v1.4.0-rc1 (without #1567); I only added a single log line (YihaoPeng@27705d4):

2019-01-19 05:09:26.000 [INF] RPCS: rpcMerkleRoot: f12e40119b7afe1013e153687c34cca9241b1cfe342537f63c9569106df5ecd1, oldMerkleRoot: f12e40119b7afe1013e153687c34cca9241b1cfe342537f63c9569106df5ecd1, newMerkleRoot: 8ce89c98a7bfc60f4f0f9de3a03cd20fbc67ceaa27118a748f28c64eff8fbcab
2019-01-19 05:09:26.000 [ERR] RPCS: Block submitted via getwork does not meet the required proof of work: block hash of c1f70a5f7a0976f4ebc5e7033f5a8fe61d98f028f344f911c686a6a86278eacb is higher than expected max of 00000000000000005574ff000000000000000000000000000000000000000000

It may be helpful in finding the cause of the bug.

@dajohi
Member

dajohi commented Jan 21, 2019

@YihaoPeng Any rejections with the diff yet?

@davecgh
Member

davecgh commented Jan 26, 2019

Resolved by #1567.

@davecgh davecgh closed this as completed Jan 26, 2019