Skip to content
This repository has been archived by the owner on Feb 26, 2024. It is now read-only.

Truffle doesn't seem to detect mined transactions if insta-mining #624

Open
1 task
karalabe opened this issue Oct 17, 2017 · 25 comments
Open
1 task

Truffle doesn't seem to detect mined transactions if insta-mining #624

karalabe opened this issue Oct 17, 2017 · 25 comments

Comments

@karalabe
Copy link

karalabe commented Oct 17, 2017


Issue

We're experimenting with a new developer mode for go-ethereum (i.e. geth --dev), where block times are 0 (Clique PoA chain) and any transaction that enters into the pool gets mined as soon as possible into a block (300 microsecond order of magnitude).

When running truffle test, I've noticed that a single transaction is submitted, geth mines it instantly, but tuffle keeps waiting patiently for it to get mined. Since the "insta-miner" doesn't produce empty blocks, it refuses to mine anything until the next transaction arrives. However, truffle doesn't notice that the transaction was already mined.

I've tried to verify whether the issue is in truffle or not. When truffle "hangs" waiting for the transaction to mine, I can submit a random transaction into the node form the outside. When my own transaction ends up in a block, truffle immediately notices the transaction it sent in the previous block and continues.

Steps to Reproduce

Expected Behavior

I would expect truffle to properly detect when a transaction is mined and continue, even if it takes a fraction of a millisecond between submit and block creation.

Actual Results

My hunch is that truffle submits a transaction, and afterwards starts listening for a block to see if it was mined. But geth's new mining mode will produce the block faster, before truffle actually listens for new blocks. As such, this seems a data race between truffle's submit and "await" functionality.

Environment

  • Operating System: Linux
  • Truffle version: Truffle v3.4.11 (core: 3.4.11)
  • Ethereum client: Geth
  • node version: v8.6.0
  • npm version: 5.3.0

Ping @Arachnid, I think you saw a similar issue on your repo too.

@karalabe
Copy link
Author

Small update. If I inject a 100ms delay between receiving the transaction and mining it, truffle does correctly detect the block and continue, so this definitely seems to be a data race within truffle.

@gnidan
Copy link
Contributor

gnidan commented Oct 17, 2017

Thanks for reporting this! Sounds like this will require truffle-contract to be reworked slightly to account for this race condition. We will have to triage the work involved in fixing this, or PRs welcome!

@izqui
Copy link

izqui commented Jan 17, 2018

What is the status of this? We are trying to run our test suite on real nodes prior to launch and this issue is blocking us.

@cgewecke cgewecke self-assigned this Jan 17, 2018
@cgewecke
Copy link
Contributor

@karalabe @izqui

@gnidan and I looked at this issue this morning and were wondering if either of you could analyze it further with us. The relevant code at truffle is here. It's a callback passed to web3's myContract.myMethod.sendTransaction. @karalabe is correct that it patiently waits for the transaction to be mined, but shouldn't this callback be fired as soon as it's mined?

Have written a small node script for web3 that reproduces the Truffle sequence and am wondering if someone can run it against geth insta-mine to see if the issue is here or elsewhere? Or arrive at a clearer understanding of it anyway.... Thanks! Apologies for the delay addressing this.

#!/usr/bin/env node
const solc = require('solc');
const Web3 = require('web3');
const web3 = new Web3();

web3.setProvider(new web3.providers.HttpProvider('http://localhost:8545'));

// Compile
const source = `
  pragma solidity ^0.4.18;
  contract Simple {
    uint x;

    function set(uint val) public {
      x = val;
    }
  }`;

const input = {
  language: "Solidity",
  sources: { "Simple.sol": { content: source } },
  settings: { outputSelection: { "*": { "*": ["abi", "evm.bytecode.object"] } } }
};

let output = solc.compileStandard(JSON.stringify(input));
output = JSON.parse(output, null, ' ');
const abi = output.contracts["Simple.sol"]["Simple"].abi;
const bytecode = "0x" + output.contracts["Simple.sol"]["Simple"].evm.bytecode.object;

// Deploy
const Simple = web3.eth.contract(abi);
const deployment = Simple.new({from: web3.eth.coinbase, gas: 1000000, data: bytecode});
const deploymentReceipt = web3.eth.getTransactionReceipt(deployment.transactionHash);

// Set a value with sendTransaction
const instance = Simple.at(deploymentReceipt.contractAddress);

/*
  truffleMethod.set(5) is like the sequence below. On ganache which also instamines,
  the tx hash is immediately returned in the callback and it's possible to fetch it with
  getTransactionReceipt. On a real client Truffle retries the getTransactionReceipt step
  once a second for ~4 minutes (unless configured otherwise). The hang must come from
  that. On geth instamine, when is this callback fired?
 */
instance.set.sendTransaction(5, {from: web3.eth.coinbase, gas: 90000}, function(err, tx){
  const receipt = web3.eth.getTransactionReceipt(tx); 
})

@cpurta
Copy link

cpurta commented Jan 19, 2018

@cgewecke I just ran that script against a geth node running the private dev chain. Geth is correctly mining and sealing blocks as it should and that script ran without a problem. Using geth --dev --rpc to start the client.

@cgewecke
Copy link
Contributor

@cpurta Thank you that's helpful. Will be looking into this further. . . .

@cgewecke
Copy link
Contributor

@karalabe @cpurta

It looks possible that @karalabe's initial hunch about the cause of this problem is correct but the filter is being set at web3 here and here on contract creation. When I comment out the sendTransaction lines in the script above and modify it to target the new callbacks, e.g

// Deploy
const Simple = web3.eth.contract(abi);

Simple.new({from: web3.eth.coinbase, gas: 1000000, data: bytecode}, (err, instance) => {
  if (err) throw new Error(err);
  if (instance && !instance.address){
    console.log('first cb fires: txHash -->' + instance.transactionHash);
  } else {
    console.log('after filtering for `latest`: contract address --> ' + instance.address);
  }
});

geth hangs at the first log line. ganache-cli fires the callback twice as expected although not sure if this is the luck of latency or something else.

This shouldn't be an issue in web3 1.0 which uses EventEmitter instead of a double callback. Web3 1.0 is in Truffle's sights although it will likely come as a breaking release (possibly in the spring).

@karalabe Do you have any views about the correctness of the diagnosis above or how to proceed here? Some ideas:

  • geth introduces optional latency to be backward compatible with original web3
  • web3 inverts it's order, setting up the filter and running the tx within that callback.

Open to anything.

@cgewecke
Copy link
Contributor

NB: Merged PR from July 2017 fixing web3 1.0 to work with parity client's instant seal.

@cgewecke
Copy link
Contributor

@cpurta Revisted this and found the following: Using Truffle's example MetaCoin project, if I run geth with the dev.period option enabled:

geth --dev --dev.period 1 --rpc --targetgaslimit 0x6ACFC0 // <-- 7000000 gas

and a Truffle network defined as:

networks: {
    geth: {
      host: "127.0.0.1",
      port: 8545,
      network_id: "*",
      gas: 6300000,
    }
  }

then both truffle migrate --network geth and truffle test --network geth succeed.

Does enabling the dev.period flag present problems for your use case? Otherwise it seems like we might have overlooked something simple here.

@cpurta
Copy link

cpurta commented Jan 26, 2018

@cgewecke I just ran some tests against a geth dev node using the --dev.period flag and migration and tests worked perfectly. I was not aware of that flag in geth (guess I should look a bit more closely at the usage 😅). I think that resolves this issue then.

I am finding that I am running into the same issue using parity. Will be looking into if the same option applies there with their instant seal dev chain. If not it looks like that is something that parity can add before web3 1.0 is integrated with truffle.

Thank you so much for looking into this.

@cgewecke
Copy link
Contributor

cgewecke commented Jan 26, 2018

@cpurta Great! No problem at all, really enjoyed it - geth --dev is fast. Going to close here and track the Parity issue at #389.

[EDIT] Actually I take that back - it was fast running the metacoin tests, but I looked at Zeppelin and tests are failing (not sure why) / it seems to run at quarter speed as well.

@karalabe
Copy link
Author

Hey all, I'm not sure why this was closed. The issue still persists, blocks take 1 second to be detected by Truffle after they are mined by Geth.

@cgewecke cgewecke reopened this Aug 24, 2018
@cgewecke
Copy link
Contributor

Hi @karalabe apologies -

We've started running part of truffle's own test suite in CI with geth using --dev --dev.period 1 and the next version of Truffle (truffle@beta) should be able to run without the period flag because the underlying issue has been resolved in web3 1.0.

Will update our test scripts to verify that is case.

@cgewecke
Copy link
Contributor

@karalabe Small update here . . . have opened a PR to run truffle's integration and contract wrapper tests with insta-mine. With an http provider It works (and it's fast).

However, if web3 connects to the client over websockets, there are occasional non-deterministic failures I can't isolate the cause of. Do you have any advice about this? Do you believe the issue is on the web/truffle side or are there known problems with the websockets at Geth?

@matthieu-merlyn
Copy link

@karalabe @cgewecke @gnidan

Any updates on this issue? Thanks.

@stale
Copy link

stale bot commented Jan 4, 2019

Thank you for raising this issue! It has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you would like to keep this issue open, please respond with information about the current state of this problem.

@stale stale bot added the stale label Jan 4, 2019
@Arachnid
Copy link

Arachnid commented Jan 5, 2019

This isn't @Stale.

@stale
Copy link

stale bot commented Jan 5, 2019

Thanks for your response! This issue is no longer considered stale and someone from the Truffle team will try to respond as soon as they can.

@stale stale bot removed the stale label Jan 5, 2019
@gnidan
Copy link
Contributor

gnidan commented Jan 16, 2019

@Arachnid or anyone: is this still a problem on Truffle v5? We can look into this to see about resolving the race condition.

@Arachnid
Copy link

I'm not certain; if you're unsure if you've fixed it you can simply run any test suite against a geth instance with instant-mining enabled.

@onetom
Copy link

onetom commented Jan 18, 2019

I'm experiencing inexplicable delays talking to a geth --dev even when I'm interfacing with it via web3.js or ethers.io directly.
It just hangs for seconds...
I'm also seeing 😱 block lost entries in the geth log, which I haven't found a lot of info about, but as I understand it should only happen on a multi-node PoA chain, not on a solo geth --dev chain.

@stale
Copy link

stale bot commented May 18, 2019

Thank you for raising this issue! It has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you would like to keep this issue open, please respond with information about the current state of this problem.

@stale stale bot added the stale label May 18, 2019
@stale
Copy link

stale bot commented Jun 1, 2019

There has been no new activity on this issue since it was marked as stale 7 days ago, so it is being automatically closed. If you'd like help with this or a different problem, please open a new issue. Thanks!

@stale stale bot closed this as completed Jun 1, 2019
@gnidan gnidan reopened this Jun 1, 2019
@stale
Copy link

stale bot commented Jun 1, 2019

Thanks for your response! This issue is no longer considered stale and someone from the Truffle team will try to respond as soon as they can.

@stale stale bot removed the stale label Jun 1, 2019
@tinchoabbate
Copy link

Just ran into this issue, and still persists. My environment:

Truffle:

$ npx truffle version
Truffle v5.1.3 (core: 5.1.3)
Solidity - 0.5.7 (solc-js)
Node v11.8.0
Web3.js v1.2.2

Truffle config

module.exports = {
    networks: {
        geth: {
            host: "localhost",
            port: 8543,
            network_id: "*",
            gas: 6283185,
            websockets: true
        }
    },
    compilers: {
      solc: {
        version: "0.5.7",
      }
    }
  }

Geth in Docker:

Geth
Version: 1.9.9-stable
Git Commit: 017449971e1e9e220efcd97d3313a0e27f47003b
Git Commit Date: 20191206
Architecture: amd64
Protocol Versions: [64 63]
Go Version: go1.13.5
Operating System: linux
GOPATH=
GOROOT=/usr/local/go

I'm connecting to the Geth running in the Docker container over websockets. Setting --dev.period 1 solves the issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants