
Blocks sometimes are with Synthetic transactions missing. #1599

Closed
AlfredoG87 opened this issue Aug 3, 2023 · 4 comments · Fixed by #1729
Labels: bug (Something isn't working), P2

Comments

@AlfredoG87 (Collaborator)

Description

While running some tests with a long-standing The Graph node against mainnet, I decided to index an HTS token with fairly high activity, so I chose the SAUCER token of SaucerSwap.

Address: 0x00000000000000000000000000000000000b2ad5
startBlock: 38629934

I chose that starting block for the indexing since it was the first interaction found on Hashscan for that token. See here.

Indexing went well until it suddenly stopped at block number 0x30e597a (hash 0xe0d8abbb5811629eac5c1f549b8013394a73c042f752bcf76f277f5b88a5b30c).

When checking the status of the subgraph, I found a fatalError with a message like this:

(Note: I did not copy the error from the original block, but many more blocks later showed the same issue, so I started copying some for reference.)

{
    "message": "failed to process trigger: block #51390342 (0x018f…6c4b), transaction a179e7dd235f25e3df57a6c9346a1b1d806f87a546bec51610d2836aeb954ad2: Found no transaction for event",
    "block": {
        "number": "51390342",
        "hash": "0x018f787384b920cadcd35800363c045898ecad63bacb952b531b367d846c6c4b"
    },
    "handler": null
}

To my surprise, when I fetched the block using Postman I did not encounter the issue: the (synthetic) transaction in question was present among the transactions for that block.
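The manual Postman check can be sketched as a small script. This is illustrative only (the helper names are mine, not part of graph-node or the relay); it assumes a standard `eth_getBlockByHash` JSON-RPC endpoint such as the https://mainnet.hashio.io/api URL that appears in the graphman log below:

```python
# Sketch of the manual verification done in Postman: fetch the block with full
# transaction objects, then check whether the transaction the indexer failed on
# is actually present. The membership check is a pure function so it can be
# exercised without network access.
import json
import urllib.request

RPC_URL = "https://mainnet.hashio.io/api"  # endpoint shown in the graphman log

def block_contains_tx(block: dict, tx_hash: str) -> bool:
    """True if tx_hash appears in the block's transaction list.

    Handles both shapes eth_getBlockBy* can return: full transaction
    objects (dicts) or bare transaction hashes (strings).
    """
    for tx in block.get("transactions") or []:
        h = tx.get("hash", "") if isinstance(tx, dict) else tx
        if h.lower() == tx_hash.lower():
            return True
    return False

def fetch_block_by_hash(block_hash: str) -> dict:
    """Fetch a block via eth_getBlockByHash with full transaction objects."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getBlockByHash",
        "params": [block_hash, True],  # True = include full transaction objects
    }
    req = urllib.request.Request(
        RPC_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["result"]
```

In my case, `block_contains_tx` on the freshly fetched block returned True for the very transaction graph-node reported as missing, which is what pointed at the cached copy of the block rather than the network itself.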

I then found that the cached copy of a given block (by hash or number) can be checked and invalidated using graphman, a troubleshooting tool for graph-node. The tool resides on the same instance where graph-node is running, so a remote exec was needed, and the command looks like this:

graphman --config config.toml chain check-blocks mainnet by-number 51442539

or by hash:

graphman --config config.toml chain check-blocks mainnet by-hash 0xe0d8abbb5811629eac5c1f549b8013394a73c042f752bcf76f277f5b88a5b30c

If inconsistencies are found, a message like the one below appears and the cached block is deleted, forcing graph-node to refetch it; this resolves the stall and lets the subgraph continue indexing:

root@graph-node-85b68c875c-lhm4n:/home# graphman --config config.toml chain check-blocks mainnet by-number 51453737
Aug 01 21:44:54.702 INFO Graph Node version: 0.28.2 (2022-10-11)
Aug 01 21:44:54.703 INFO Reading configuration file `config.toml`
Aug 01 21:44:54.711 INFO Creating transport, capabilities: archive, traces, url: https://mainnet.hashio.io/api, provider: mainnet
Aug 01 21:44:54.872 INFO Connecting to Postgres, weight: 1, conn_pool_size: 3, url: postgresql://graph-node:HIDDEN_PASSWORD@postgres:5432/graph-node, pool: main, shard: primary
Aug 01 21:44:54.881 INFO Pool successfully connected to Postgres, pool: main, shard: primary, component: Store
Aug 01 21:44:54.882 DEBG Using postgres host order [Main], shard: primary, component: Store
block 0xc95a…7c6a diverges from cache:
 {
   transactions: [
+    {
+      blockHash: "0xc95a992922553b0cccf92ee677426442f9a2410258eb8e67c41d86be5bee7c6a"
+      blockNumber: "0x3111f29"
+      from: "0x00000000000000000000000000000000000b2ad5"
+      gas: "0x61a80"
+      gasPrice: "0xfe"
+      hash: "0x1770c0cc82f54b5e400cf4af296ad079876a18ea28850b4b56dc712f75bdef44"
+      input: "0x0000000000000000"
+      maxFeePerGas: "0x0"
+      maxPriorityFeePerGas: "0x0"
+      nonce: "0x0"
+      r: "0x0"
+      s: "0x0"
+      to: "0x00000000000000000000000000000000000b2ad5"
+      transactionIndex: "0x16f"
+      type: "0x2"
+      v: "0x0"
+      value: "0x1234"
+    }
   ]
-  transactionsRoot: "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421"
+  transactionsRoot: "0xc95a992922553b0cccf92ee677426442f9a2410258eb8e67c41d86be5bee7c6a"
 }

Deleting block 0xc95a…7c6a from cache.
Done.

After a few minutes, the indexing status of the subgraph shows that it is progressing again and no longer stuck.

This happened around 20 times between blocks 38629934 and 51532212 (start to end).

I documented the following blocks as needing a cache check, though there were a few more:

51442539
51434203
51403767
51400371
51276784
51278825
51300058
51349773
51370110
51375513
51390342
51393614

What I think is happening:

1. graph-node requests logs for a range of blocks for the given address.
2. Logs are returned with a reference to the given block.
3. The block is fetched using eth_getBlockByHash or eth_getBlockByNumber, and the expected transaction is not found there, hence the failure.
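The failure mode above can be expressed as a small consistency check. This is a sketch of the invariant graph-node is effectively enforcing, not its actual code: every `transactionHash` referenced by the returned logs must appear in the block's transaction list, and any miss corresponds to the "Found no transaction for event" error.

```python
def missing_log_txs(logs: list, block: dict) -> list:
    """Return transactionHash values referenced by logs but absent from the block.

    `logs` is the result of an eth_getLogs-style query; `block` is an
    eth_getBlockByNumber/Hash result with full transaction objects (or bare
    hashes). A non-empty result corresponds to graph-node's
    "Found no transaction for event" failure.
    """
    block_txs = {
        (tx["hash"] if isinstance(tx, dict) else tx).lower()
        for tx in block.get("transactions") or []
    }
    return [
        log["transactionHash"]
        for log in logs
        if log["transactionHash"].lower() not in block_txs
    ]
```

Running this against the cached copy of an affected block would report the synthetic transaction as missing, while running it against a freshly fetched copy would return an empty list.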

Steps to reproduce

  1. Deploy a graph-node connected to RPC on Hedera Mainnet
  2. Deploy a sub-graph with the above parameters
  3. The issue may appear on random blocks

Additional context

No response

Hedera network

mainnet

Version

v0.28.0

Operating system

Other

@AlfredoG87 AlfredoG87 added the bug Something isn't working label Aug 3, 2023
@AlfredoG87 AlfredoG87 added the P2 label Aug 3, 2023
@AlfredoG87 AlfredoG87 changed the title Blocks Synthetic transactions sometimes are missing. Blocks sometimes are with Synthetic transactions missing. Aug 3, 2023
@AlfredoG87 (Collaborator, Author)

I also had to manually check / re-fetch the following blocks today:

51714199
51717311
51719847
51722690
51723810
51733040
51736127
51742565
51743987
51746654
51747696
51749251
51749386
51758028
51778725
51787727
51790240
51794159

@AlfredoG87 (Collaborator, Author)

@georgi-l95 any ideas on this issue?

@georgi-l95 (Collaborator)

Is it possible that, because every Hashio instance has its own cache, graph-node gets the block via getBlockByNumber/Hash from one instance that has the synthetic transaction in its cache, but a later request (e.g. for the receipt) is routed to another Hashio instance whose cache does not have that synthetic transaction?
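The hypothesis can be modeled with a toy example (entirely hypothetical names and data, for illustration only): two relay instances cache divergent copies of the same block hash, so whether a lookup succeeds depends on which instance the load balancer happens to route the request to.

```python
# Toy model of the suspected failure: each relay instance keeps its own block
# cache, and consecutive requests may land on different instances.

full_block = {"transactions": [{"hash": "0xsynthetic1"}]}  # synthetic tx present
stale_block = {"transactions": []}                         # synthetic tx missing

# Two instances caching divergent copies of the same (toy) block hash.
instance_a = {"0xblock1": full_block}
instance_b = {"0xblock1": stale_block}

def tx_in_cached_block(instance: dict, block_hash: str, tx_hash: str) -> bool:
    """Check whether this instance's cached copy of the block holds tx_hash."""
    block = instance.get(block_hash, {"transactions": []})
    return any(tx["hash"] == tx_hash for tx in block["transactions"])

# A request routed to instance A finds the transaction; the same request
# routed to instance B does not, which matches the intermittent
# "Found no transaction for event" failures described above.
```

If this is the cause, deleting the cached block via graphman works because the refetch pulls a fresh, consistent copy, which also explains why the problem appears only on random blocks.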

@AlfredoG87 (Collaborator, Author) commented Aug 10, 2023

@georgi-l95, that is exactly my hypothesis as well, since it explains why the problem resolves itself once the block data is re-fetched.

I am running some local tests to validate the theory.

Also, I opened another ticket (#1627) to keep track of a related problem, which I believe is due to the same root cause: the lack of a distributed shared cache.

@Nana-EC Nana-EC moved this to Sprint BackLog in Smart Contract Sprint Board Aug 21, 2023
@AlfredoG87 AlfredoG87 moved this from Sprint BackLog to In Progress in Smart Contract Sprint Board Sep 6, 2023
@AlfredoG87 AlfredoG87 moved this from In Progress to In Review in Smart Contract Sprint Board Sep 9, 2023
@Nana-EC Nana-EC added this to the 0.32.0 milestone Sep 13, 2023
@github-project-automation github-project-automation bot moved this from In Review to Done in Smart Contract Sprint Board Sep 14, 2023