[indexer] Standalone synthetic ingestion #20270

lxfind · 2024-11-14T18:25:12Z

Description

Decouples ingestion from benchmarking and removes the benchmark-related code from the sui-indexer crate.
After this change, we always run synthetic ingestion first to generate a workload, and then separately run the indexer to benchmark it.
This has a few benefits:

- We no longer need to maintain compatibility between indexer and indexer-alt in terms of benchmark integration. This is good because indexer-alt is a lot easier to benchmark, since it supports stopping at a specific checkpoint.
- It will be easier to use different types of ingestion workloads.

Also cleaned up the ingestion generation code by allowing generation from checkpoint 0, and made sure that every checkpoint has the same number of transactions.
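
As a rough illustration of the layout described above (generation can start at checkpoint 0, and every checkpoint holds the same number of transactions), here is a minimal sketch. `plan_checkpoints` and its parameters are hypothetical names made up for this example; they are not part of sui-synthetic-ingestion, and transactions are represented by plain indices rather than real transaction data.

```rust
/// Hypothetical helper, not the crate's API: assigns transaction indices to
/// checkpoints so that every checkpoint holds exactly `checkpoint_size`
/// transactions, starting at sequence number `starting_checkpoint`.
fn plan_checkpoints(
    starting_checkpoint: u64,
    num_checkpoints: u64,
    checkpoint_size: u64,
) -> Vec<(u64, Vec<u64>)> {
    (0..num_checkpoints)
        .map(|i| {
            // Checkpoint i owns the i-th contiguous block of transactions.
            let first_tx = i * checkpoint_size;
            let txs: Vec<u64> = (first_tx..first_tx + checkpoint_size).collect();
            (starting_checkpoint + i, txs)
        })
        .collect()
}
```

Calling `plan_checkpoints(0, 3, 2)` yields three entries with sequence numbers 0, 1 and 2, each holding two transaction indices.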

Test plan

Updated tests.

Release notes

wlmyng · 2024-11-20T00:44:43Z

crates/sui-synthetic-ingestion/src/synthetic_ingestion.rs

+        fs::remove_file(ingestion_dir.join("1.chk")).await.unwrap();
+        effects.created()[0].0
+    };
+    sim.override_next_checkpoint_number(starting_checkpoint);


ah so this is what will set the next checkpoint file to starting_checkpoint.chk

wlmyng · 2024-11-20T00:45:44Z

crates/sui-types/src/mock_checkpoint_builder.rs

+        println!(
+            "About to build checkpoint, size: {}",
+            self.transactions.len()
+        );


do we want this println still?

gegaowp

overall looks good, left a couple of minor comments.

gegaowp · 2024-11-20T01:14:17Z

crates/sui-synthetic-ingestion/src/synthetic_ingestion.rs

+            let bytes = tokio::fs::read(&path).await.unwrap();
+            let checkpoint_data: CheckpointData = Blob::from_bytes(&bytes).unwrap();
+            if checkpoint_data.transactions.len() != checkpoint_size as usize {
+                for tx in &checkpoint_data.transactions {


why do we want to print here, as there is another assert below?

assert_eq!(checkpoint_data.transactions.len(), checkpoint_size as usize);
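
One way to avoid a separate print before the assert is to fold the diagnostics into the check itself. The sketch below assumes only that the test knows each checkpoint's sequence number and transaction count; `check_checkpoint_size` is a hypothetical name, not something in this PR.

```rust
/// Hypothetical check standing in for the validation loop: reports which
/// checkpoint is off and by how much, in a single step.
fn check_checkpoint_size(
    seq: u64,
    tx_count: usize,
    checkpoint_size: u64,
) -> Result<(), String> {
    if tx_count == checkpoint_size as usize {
        Ok(())
    } else {
        Err(format!(
            "checkpoint {seq}: expected {checkpoint_size} transactions, found {tx_count}"
        ))
    }
}
```

The same effect can be had by passing a message to `assert_eq!`, which is printed only when the assertion fails.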

gegaowp · 2024-11-20T01:32:11Z

crates/sui-synthetic-ingestion/src/synthetic_ingestion.rs

+        // `request_gas` will create a transaction, which we don't want to include in the benchmark.
+        // Put it in a checkpoint and then remove the checkpoint file.
+        sim.create_checkpoint();
+        fs::remove_file(ingestion_dir.join("1.chk")).await.unwrap();


Is remove_file necessary? std::fs::write(path.join(file_name), blob.to_bytes())?; will overwrite 0.chk and 1.chk after override_next_checkpoint_number.

If the test range covers cp 0 and cp 1, they will be overwritten.

Otherwise these 2 files are not used?

I was hoping to have a mode where, when you run the benchmark from an ingestion dir, it automatically scans through all files in it.
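
To make the trade-off in this thread concrete: `std::fs::write` replaces the contents of an existing file, so a leftover 1.chk disappears only if the generated range includes checkpoint 1. With a range that starts later, the stale file stays in the directory, which matters for a mode that scans every file. The sketch below uses only the standard library; `write_checkpoints` and the file contents are hypothetical stand-ins for the real generator.

```rust
use std::fs;
use std::io;
use std::ops::Range;
use std::path::Path;

/// Hypothetical stand-in for the generator: leaves a stale `1.chk` in `dir`,
/// writes one file per checkpoint in `range`, and returns the sorted file
/// names found in `dir` afterwards.
fn write_checkpoints(dir: &Path, range: Range<u64>) -> io::Result<Vec<String>> {
    fs::create_dir_all(dir)?;
    // Simulates the checkpoint produced by the gas request.
    fs::write(dir.join("1.chk"), b"stale")?;
    for seq in range {
        // fs::write creates the file, or replaces its contents if it exists.
        fs::write(dir.join(format!("{seq}.chk")), format!("checkpoint {seq}"))?;
    }
    let mut names = fs::read_dir(dir)?
        .map(|entry| entry.map(|e| e.file_name().to_string_lossy().into_owned()))
        .collect::<io::Result<Vec<String>>>()?;
    names.sort();
    Ok(names)
}
```

With a range of 0..3 the stale file is overwritten; with 10..12 it survives next to 10.chk and 11.chk, so a directory scan would pick it up unless it is removed explicitly.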

lxfind requested review from wlmyng, gegaowp and amnn November 14, 2024 18:30

lxfind force-pushed the standalone-synthetic-ingestion branch from 9be316b to 895565e Compare November 15, 2024 18:38

[indexer] Standalone synthetic ingestion

a93b934

lxfind force-pushed the standalone-synthetic-ingestion branch from 895565e to a93b934 Compare November 19, 2024 23:46

wlmyng approved these changes Nov 20, 2024

gegaowp approved these changes Nov 20, 2024

feedback

49593b8

lxfind enabled auto-merge (squash) November 20, 2024 06:24

lxfind merged commit 0241db6 into main Nov 20, 2024
52 checks passed

lxfind deleted the standalone-synthetic-ingestion branch November 20, 2024 06:52
