Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Petera/handle resourceexhaused ingestion #686

Merged
merged 6 commits into from
Jan 13, 2025

Conversation

peterargue
Copy link
Contributor

@peterargue peterargue commented Nov 28, 2024

Closes: #???

Description


For contributor use:

  • Targeted PR against master branch
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the standards mentioned here.
  • Updated relevant documentation
  • Re-reviewed Files changed in the Github PR explorer
  • Added appropriate labels

Summary by CodeRabbit

  • New Features

    • Added retry mechanism for gRPC client handling resource exhaustion errors.
    • Introduced methods to stop database and client connections gracefully.
  • Bug Fixes

    • Enhanced error handling for database initialization and storage setup.
  • Chores

    • Improved logging for engine startup process.
    • Added configuration constants for message size and retry delays.
  • Tests

    • Added comprehensive test coverage for retry interceptor functionality, simulating various scenarios.

Copy link
Contributor

coderabbitai bot commented Nov 28, 2024

Walkthrough

The changes in the bootstrap package focus on enhancing gRPC client configuration and error handling. New constants are introduced to define message size and retry parameters for resource-exhausted scenarios. A retry interceptor is implemented to manage gRPC client interactions, with improved error handling and logging. The Bootstrap struct gains new methods for stopping database and client connections, and the setupStorage function receives updates to better handle database initialization and cadence height management.

Changes

File Changes
bootstrap/bootstrap.go - Added constants for message size and retry delays
- Implemented retryInterceptor function
- Updated setupCrossSporkClient with retry configuration
- Enhanced setupStorage error handling
- Added StopDB() and StopClient() methods to Bootstrap struct
bootstrap/bootstrap_test.go - Added TestRetryInterceptor to validate retry interceptor behavior

Sequence Diagram

sequenceDiagram
    participant Client
    participant RetryInterceptor
    participant gRPCService
    
    Client->>RetryInterceptor: Make gRPC Request
    RetryInterceptor->>gRPCService: Initial Request
    alt Request Fails
        RetryInterceptor->>RetryInterceptor: Wait and Retry
        RetryInterceptor->>gRPCService: Retry Request
    end
    alt Max Retry Time Exceeded
        RetryInterceptor-->>Client: Return Error
    else Request Succeeds
        gRPCService-->>Client: Return Response
    end
Loading

Poem

🐰 In the realm of bootstrap's grace,
Retry magic leaves no trace
Clients dance with gentle might
Errors fade, connections bright
Code hops forward, smooth and clean! 🚀


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@peterargue peterargue changed the base branch from main to feature/local-tx-reexecution November 28, 2024 17:27
Copy link
Collaborator

@m-Peter m-Peter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 🚀

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments.

🛑 Comments failed to post (63)
services/requester/utils.go (1)

11-12: ⚠️ Potential issue

Add input validation and error handling.

The function should validate its input parameters and handle potential errors.

Consider this improvement:

-func replaceAddresses(script []byte, chainID flow.ChainID) []byte {
+func replaceAddresses(script []byte, chainID flow.ChainID) ([]byte, error) {
+    if len(script) == 0 {
+        return nil, fmt.Errorf("empty script")
+    }

Committable suggestion skipped: line range outside the PR's diff.

tests/web3js/eth_rate_limit_test.js (2)

5-6: ⚠️ Potential issue

Increase the process exit timeout

The current 5-second timeout might be insufficient for processing 1000 requests plus the 1-second delay between batches. Consider increasing it to at least 15 seconds to prevent premature exits.

-    setTimeout(() => process.exit(0), 5000) // make sure the process exits
+    setTimeout(() => process.exit(0), 15000) // make sure the process exits
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    this.timeout(0)
    setTimeout(() => process.exit(0), 15000) // make sure the process exits

28-29: 🛠️ Refactor suggestion

Make assertions more resilient

The strict equality assertions might make the test flaky due to timing and network conditions. Consider using range assertions instead.

-    assert.equal(requestsMade, requestLimit, 'more requests made than the limit')
-    assert.equal(requestsFailed, requests - requestLimit, 'failed requests don\'t match expected value')
+    assert.isAtMost(requestsMade, requestLimit, 'more requests made than the limit')
+    assert.approximately(requestsFailed, requests - requestLimit, 5, 'failed requests significantly deviate from expected value')
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    assert.isAtMost(requestsMade, requestLimit, 'more requests made than the limit')
    assert.approximately(requestsFailed, requests - requestLimit, 5, 'failed requests significantly deviate from expected value')
storage/pebble/db.go (1)

12-13: 🛠️ Refactor suggestion

Consider making cache size configurable

The hardcoded 1MB cache size (1 << 20) seems quite small for a production database. A small cache could lead to increased disk I/O and potential performance issues under load.

Consider:

  1. Making the cache size configurable through environment variables or configuration
  2. Setting a more reasonable default size based on system resources
  3. Adding documentation about cache size implications
+// DefaultCacheSize is the default size of the block cache (64MB)
+const DefaultCacheSize = 64 << 20
+
 // OpenDB opens a new pebble database at the provided directory.
-func OpenDB(dir string) (*pebble.DB, error) {
+func OpenDB(dir string, opts ...Option) (*pebble.DB, error) {
-	cache := pebble.NewCache(1 << 20)
+	cache := pebble.NewCache(DefaultCacheSize)

Committable suggestion skipped: line range outside the PR's diff.

tests/web3js/eth_get_storage_at_test.js (1)

38-50: 🛠️ Refactor suggestion

⚠️ Potential issue

Fix variable declaration and consider refactoring repeated code.

  1. The variable newValue on line 39 is missing let or const declaration, which could lead to unexpected behavior
  2. Consider extracting the contract value update logic into a helper function since it's used multiple times
- newValue = 100
+ const newValue = 100

Consider adding a helper function:

async function updateContractValue(contract, address, value) {
    const updateData = contract.methods.store(value).encodeABI()
    const res = await helpers.signAndSend({
        from: conf.eoa.address,
        to: address,
        data: updateData,
        value: '0',
        gasPrice: conf.minGasPrice,
    })
    assert.equal(res.receipt.status, conf.successStatus)
    return res
}
tests/web3js/eth_revert_reason_test.js (2)

80-93: 🛠️ Refactor suggestion

Refactor duplicated polling logic into a helper function

This polling logic is duplicated from the previous implementation. To improve maintainability and reduce code duplication, consider extracting it into a reusable helper function.

Create a helper function in helpers.js:

// Add to helpers.js
async function waitForTransactionReceipt(txHash, maxRetries = 30, interval = 1000) {
    let receipt = null;
    let attempts = 0;
    
    while (receipt == null && attempts < maxRetries) {
        const response = await callRPCMethod(
            'eth_getTransactionReceipt',
            [txHash]
        );
        receipt = response.body.result;
        
        if (receipt == null) {
            attempts++;
            await new Promise(resolve => setTimeout(resolve, interval));
        }
    }
    
    if (receipt == null) {
        throw new Error('Transaction receipt not found after maximum retries');
    }
    
    return { body: { result: receipt } };
}

Then simplify both polling sections to:

-   let rcp = null
-   while (rcp == null) {
-       rcp = await helpers.callRPCMethod(
-           'eth_getTransactionReceipt',
-           [txHash]
-       )
-       if (rcp.body.result == null) {
-           rcp = null
-       }
-   }
+   let rcp = await helpers.waitForTransactionReceipt(txHash)

45-56: ⚠️ Potential issue

Add timeout and delay to prevent resource exhaustion

The polling mechanism could potentially lead to resource exhaustion issues:

  1. No timeout mechanism could result in infinite loops
  2. Continuous polling without delays could overwhelm the node

Consider implementing this improved version:

    let rcp = null
+   const MAX_RETRIES = 30
+   const POLLING_INTERVAL = 1000 // ms
+   let attempts = 0
    // wait until the transaction is executed & indexed, and its
    // receipt becomes available.
-   while (rcp == null) {
+   while (rcp == null && attempts < MAX_RETRIES) {
        rcp = await helpers.callRPCMethod(
            'eth_getTransactionReceipt',
            [txHash]
        )
        if (rcp.body.result == null) {
            rcp = null
+           attempts++
+           await new Promise(resolve => setTimeout(resolve, POLLING_INTERVAL))
        }
    }
+   if (rcp == null) {
+       throw new Error('Transaction receipt not found after maximum retries')
+   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    let rcp = null
    const MAX_RETRIES = 30
    const POLLING_INTERVAL = 1000 // ms
    let attempts = 0
    // wait until the transaction is executed & indexed, and its
    // receipt becomes available.
    while (rcp == null && attempts < MAX_RETRIES) {
        rcp = await helpers.callRPCMethod(
            'eth_getTransactionReceipt',
            [txHash]
        )
        if (rcp.body.result == null) {
            rcp = null
            attempts++
            await new Promise(resolve => setTimeout(resolve, POLLING_INTERVAL))
        }
    }
    if (rcp == null) {
        throw new Error('Transaction receipt not found after maximum retries')
    }
storage/pebble/register_storage_test.go (1)

93-160: 🛠️ Refactor suggestion

Add test cases for error handling in storage getter.

The current tests don't verify the behavior when the storage getter function returns an error.

Consider adding a test case like this:

runDB("should handle storage getter errors", t, func(t *testing.T, db *Storage) {
    t.Parallel()

    storageGet := func(id flow.RegisterID, height uint64) (flow.RegisterValue, error) {
        return nil, fmt.Errorf("storage error")
    }

    s := NewStorageSnapshot(storageGet, 0)

    v, err := s.GetValue(owner, key)
    require.Error(t, err)
    require.Empty(t, v)

    // Verify the error is not cached
    count := 0
    storageGet = func(id flow.RegisterID, height uint64) (flow.RegisterValue, error) {
        count++
        return []byte{1}, nil
    }
    s = NewStorageSnapshot(storageGet, 0)
    v, err = s.GetValue(owner, key)
    require.NoError(t, err)
    require.Equal(t, []byte{1}, v)
    require.Equal(t, 1, count)
})
tests/integration_test.go (1)

63-74: 🛠️ Refactor suggestion

Consider adding resource exhaustion test scenarios

Given that this PR focuses on handling resource exhaustion during ingestion, consider enhancing this test configuration to:

  1. Test with non-zero gas prices to simulate real-world conditions
  2. Add test cases for resource exhaustion scenarios (e.g., memory limits, transaction queue limits)
  3. Verify proper error handling when resources are exhausted
go.mod (1)

198-202: 💡 Codebase verification

Security packages need version updates

The current versions in go.mod are not the latest available:

  • golang.org/x/crypto v0.28.0 → v0.29.0 available
  • golang.org/x/sys v0.26.0 → v0.27.0 available
  • golang.org/x/text v0.19.0 → v0.20.0 available

Consider updating to the latest stable versions for improved security and bug fixes.

🔗 Analysis chain

Security-related package updates

The updates to core security packages (crypto, sys, text) are good security practices. However, let's verify if these versions are the latest stable releases.


Since we can't use go commands in the sandbox environment, let's try to verify the versions against the official Go proxy index:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if we're using the latest stable versions of security-critical packages
echo "Checking latest versions of security packages:"
go list -m -versions golang.org/x/crypto | tail -n 1
go list -m -versions golang.org/x/sys | tail -n 1
go list -m -versions golang.org/x/text | tail -n 1

Length of output: 541


Script:

#!/bin/bash
# Query the Go proxy index for latest versions
echo "Checking golang.org/x/crypto versions:"
curl -s "https://proxy.golang.org/golang.org/x/crypto/@v/list" | sort -V | tail -n 5

echo -e "\nChecking golang.org/x/sys versions:"
curl -s "https://proxy.golang.org/golang.org/x/sys/@v/list" | sort -V | tail -n 5

echo -e "\nChecking golang.org/x/text versions:"
curl -s "https://proxy.golang.org/golang.org/x/text/@v/list" | sort -V | tail -n 5

Length of output: 632

tests/helpers.go (1)

141-158: 💡 Codebase verification

Rate limit increase requires careful consideration

The rate limit change from 50 to 500 requests/second in the test configuration requires attention:

  • The default production rate limit is 50 req/s (from cmd/run/cmd.go)
  • The README documents the default as 50 req/s for client requests
  • The test eth_rate_limit_test.js specifically tests rate limiting behavior
  • The change represents a 10x increase which could mask potential rate-limiting issues

Consider:

  • Keeping the test rate limit aligned with production defaults (50)
  • Or documenting why tests need the increased limit
  • Ensuring rate limit tests still effectively validate the rate limiting behavior
🔗 Analysis chain

LGTM! Verify rate limit increase impact

The configuration changes look good, particularly the increased rate limit which should help handle resource exhaustion. The use of t.TempDir() for the database directory is a good testing practice.

Let's verify the rate limit change impact:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any rate limit related configurations or constants in the codebase
# to ensure consistency and identify potential impacts

rg -g '!{vendor,node_modules}/**' -A 3 'RateLimit.*=.*[0-9]+'

Length of output: 63


Script:

#!/bin/bash
# Let's try a broader search for rate limit related code
rg -i "ratelimit|rate.*limit" -g '!{vendor,node_modules}/**'

# Also search for the specific config struct field
ast-grep --pattern 'RateLimit $$$'

Length of output: 5800

storage/pebble/storage.go (2)

19-23: ⚠️ Potential issue

Add nil check for db in the New function

The New function does not check if the provided db is nil. If db is nil, it could lead to nil pointer dereferences in methods that use s.db. Please add a nil check to prevent potential runtime errors.

Apply this diff to add the nil check:

func New(db *pebble.DB, log zerolog.Logger) *Storage {
+	if db == nil {
+		return nil
+	}
	return &Storage{
		db:  db,
		log: log,
	}
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

func New(db *pebble.DB, log zerolog.Logger) *Storage {
	if db == nil {
		return nil
	}
	return &Storage{
		db:  db,
		log: log,
	}

31-33: ⚠️ Potential issue

Ensure batch is not nil before using it

In the set method, batch is used without checking if it's nil. If batch is nil, calling batch.Set will result in a nil pointer dereference. Consider adding a check or enforcing that batch must not be nil.

Apply this diff to add a nil check for batch:

func (s *Storage) set(keyCode byte, key []byte, value []byte, batch *pebble.Batch) error {
	prefixedKey := makePrefix(keyCode, key)
+	if batch == nil {
+		return errors.New("batch cannot be nil")
+	}
	return batch.Set(prefixedKey, value, nil)
}

Alternatively, update the method's documentation to specify that batch must not be nil.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

	prefixedKey := makePrefix(keyCode, key)
	if batch == nil {
		return errors.New("batch cannot be nil")
	}
	return batch.Set(prefixedKey, value, nil)
services/replayer/blocks_provider.go (3)

28-38: ⚠️ Potential issue

Handle errors when retrieving block hashes

In the anonymous function that retrieves block hashes, errors are being silently ignored by returning an empty hash. This could lead to unexpected behavior downstream if the hash is expected to be valid.

Consider modifying the function to return an error when a block cannot be retrieved or its hash cannot be computed. This allows for proper error handling and prevents silent failures.

 func (bs *blockSnapshot) BlockContext() (evmTypes.BlockContext, error) {
 	return blocks.NewBlockContext(
 		bs.chainID,
 		bs.block.Height,
 		bs.block.Timestamp,
-		func(n uint64) gethCommon.Hash {
+		func(n uint64) (gethCommon.Hash, error) {
 			block, err := bs.blocks.GetByHeight(n)
 			if err != nil {
-				return gethCommon.Hash{}
+				return gethCommon.Hash{}, err
 			}
 			blockHash, err := block.Hash()
 			if err != nil {
-				return gethCommon.Hash{}
+				return gethCommon.Hash{}, err
 			}

-			return blockHash
+			return blockHash, nil
 		},
 		bs.block.PrevRandao,
 		bs.tracer,
 	)
 }

This change will require updating blocks.NewBlockContext to handle the error returned by the hash function.

Committable suggestion skipped: line range outside the PR's diff.


22-42: 🛠️ Refactor suggestion

Propagate errors from BlockContext

The BlockContext method currently does not handle errors returned by blocks.NewBlockContext, which might occur due to issues in the block data or parameters.

Consider checking and propagating errors returned by blocks.NewBlockContext to ensure any issues are caught and handled appropriately.

 func (bs *blockSnapshot) BlockContext() (evmTypes.BlockContext, error) {
-	return blocks.NewBlockContext(
+	blockCtx, err := blocks.NewBlockContext(
 		bs.chainID,
 		bs.block.Height,
 		bs.block.Timestamp,
 		func(n uint64) gethCommon.Hash {
 			// existing code...
 		},
 		bs.block.PrevRandao,
 		bs.tracer,
 	)
+	if err != nil {
+		return nil, err
+	}
+	return blockCtx, nil
 }

Committable suggestion skipped: line range outside the PR's diff.


65-76: ⚠️ Potential issue

Ensure thread safety of BlocksProvider

The BlocksProvider maintains latestBlock which is modified in OnBlockReceived and read in GetSnapshotAt. If BlocksProvider is accessed by multiple goroutines, this could lead to data races.

Consider adding synchronization mechanisms to protect access to latestBlock. Using a sync.RWMutex can ensure that reads and writes are properly synchronized.

+import (
+	"fmt"
+	"sync"
+
+	"github.com/onflow/flow-evm-gateway/models"
+	// other imports...
+)

 type BlocksProvider struct {
 	blocks      storage.BlockIndexer
 	chainID     flowGo.ChainID
 	tracer      *tracers.Tracer
 	latestBlock *models.Block
+	mu          sync.RWMutex
 }

 func (bp *BlocksProvider) OnBlockReceived(block *models.Block) error {
+	bp.mu.Lock()
 	if bp.latestBlock != nil && bp.latestBlock.Height != (block.Height-1) {
+		bp.mu.Unlock()
 		return fmt.Errorf(
 			"%w: received new block: %d, non-sequential of latest block: %d",
 			models.ErrInvalidHeight,
 			block.Height,
 			bp.latestBlock.Height,
 		)
 	}
 
 	bp.latestBlock = block
+	bp.mu.Unlock()

 	return nil
 }

 func (bp *BlocksProvider) GetSnapshotAt(height uint64) (
 	evmTypes.BlockSnapshot,
 	error,
 ) {
+	bp.mu.RLock()
 	if bp.latestBlock != nil && bp.latestBlock.Height == height {
 		snapshot := &blockSnapshot{
 			BlocksProvider: bp,
 			block:          *bp.latestBlock,
 		}
+		bp.mu.RUnlock()
 		return snapshot, nil
 	}
+	bp.mu.RUnlock()

 	block, err := bp.blocks.GetByHeight(height)
 	if err != nil {
 		return nil, err
 	}

 	return &blockSnapshot{
 		BlocksProvider: bp,
 		block:          *block,
 	}, nil
 }

Committable suggestion skipped: line range outside the PR's diff.

services/requester/remote_cadence_arch.go (1)

29-34: ⚠️ Potential issue

Potential Data Race on cachedCalls Map

The cachedCalls map is accessed and modified without synchronization. If instances of RemoteCadenceArch are used concurrently, this could lead to data races and undefined behavior.

Apply this diff to add synchronization using a mutex:

+import (
+	"sync"
+)

 type RemoteCadenceArch struct {
 	blockHeight uint64
 	client      *CrossSporkClient
 	chainID     flow.ChainID
 	cachedCalls map[string]evmTypes.Data
+	mu          sync.RWMutex
 }

Update the Run method to use the mutex:

 func (rca *RemoteCadenceArch) Run(input []byte) ([]byte, error) {
 	key := hex.EncodeToString(crypto.Keccak256(input))

+	rca.mu.RLock()
 	if result, ok := rca.cachedCalls[key]; ok {
+		rca.mu.RUnlock()
 		return result, nil
 	}
+	rca.mu.RUnlock()

 	evmResult, err := rca.runCall(input)
 	if err != nil {
 		return nil, err
 	}

+	rca.mu.Lock()
 	rca.cachedCalls[key] = evmResult.ReturnedData
+	rca.mu.Unlock()

 	return evmResult.ReturnedData, nil
 }

Ensure all other accesses to cachedCalls are properly synchronized.

Committable suggestion skipped: line range outside the PR's diff.

services/evm/executor.go (2)

128-129: ⚠️ Potential issue

Correct block height comparison for EVM BLOCKHASH opcode

The EVM specification states that BLOCKHASH should return zero when the block number is not in the range [current_block_number - 256, current_block_number - 1]. The current comparison excludes the block at current_block_number - 256. Adjust the condition to include it.

Apply this diff to include the block at current_block_number - 256:

     // If the given block height is more than 256 blocks
     // in the past, return an empty block hash.
-    if s.block.Height-n > 256 {
+    if s.block.Height-n >= 256 {
         return common.Hash{}
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

			if s.block.Height-n >= 256 {
				return common.Hash{}

93-95: ⚠️ Potential issue

Prevent potential nil pointer dereference in res.Invalid()

Before calling res.Invalid(), ensure that res is not nil to avoid a runtime panic if bv.DirectCall() or bv.RunTransaction() return a nil result without an error.

Apply this diff to check if res is nil:

     if err != nil {
         return err
     }

+    if res == nil {
+        return fmt.Errorf("nil result for transaction %s", tx.Hash())
+    }

     // we should never produce invalid transaction, since if the transaction was emitted from the evm core
     // it must have either been successful or failed, invalid transactions are not emitted
     if res.Invalid() {

Committable suggestion skipped: line range outside the PR's diff.

services/replayer/call_tracer_collector.go (2)

95-112: 🛠️ Refactor suggestion

Refactor duplicated panic recovery code in tracing methods.

The deferred panic recovery blocks in OnTxStart, OnTxEnd, OnEnter, OnExit, and OnLog are duplicated. Refactoring this code into a helper function will reduce duplication and improve maintainability.

You can create a helper function for panic recovery:

func recoverPanic(l zerolog.Logger, context string) {
	if r := recover(); r != nil {
		err, ok := r.(error)
		if !ok {
			err = fmt.Errorf("panic: %v", r)
		}
		l.Err(err).Stack().Msg(fmt.Sprintf("%s trace collection failed", context))
	}
}

Then, update the deferred functions in each method:

 wrapped.OnTxStart = func(vm *tracing.VMContext, tx *types.Transaction, from common.Address) {
-	defer func() {
-		if r := recover(); r != nil {
-			err, ok := r.(error)
-			if !ok {
-				err = fmt.Errorf("panic: %v", r)
-			}
-			l.Err(err).Stack().Msg("OnTxStart trace collection failed")
-		}
-	}()
+	defer recoverPanic(l, "OnTxStart")
	// existing code
}

wrapped.OnTxEnd = func(receipt *types.Receipt, err error) {
-	defer func() { /* ... */ }()
+	defer recoverPanic(l, "OnTxEnd")
	// existing code
}

// Repeat for OnEnter, OnExit, and OnLog

Also applies to: 114-127, 143-163, 165-178, 180-193


41-41: ⚠️ Potential issue

Ensure thread safety when accessing resultsByTxID map.

The resultsByTxID map is accessed and modified without synchronization. If CallTracerCollector is used concurrently, this could lead to data races and panics. Introduce synchronization to ensure thread safety.

Add a mutex to the struct:

 type CallTracerCollector struct {
 	tracer        *tracers.Tracer
+	mu            sync.Mutex
 	resultsByTxID map[common.Hash]json.RawMessage
 	logger        zerolog.Logger
 }

Modify the Collect method to use the mutex:

 func (ct *CallTracerCollector) Collect(txID common.Hash) (json.RawMessage, error) {
+	ct.mu.Lock()
+	defer ct.mu.Unlock()
 	// collect the trace result
 	result, found := ct.resultsByTxID[txID]
 	if !found {
 		return nil, fmt.Errorf("trace result for tx: %s, not found", txID.String())
 	}
 	// remove the result
 	delete(ct.resultsByTxID, txID)
 	return result, nil
 }

And in the OnTxEnd function within NewSafeTxTracer:

 // collect results for the tracer
 res, err := ct.tracer.GetResult()
 if err != nil {
 	l.Error().Err(err).Msg("failed to produce trace results")
 	return
 }
+ct.mu.Lock()
 ct.resultsByTxID[receipt.TxHash] = res
+ct.mu.Unlock()

 // reset tracing to have fresh state
 if err := ct.ResetTracer(); err != nil {
 	l.Error().Err(err).Msg("failed to reset tracer")
 	return
 }

Ensure you import the sync package:

import (
	// existing imports
	"sync"
)

This will protect concurrent access to resultsByTxID and prevent potential race conditions.

Also applies to: 73-84, 128-141

storage/pebble/receipts.go (2)

93-93: ⚠️ Potential issue

Concurrency concern: Accessing shared resources without synchronization in GetByTransactionID

The removal of mutex locks in the GetByTransactionID method may lead to data races when accessed concurrently. Accessing r.store.get without synchronization could cause inconsistent state or panics in a multi-threaded environment.

Consider reintroducing synchronization mechanisms or ensuring that the underlying r.store operations are thread-safe.


112-112: ⚠️ Potential issue

Concurrency concern: Unsynchronized access in GetByBlockHeight

The GetByBlockHeight method no longer uses mutex locks after the changes. This might introduce race conditions if the method is called concurrently by multiple goroutines.

Ensure that concurrent accesses to this method are safe, possibly by adding synchronization or verifying the thread safety of the underlying storage.

storage/pebble/register_storage.go (1)

66-68: 🛠️ Refactor suggestion

Avoid variable shadowing by renaming the recovered variable

In the deferred function, the variable r shadows the method receiver r, which can lead to confusion. Renaming the variable enhances code clarity.

Apply this diff to rename the variable:

	defer func() {
-		if r := recover(); r != nil {
-			err = fmt.Errorf("panic: %v", r)
+		if rec := recover(); rec != nil {
+			err = fmt.Errorf("panic: %v", rec)
		}
	}()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		if rec := recover(); rec != nil {
			err = fmt.Errorf("panic: %v", rec)
		}
services/ingestion/engine.go (1)

152-157: 🛠️ Refactor suggestion

Avoid using log.Fatal in defer; handle batch.Close error appropriately

Using e.log.Fatal inside a defer function may cause the application to exit unexpectedly. It's advisable to handle the error without terminating the application abruptly, especially within a defer statement.

Apply this diff to handle the error gracefully:

defer func(batch *pebbleDB.Batch) {
    if err := batch.Close(); err != nil {
-       e.log.Fatal().Err(err).Msg("failed to close batch")
+       e.log.Error().Err(err).Msg("failed to close batch")
+       // Consider propagating the error or handling it appropriately
    }
}(batch)

Committable suggestion skipped: line range outside the PR's diff.

api/debug.go (1)

436-445: ⚠️ Potential issue

Prevent nil pointer dereference in isDefaultCallTracer

In the isDefaultCallTracer function, you dereference *config.Tracer without checking if config.Tracer is nil. This could cause a nil pointer dereference if config.Tracer is nil.

Apply this diff to fix the issue:

 func isDefaultCallTracer(config *tracers.TraceConfig) bool {
     if config == nil {
         return false
     }

+    if config.Tracer == nil {
+        return false
+    }

     if *config.Tracer != replayer.TracerName {
         return false
     }

     tracerConfig := json.RawMessage(replayer.TracerConfig)
     return slices.Equal(config.TracerConfig, tracerConfig)
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

	if config == nil {
		return false
	}

	if config.Tracer == nil {
		return false
	}

	if *config.Tracer != replayer.TracerName {
		return false
	}

	tracerConfig := json.RawMessage(replayer.TracerConfig)
	return slices.Equal(config.TracerConfig, tracerConfig)
services/requester/requester.go (1)

423-428: ⚠️ Potential issue

Prevent potential overflow in gas limit calculation

In the EstimateGas function, there is a possibility of integer overflow when calculating mid = failingGasLimit * 2. To ensure robustness, add a check to prevent overflow for large gas limit values.

Apply this diff to address the potential overflow:

 if mid > failingGasLimit*2 {
+    if failingGasLimit > math.MaxUint64/2 {
+        mid = passingGasLimit
+    } else {
    	mid = failingGasLimit * 2
+    }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		if mid > failingGasLimit*2 {
			// Most txs don't need much higher gas limit than their gas used, and most txs don't
			// require near the full block limit of gas, so the selection of where to bisect the
			// range here is skewed to favor the low side.
			if failingGasLimit > math.MaxUint64/2 {
				mid = passingGasLimit
			} else {
				mid = failingGasLimit * 2
			}
		}
services/ingestion/event_subscriber.go (4)

278-282: 🛠️ Refactor suggestion

Implement Exponential Backoff for Rate Limiting

Currently, when a ResourceExhausted error occurs, the code sleeps for 100 milliseconds before retrying. Implementing an exponential backoff strategy can be more effective in handling rate limits imposed by the access node.

Apply this change to implement exponential backoff:

- time.Sleep(100 * time.Millisecond)
+ // Implement exponential backoff with a maximum delay
+ maxBackoff := time.Second * 5
+ if retryCount < 10 {
+     backoffDuration := time.Duration(math.Pow(2, float64(retryCount))) * time.Millisecond
+     if backoffDuration > maxBackoff {
+         backoffDuration = maxBackoff
+     }
+     time.Sleep(backoffDuration)
+     retryCount++
+ } else {
+     return 0, fmt.Errorf("exceeded maximum retries due to rate limiting")
+ }

Ensure you initialize and increment retryCount appropriately within the retry loop.

Also applies to: 287-293, 362-368, 377-383


166-174: ⚠️ Potential issue

Potential Infinite Recovery Loop

In the subscribe method, if evmEvents.Err != nil or r.recovery is true, and after calling recover, r.recovery remains true, the loop continues without advancing. Ensure that there is logic to prevent an infinite loop in recovery mode.

Consider adding a maximum retry count or timeout mechanism to prevent indefinite looping. For example:

retryLimit := 5
retryCount := 0

for {
    // existing logic
    if r.recovery {
        retryCount++
        if retryCount > retryLimit {
            eventsChan <- models.NewBlockEventsError(fmt.Errorf("exceeded maximum recovery attempts"))
            return
        }
        continue
    }
    // reset retryCount if recovery is successful
    retryCount = 0
}

208-229: 🛠️ Refactor suggestion

Handle Context Cancellation in Backfill Loop

In the backfill method, the loop does not explicitly check for context cancellation, which could lead to unnecessary iterations after the context is canceled. Add context cancellation checks to exit promptly when the context is done.

Apply this change:

for {
    select {
    case <-ctx.Done():
        r.logger.Info().Msg("Backfill cancelled")
        return
    default:
        // Existing logic
    }
}

This ensures that the backfill process responds promptly to cancellation signals.


347-349: 🛠️ Refactor suggestion

Remove Unused Parameters in Function Signature

The parameters blockExecutedEventType and txExecutedEventType are passed to accumulateBlockEvents but are not used within the function. Remove unused parameters to clean up the code.

Apply this diff to remove the unused parameters:

-func (r *RPCEventSubscriber) accumulateBlockEvents(
-    ctx context.Context,
-    block flow.BlockEvents,
-    blockExecutedEventType string,
-    txExecutedEventType string,
-) (models.BlockEvents, error) {
+func (r *RPCEventSubscriber) accumulateBlockEvents(
+    ctx context.Context,
+    block flow.BlockEvents,
+) (models.BlockEvents, error) {

Also, ensure that any calls to this function are updated accordingly.

Committable suggestion skipped: line range outside the PR's diff.

services/ingestion/engine_test.go (1)

657-659: 🛠️ Refactor suggestion

Use assert.NoError instead of require.NoError inside deferred functions

Using require.NoError within a defer function can cause the test to fail after it has already completed, which can lead to confusing test results and make debugging more difficult. Replace require.NoError with assert.NoError to log any errors without abruptly failing the test.

Apply this diff to make the change:

     defer func() {
-        require.NoError(t, batch.Close())
+        assert.NoError(t, batch.Close())
     }()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

	defer func() {
		assert.NoError(t, batch.Close())
	}()
tests/web3js/debug_traces_test.js (1)

41-42: ⚠️ Potential issue

Use numbers instead of BigInt literals in assert.lengthOf

The assert.lengthOf function expects a number as the length. Using BigInt literals (9856n, 9806n) may lead to unexpected behavior. Consider changing them to regular numbers if the values are within the safe integer range.

Apply this diff to fix the issue:

-    assert.lengthOf(txTrace.input, 9856n)
-    assert.lengthOf(txTrace.output, 9806n)
+    assert.lengthOf(txTrace.input, 9856)
+    assert.lengthOf(txTrace.output, 9806)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    assert.lengthOf(txTrace.input, 9856)
    assert.lengthOf(txTrace.output, 9806)
storage/index_test.go (24)

406-409: ⚠️ Potential issue

Close batches after committing in TestBloomsForBlockRange

Ensure batches are closed after committing when storing receipts in the blooms test.

Apply this diff:

 batch := s.DB.NewBatch()
 err := s.ReceiptIndexer.Store([]*models.Receipt{r}, batch)
 s.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 s.Require().NoError(err)

+ err = batch.Close()
+ s.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

			err := s.ReceiptIndexer.Store([]*models.Receipt{r}, batch)
			s.Require().NoError(err)

			err = batch.Commit(pebble2.Sync)
			s.Require().NoError(err)

			err = batch.Close()
			s.Require().NoError(err)

289-292: ⚠️ Potential issue

Close batches after commit in TestStoreReceipt

In the ReceiptTestSuite, ensure batches are closed after committing when storing receipts.

Apply this diff:

 batch := s.DB.NewBatch()
 err := s.ReceiptIndexer.Store([]*models.Receipt{receipt}, batch)
 s.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 s.Require().NoError(err)

+ err = batch.Close()
+ s.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		err := s.ReceiptIndexer.Store([]*models.Receipt{receipt}, batch)
		s.Require().NoError(err)

		err = batch.Commit(pebble2.Sync)
		s.Require().NoError(err)

		err = batch.Close()
		s.Require().NoError(err)

97-101: ⚠️ Potential issue

Close batches after committing in TestGet

In the TestGet method of BlockTestSuite, the batch is not closed after committing. This could lead to resource leaks.

Apply this diff:

 batch := b.DB.NewBatch()

 err := b.Blocks.Store(height+1, flowID, block, batch)
 b.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 b.Require().NoError(err)

+ err = batch.Close()
+ b.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		err := b.Blocks.Store(height+1, flowID, block, batch)
		b.Require().NoError(err)

		err = batch.Commit(pebble2.Sync)
		b.Require().NoError(err)

		err = batch.Close()
		b.Require().NoError(err)

588-592: ⚠️ Potential issue

Close batches after committing in TransactionTestSuite

In TestStoreTransaction, ensure the batch is closed after committing.

Apply this diff:

 batch := s.DB.NewBatch()

 err := s.TransactionIndexer.Store(tx, batch)
 s.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 s.Require().NoError(err)

+ err = batch.Close()
+ s.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.


		err := s.TransactionIndexer.Store(tx, batch)
		s.Require().NoError(err)

		err = batch.Commit(pebble2.Sync)
		s.Require().NoError(err)

		err = batch.Close()
		s.Require().NoError(err)

449-451: ⚠️ Potential issue

Close batches after committing when storing multiple receipts

In storing multiple receipts per block, close the batch after committing.

Apply this diff:

 batch := s.DB.NewBatch()
 s.Require().NoError(s.ReceiptIndexer.Store(receipts, batch))
 err := batch.Commit(pebble2.Sync)
 s.Require().NoError(err)

+ err = batch.Close()
+ s.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

			s.Require().NoError(s.ReceiptIndexer.Store(receipts, batch))
			err := batch.Commit(pebble2.Sync)
			s.Require().NoError(err)
			err = batch.Close()
			s.Require().NoError(err)

343-346: ⚠️ Potential issue

Close batches after committing when storing single receipt

Ensure batches are closed after committing in TestGetReceiptByTransactionID.

Apply this diff:

 batch := s.DB.NewBatch()
 err := s.ReceiptIndexer.Store([]*models.Receipt{receipt}, batch)
 s.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 s.Require().NoError(err)

+ err = batch.Close()
+ s.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		err := s.ReceiptIndexer.Store([]*models.Receipt{receipt}, batch)
		s.Require().NoError(err)

		err = batch.Commit(pebble2.Sync)
		s.Require().NoError(err)

		err = batch.Close()
		s.Require().NoError(err)

134-141: ⚠️ Potential issue

Ensure batch closure after commit in TestStore

In TestStore, the batch is not closed after committing. Closing the batch is necessary to release resources.

Apply this diff:

 batch := b.DB.NewBatch()

 err := b.Blocks.Store(2, flowID, block, batch)
 b.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 b.Require().NoError(err)

+ err = batch.Close()
+ b.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		batch := b.DB.NewBatch()

		err := b.Blocks.Store(2, flowID, block, batch)
		b.Require().NoError(err)

		err = batch.Commit(pebble2.Sync)
		b.Require().NoError(err)

		err = batch.Close()
		b.Require().NoError(err)

501-504: ⚠️ Potential issue

Ensure batch closure in single height range tests

Close batches after committing in the single height range test case.

Apply this diff:

 batch := s.DB.NewBatch()
 s.Require().NoError(s.ReceiptIndexer.Store(receipts, batch))

 err := batch.Commit(pebble2.Sync)
 s.Require().NoError(err)

+ err = batch.Close()
+ s.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

			s.Require().NoError(s.ReceiptIndexer.Store(receipts, batch))

			err := batch.Commit(pebble2.Sync)
			s.Require().NoError(err)

			err = batch.Close()
			s.Require().NoError(err)

601-604: ⚠️ Potential issue

Close batches after committing when storing transactions

In TestGetTransaction, close the batch after committing.

Apply this diff:

 batch := s.DB.NewBatch()
 err := s.TransactionIndexer.Store(tx, batch)
 s.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 s.Require().NoError(err)

+ err = batch.Close()
+ s.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		err := s.TransactionIndexer.Store(tx, batch)
		s.Require().NoError(err)

		err = batch.Commit(pebble2.Sync)
		s.Require().NoError(err)

		err = batch.Close()
		s.Require().NoError(err)

664-668: ⚠️ Potential issue

Close batches after committing in TraceTestSuite

In TestStore, ensure batches are closed after committing to prevent resource leaks.

Apply this diff:

 batch := s.DB.NewBatch()

 err := s.TraceIndexer.StoreTransaction(id, trace, batch)
 s.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 s.Require().NoError(err)

+ err = batch.Close()
+ s.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		err := s.TraceIndexer.StoreTransaction(id, trace, batch)
		s.Require().NoError(err)

		err = batch.Commit(pebble2.Sync)
		s.Require().NoError(err)

		err = batch.Close()
		s.Require().NoError(err)

29-36: ⚠️ Potential issue

Ensure batches are closed after commit to prevent resource leaks

In TestBlocks, the batch created with db.NewBatch() is not closed after committing. Batches should be closed to release resources and prevent memory leaks.

Apply this diff to close the batch:

 err := bl.InitHeights(config.EmulatorInitCadenceHeight, flow.Identifier{0x1}, batch)
 require.NoError(t, err)

 err = batch.Commit(pebble2.Sync)
 require.NoError(t, err)

+ err = batch.Close()
+ require.NoError(t, err)

 suite.Run(t, &BlockTestSuite{
     Blocks: bl,
     DB:     db,
 })
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		batch := db.NewBatch()

		err := bl.InitHeights(config.EmulatorInitCadenceHeight, flow.Identifier{0x1}, batch)
		require.NoError(t, err)

		err = batch.Commit(pebble2.Sync)
		require.NoError(t, err)

		err = batch.Close()
		require.NoError(t, err)


262-266: ⚠️ Potential issue

Close batches after committing when retrieving Cadence IDs

In Cadence ID from EVM height test case, batches should be closed after committing.

Apply this diff:

 batch := b.DB.NewBatch()
 err := b.Blocks.Store(uint64(i), cadenceIDs[i], mocks.NewBlock(evmHeight), batch)
 b.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 b.Require().NoError(err)

+ err = batch.Close()
+ b.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

			batch := b.DB.NewBatch()
			err := b.Blocks.Store(uint64(i), cadenceIDs[i], mocks.NewBlock(evmHeight), batch)
			b.Require().NoError(err)

			err = batch.Commit(pebble2.Sync)
			b.Require().NoError(err)

			err = batch.Close()
			b.Require().NoError(err)

181-185: ⚠️ Potential issue

Close batches after commit in TestHeights

In TestHeights, batches are not closed after committing. This practice should be corrected to prevent resource leaks.

Apply this diff:

 batch := b.DB.NewBatch()

 err := b.Blocks.Store(lastHeight+10, flow.Identifier{byte(i)}, mocks.NewBlock(lastHeight), batch)
 b.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 b.Require().NoError(err)

+ err = batch.Close()
+ b.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

			err := b.Blocks.Store(lastHeight+10, flow.Identifier{byte(i)}, mocks.NewBlock(lastHeight), batch)
			b.Require().NoError(err)

			err = batch.Commit(pebble2.Sync)
			b.Require().NoError(err)

			err = batch.Close()
			b.Require().NoError(err)

226-230: ⚠️ Potential issue

Close batches after commit when updating latest Cadence height

In the last Cadence height test case, batches are not closed after committing.

Apply this diff:

 batch := b.DB.NewBatch()
 err := b.Blocks.Store(lastHeight, flow.Identifier{byte(i)}, mocks.NewBlock(lastHeight-10), batch)
 b.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 b.Require().NoError(err)

+ err = batch.Close()
+ b.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

			batch := b.DB.NewBatch()
			err := b.Blocks.Store(lastHeight, flow.Identifier{byte(i)}, mocks.NewBlock(lastHeight-10), batch)
			b.Require().NoError(err)

			err = batch.Commit(pebble2.Sync)
			b.Require().NoError(err)

			err = batch.Close()
			b.Require().NoError(err)

155-160: ⚠️ Potential issue

Close batches within loops to prevent resource leaks

In the loop within TestStore, batches are created and committed but not closed. Ensure that each batch is closed after committing.

Apply this diff:

 for i := 0; i < 10; i++ {
     batch := b.DB.NewBatch()

     err := b.Blocks.Store(uint64(i+5), flow.Identifier{byte(i)}, mocks.NewBlock(uint64(10+i)), batch)
     b.Require().NoError(err)

     err = batch.Commit(pebble2.Sync)
     b.Require().NoError(err)

+    err = batch.Close()
+    b.Require().NoError(err)
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

			batch := b.DB.NewBatch()

			err := b.Blocks.Store(uint64(i+5), flow.Identifier{byte(i)}, mocks.NewBlock(uint64(10+i)), batch)
			b.Require().NoError(err)

			err = batch.Commit(pebble2.Sync)
			b.Require().NoError(err)

			err = batch.Close()
			b.Require().NoError(err)

205-210: ⚠️ Potential issue

Close batches after commit when storing blocks

Ensure batches are closed after committing in the get height by ID test case within TestHeights.

Apply this diff:

 batch := b.DB.NewBatch()

 err := b.Blocks.Store(uint64(i), cadenceIDs[i], blocks[i], batch)
 b.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 b.Require().NoError(err)

+ err = batch.Close()
+ b.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

			batch := b.DB.NewBatch()

			err := b.Blocks.Store(uint64(i), cadenceIDs[i], blocks[i], batch)
			b.Require().NoError(err)

			err = batch.Commit(pebble2.Sync)
			b.Require().NoError(err)

			err = batch.Close()
			b.Require().NoError(err)

243-247: ⚠️ Potential issue

Close batches after committing when mapping Cadence heights

In Cadence height from EVM height test case, ensure batches are closed after committing.

Apply this diff:

 batch := b.DB.NewBatch()
 err := b.Blocks.Store(cadenceHeights[i], flow.Identifier{byte(i)}, mocks.NewBlock(evmHeight), batch)
 b.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 b.Require().NoError(err)

+ err = batch.Close()
+ b.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

			batch := b.DB.NewBatch()
			err := b.Blocks.Store(cadenceHeights[i], flow.Identifier{byte(i)}, mocks.NewBlock(evmHeight), batch)
			b.Require().NoError(err)

			err = batch.Commit(pebble2.Sync)
			b.Require().NoError(err)

			err = batch.Close()
			b.Require().NoError(err)

313-316: ⚠️ Potential issue

Close batches after committing in multiple receipt storage

When storing multiple receipts, ensure the batch is closed after committing.

Apply this diff:

 batch := s.DB.NewBatch()
 err := s.ReceiptIndexer.Store(receipts, batch)
 s.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 s.Require().NoError(err)

+ err = batch.Close()
+ s.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		err := s.ReceiptIndexer.Store(receipts, batch)
		s.Require().NoError(err)

		err = batch.Commit(pebble2.Sync)
		s.Require().NoError(err)

		err = batch.Close()
		s.Require().NoError(err)

677-682: ⚠️ Potential issue

Close batches within loops in TestStore

When overwriting existing traces in a loop, ensure each batch is closed after committing.

Apply this diff:

 for i := 0; i < 2; i++ {
     id := common.Hash{0x01}
     trace := json.RawMessage(`{ "test": "foo" }`)

     batch := s.DB.NewBatch()

     err := s.TraceIndexer.StoreTransaction(id, trace, batch)
     s.Require().NoError(err)

     err = batch.Commit(pebble2.Sync)
     s.Require().NoError(err)

+    err = batch.Close()
+    s.Require().NoError(err)
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

			batch := s.DB.NewBatch()

			err := s.TraceIndexer.StoreTransaction(id, trace, batch)
			s.Require().NoError(err)

			err = batch.Commit(pebble2.Sync)
			s.Require().NoError(err)

			err = batch.Close()
			s.Require().NoError(err)

694-698: ⚠️ Potential issue

Close batches after committing when testing trace retrieval

In TestGet, ensure the batch is closed after committing.

Apply this diff:

 batch := s.DB.NewBatch()

 err := s.TraceIndexer.StoreTransaction(id, trace, batch)
 s.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 s.Require().NoError(err)

+ err = batch.Close()
+ s.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.


		err := s.TraceIndexer.StoreTransaction(id, trace, batch)
		s.Require().NoError(err)

		err = batch.Commit(pebble2.Sync)
		s.Require().NoError(err)

		err = batch.Close()
		s.Require().NoError(err)

333-334: ⚠️ Potential issue

Close batches even on error during receipt storage

Even when an error occurs during receipt storage, the batch should be closed to prevent resource leaks.

Apply this diff:

 batch := s.DB.NewBatch()
 err := s.ReceiptIndexer.Store(receipts, batch)
 s.Require().EqualError(err, "can't store receipts for multiple heights")

+ err = batch.Close()
+ s.Require().NoError(err)

Committable suggestion skipped: line range outside the PR's diff.


48-59: ⚠️ Potential issue

Close batches after use to prevent resource leaks in TestReceipts

In TestReceipts, the batch created is not closed after committing. Ensure that the batch is closed to free resources.

Apply this diff:

 err = bl.InitHeights(config.EmulatorInitCadenceHeight, flow.Identifier{0x1}, batch)
 require.NoError(t, err)
 err = bl.Store(30, flow.Identifier{0x1}, mocks.NewBlock(10), batch)
 require.NoError(t, err)
 err = bl.Store(30, flow.Identifier{0x1}, mocks.NewBlock(300), batch)
 require.NoError(t, err)

 err = batch.Commit(pebble2.Sync)
 require.NoError(t, err)

+ err = batch.Close()
+ require.NoError(t, err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		batch := db.NewBatch()

		err := bl.InitHeights(config.EmulatorInitCadenceHeight, flow.Identifier{0x1}, batch)
		require.NoError(t, err)
		err = bl.Store(30, flow.Identifier{0x1}, mocks.NewBlock(10), batch) // update first and latest height
		require.NoError(t, err)
		err = bl.Store(30, flow.Identifier{0x1}, mocks.NewBlock(300), batch) // update latest
		require.NoError(t, err)

		err = batch.Commit(pebble2.Sync)
		require.NoError(t, err)

		err = batch.Close()
		require.NoError(t, err)

366-379: ⚠️ Potential issue

Close batches after committing when storing receipts within tests

In TestGetReceiptByBlockHeight, batches need to be closed after committing to prevent resource leaks.

Apply this diff:

 batch := s.DB.NewBatch()
 err := s.ReceiptIndexer.Store([]*models.Receipt{receipt}, batch)
 s.Require().NoError(err)

 err = batch.Commit(pebble2.Sync)
 s.Require().NoError(err)

+ err = batch.Close()
+ s.Require().NoError(err)

 batch = s.DB.NewBatch()

 // add one more receipt that shouldn't be retrieved
 r := mocks.NewReceipt(4, common.HexToHash("0x2"))
 s.Require().NoError(s.ReceiptIndexer.Store([]*models.Receipt{r}, batch))

 err = batch.Commit(pebble2.Sync)
 s.Require().NoError(err)

+ err = batch.Close()
+ s.Require().NoError(err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

		err := s.ReceiptIndexer.Store([]*models.Receipt{receipt}, batch)
		s.Require().NoError(err)

		err = batch.Commit(pebble2.Sync)
		s.Require().NoError(err)

		err = batch.Close()
		s.Require().NoError(err)

		batch = s.DB.NewBatch()

		// add one more receipt that shouldn't be retrieved
		r := mocks.NewReceipt(4, common.HexToHash("0x2"))
		s.Require().NoError(s.ReceiptIndexer.Store([]*models.Receipt{r}, batch))

		err = batch.Commit(pebble2.Sync)
		s.Require().NoError(err)

		err = batch.Close()
		s.Require().NoError(err)

628-631: ⚠️ Potential issue

Ensure batch closure in transaction loop

When storing multiple transactions in a loop, close each batch after committing.

Apply this diff:

 for i := 0; i < 10; i++ {
     tx = mocks.NewTransaction(uint64(10 + i))
     batch := s.DB.NewBatch()
     err := s.TransactionIndexer.Store(tx, batch)
     s.Require().NoError(err)

     err = batch.Commit(pebble2.Sync)
     s.Require().NoError(err)

+    err = batch.Close()
+    s.Require().NoError(err)
 }

Committable suggestion skipped: line range outside the PR's diff.

api/api.go (1)

346-346: ⚠️ Potential issue

Check for potential nil transaction in block.

Ensure that the transaction retrieved is not nil before proceeding. Add necessary checks to handle cases where a transaction might not be found at the given index.

Apply this diff to handle nil transactions:

 if err != nil {
     return handleError[*ethTypes.Transaction](err, l, b.collector)
 }
+if tx == nil {
+    return nil, nil
+}

Committable suggestion skipped: line range outside the PR's diff.

bootstrap/bootstrap.go (5)

361-361: ⚠️ Potential issue

Potential Deadlock in StopMetricsServer

In the StopMetricsServer method, using <-b.metrics.Done() to wait for the metrics server to finish may cause a deadlock if the Done() channel is not properly closed or signaled. Ensure that the metrics server correctly closes or signals the Done() channel to prevent the application from hanging during shutdown.


396-404: 🛠️ Refactor suggestion

Enhance Error Handling in StopDB Method

In the StopDB method, the error returned by b.db.Close() is logged but not acted upon. Consider handling the error more robustly, such as implementing a retry mechanism or triggering alerts if the database fails to close properly. This will ensure resource cleanup and prevent potential data corruption.


485-569: ⚠️ Potential issue

Prevent Resource Leaks in setupStorage

In the setupStorage function, if an error occurs after opening db, the database may not be closed properly, leading to resource leaks.

Consider deferring the closure of db at the point of successful opening:

db, err := pebble.OpenDB(config.DatabaseDir)
if err != nil {
	return nil, nil, err
}
defer func() {
	if err != nil {
		db.Close()
	}
}()

This ensures that db is closed if an error occurs later in the function.


137-141: 🛠️ Refactor suggestion

Check for Nil callTracerCollector After Initialization

After initializing callTracerCollector, consider checking if it's nil before using it to prevent potential nil pointer dereferences.


523-526: ⚠️ Potential issue

Correct Typographical Error in Variable Name

There's a typo in the variable evmBlokcHeight; it should be evmBlockHeight. This could lead to confusion and errors in the code.

Apply this diff to fix the typo:

-	evmBlokcHeight := uint64(0)
+	evmBlockHeight := uint64(0)

Ensure all references to evmBlokcHeight are updated accordingly.

Committable suggestion skipped: line range outside the PR's diff.

Copy link
Contributor

@janezpodhostnik janezpodhostnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will loop forever until the AN is available. Is this ok?
could we put this in an grpc interceptor?

Looks good otherwise

Base automatically changed from feature/local-tx-reexecution to main December 19, 2024 09:12
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
bootstrap/bootstrap.go (2)

511-547: Consider adding jitter or exponential backoff to the retry logic.

The retryInterceptor effectively handles ResourceExhausted errors. However, a random delay or exponential backoff can mitigate thundering herd issues when multiple clients retry at once.

 func retryInterceptor(maxDuration, pauseDuration time.Duration) grpcOpts.UnaryClientInterceptor {
 	return func(
 		ctx context.Context,
 		method string,
 		req, reply interface{},
 		cc *grpcOpts.ClientConn,
 		invoker grpcOpts.UnaryInvoker,
 		opts ...grpcOpts.CallOption,
 	) error {
 		start := time.Now()
 		attempts := 0
 		for {
 			err := invoker(ctx, method, req, reply, cc, opts...)
 			if err == nil {
 				return nil
 			}
 			if status.Code(err) != codes.ResourceExhausted {
 				return err
 			}
 			attempts++
 			duration := time.Since(start)
 			if duration >= maxDuration {
 				return fmt.Errorf("request failed (attempts: %d, duration: %v): %w", attempts, duration, err)
 			}
-			select {
+			// Example: exponential backoff with jitter
+			backoff := pauseDuration << (attempts - 1)
+			jitter := time.Duration(rand.Int63n(int64(backoff / 2)))
+			delay := backoff + jitter
+			select {
 			case <-ctx.Done():
 				return ctx.Err()
-			case <-time.After(pauseDuration):
+			case <-time.After(delay):
 			}
 		}
 	}
 }

632-634: Resolve or remove the “TODO(JanezP)” comment.

The commented code suggests verifying the storage account owner. Either implement the check or remove the commented-out code to maintain clarity.

bootstrap/bootstrap_test.go (1)

14-16: Minor naming nit: rename expecterErr to expectedErr.

“expecterErr” is slightly unconventional. This avoids confusion and aligns better with typical naming in tests.

-	expecterErr := status.Error(codes.ResourceExhausted, "resource exhausted")
+	expectedErr := status.Error(codes.ResourceExhausted, "resource exhausted")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79a0de7 and 70a8358.

📒 Files selected for processing (2)
  • bootstrap/bootstrap.go (5 hunks)
  • bootstrap/bootstrap_test.go (1 hunks)
🔇 Additional comments (7)
bootstrap/bootstrap.go (4)

23-24: Imports appear to be used correctly.

The added gRPC packages codes and status are relevant for error handling in the interceptor logic below.


28-28: Import of metrics seems appropriate.

The metrics package is used in various parts of this file. No concerns here.


38-49: Consider verifying large gRPC message size limits.

DefaultMaxMessageSize is set to 1GB, which might incur high resource usage. If large messages aren’t strictly needed, using a smaller limit or adopting a streaming approach may be safer.


470-476: Good use of gRPC dial options and retry interceptor.

Ensuring MaxCallRecvMsgSize aligns with DefaultMaxMessageSize is consistent with your constants. Confirm there’s no unbounded memory usage or concurrency issues when handling very large messages.

bootstrap/bootstrap_test.go (3)

1-12: Imports look appropriate for test usage.

Using testify/assert is a common practice. The gRPC-related packages align with testing the retry interceptor.


18-56: Table-driven tests are comprehensive and clear.

The scenarios effectively capture success, retries, and timeouts. Watch out for potential time-based test flakiness on slower CI environments.


58-86: Use of parallel tests is good; confirm no shared state issues.

t.Parallel() speeds up test runs, but ensure no shared global resources are mutated in these tests. This setup appears safe.

}

for _, tc := range testCases {
tc := tc
Copy link
Contributor Author

@peterargue peterargue Jan 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not including this was causing lint errors which is strange since the loop var update should be included since go1.22.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is due to the golangci-lint version that we use: https://github.com/onflow/flow-evm-gateway/blob/main/.github/workflows/ci.yml#L41. I tried with the latest ([email protected]), and it doesn't produce this lint error. I will open up a new PR to update it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #718 .

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
bootstrap/bootstrap_test.go (2)

14-17: Consider using constants for time values.

The hardcoded time values (100*time.Millisecond, 10*time.Millisecond) in the interceptor creation could make tests flaky in CI environments. Consider defining these as constants at the package level for better maintainability and clarity.

+const (
+    testMaxRetryDelay    = 100 * time.Millisecond
+    testInitialRetryDelay = 10 * time.Millisecond
+)

 func TestRetryInterceptor(t *testing.T) {
     expecterErr := status.Error(codes.ResourceExhausted, "resource exhausted")
-    interceptor := retryInterceptor(100*time.Millisecond, 10*time.Millisecond)
+    interceptor := retryInterceptor(testMaxRetryDelay, testInitialRetryDelay)

18-56: Consider adding more edge cases.

The current test cases cover basic scenarios well. Consider adding these additional test cases to improve coverage:

  1. Context cancellation during retries
  2. Non-ResourceExhausted errors (should not retry)
  3. Maximum retry delay reached (verify exponential backoff)

Example test case for context cancellation:

{
    name: "context cancelled during retry",
    invoker: func(callCount int) error {
        if callCount == 2 {
            return context.Canceled
        }
        return expecterErr
    },
    maxRequestTime: 40 * time.Millisecond,
    callCount:     2,
    expectedErr:   context.Canceled,
},
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 70a8358 and 737c88f.

📒 Files selected for processing (1)
  • bootstrap/bootstrap_test.go (1 hunks)
🔇 Additional comments (2)
bootstrap/bootstrap_test.go (2)

1-12: LGTM! Well-organized imports.

The package declaration and imports follow Go best practices with proper grouping of standard library and external dependencies.


1-88: Verify test coverage aligns with PR objectives.

The tests focus on retry behavior for ResourceExhausted errors, which aligns with the PR's objective of handling resource exhaustion during ingestion. However, let's verify the coverage of the actual ingestion scenarios.

✅ Verification successful

Test coverage adequately verifies retry mechanism for resource exhaustion

The test coverage aligns well with the PR objectives. Here's why:

  1. The tests specifically verify the retryInterceptor which is the core mechanism for handling resource exhaustion:

    • Tests successful retry after multiple attempts
    • Verifies timeout behavior when exhaustion persists
    • Confirms immediate success cases
    • Validates exact retry counts and timing constraints
  2. The implementation shows this interceptor is properly integrated:

    • Used in the gRPC client setup with appropriate default values (DefaultResourceExhaustedMaxRetryDelay and DefaultResourceExhaustedRetryDelay)
    • Applied to all gRPC calls through the access node client, which handles ingestion
  3. The test cases cover all critical scenarios:

    • Immediate success
    • Success after retries
    • Failure after timeout
    • Proper error propagation
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Find all files related to ingestion that might be affected
rg -l "ResourceExhausted" .

# Check test coverage for those files
go test -coverprofile=coverage.out ./...
go tool cover -func=coverage.out | grep -i "ingestion\|bootstrap"

Length of output: 379


Script:

#!/bin/bash
# Let's examine the implementation of the retry interceptor and its usage
cat ./bootstrap/bootstrap.go

# Find any other files that might be using this interceptor
rg -l "retryInterceptor" .

# Look for ingestion-related code that might be using this
rg "ResourceExhausted" -C 5

Length of output: 21806

Comment on lines +58 to +88
for _, tc := range testCases {
tc := tc
t.Run(tc.name, func(t *testing.T) {
t.Parallel()

callCount := 0
invoker := func(context.Context, string, any, any, *grpc.ClientConn, ...grpc.CallOption) error {
callCount++
return tc.invoker(callCount)
}

start := time.Now()
err := interceptor(
context.Background(), "", nil, nil, nil,
invoker,
)
if tc.expectedErr != nil {
assert.ErrorIs(t, err, tc.expectedErr)
} else {
assert.NoError(t, err)
}

if tc.minCallCount > 0 {
assert.GreaterOrEqual(t, callCount, tc.minCallCount)
} else {
assert.Equal(t, callCount, tc.callCount)
}
assert.LessOrEqual(t, time.Since(start), tc.maxRequestTime)
})
}
}
Copy link
Contributor

@coderabbitai coderabbitai bot Jan 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix potential race condition in parallel tests.

Since tests are running in parallel (t.Parallel()), the callCount variable should be protected against concurrent access. While this might not cause issues in the current implementation since each test case has its own callCount, it's better to be explicit about thread safety.

 t.Run(tc.name, func(t *testing.T) {
     t.Parallel()
 
-    callCount := 0
+    var callCount atomic.Int32
     invoker := func(context.Context, string, any, any, *grpc.ClientConn, ...grpc.CallOption) error {
-        callCount++
+        currentCount := callCount.Add(1)
-        return tc.invoker(callCount)
+        return tc.invoker(int(currentCount))
     }

Don't forget to add "sync/atomic" to your imports.

Committable suggestion skipped: line range outside the PR's diff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this really needed? It looks like the the variable is scoped to a single test run

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterargue, you’re correct that each subtest has its own callCount scoped to that test’s goroutine, so there’s little risk of a race condition here. Since callCount isn’t shared across multiple subtests or invoked concurrently outside the test function, using an atomic variable isn’t strictly necessary. Keeping it as an integer is sufficient if you’re confident there’s no other concurrent access to callCount.

Copy link
Contributor

@janezpodhostnik janezpodhostnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Copy link
Collaborator

@m-Peter m-Peter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome 👏

@peterargue peterargue merged commit a9c1289 into main Jan 13, 2025
2 checks passed
@peterargue peterargue deleted the petera/handle-resourceexhaused-ingestion branch January 13, 2025 20:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants