Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: simplify collected metrics #21963

Merged
merged 3 commits into from
Sep 28, 2024
Merged

refactor: simplify collected metrics #21963

merged 3 commits into from
Sep 28, 2024

Conversation

julienrbrt
Copy link
Member

@julienrbrt julienrbrt commented Sep 27, 2024

Description

Closes: #21614


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct type prefix in the PR title, you can find examples of the prefixes below:
  • confirmed ! in the type prefix if API or client breaking change
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • reviewed "Files changed" and left comments if necessary
  • included the necessary unit and integration tests
  • added a changelog entry to CHANGELOG.md
  • updated the relevant documentation or specification, including comments for documenting Go code
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

Please see Pull Request Reviewer section in the contributing guide for more information on how to review a pull request.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

Summary by CodeRabbit

  • New Features

    • Introduced the sims2 framework and factory methods for improved message handling.
    • Added telemetry functionality to track execution time in various methods.
  • Bug Fixes

    • Implemented a liveness matrix for tracking validator sign status.
    • Modified simulation tests to skip when there are insufficient validators.
    • Updated CLI to query address-by-account-number using account_id.
  • Documentation

    • Updated telemetry documentation to reflect changes in API endpoints and improved examples.
  • Chores

    • Adjusted dependency annotations for github.com/hashicorp/go-metrics to reflect indirect usage.

@julienrbrt julienrbrt added the backport/v0.52.x PR scheduled for inclusion in the v0.52's next stable release label Sep 27, 2024
Copy link
Contributor

coderabbitai bot commented Sep 27, 2024

📝 Walkthrough

Walkthrough

The pull request introduces several updates, including improvements to the telemetry system, the removal of redundant telemetry code in message handlers, and adjustments to dependency declarations. Key changes include the addition of the sims2 framework, modifications to telemetry measurement logic, and the elimination of unnecessary telemetry tracking in various modules. These adjustments aim to enhance code clarity and reduce overhead without altering the core functionality of the system.

Changes

Files Change Summary
CHANGELOG.md Added updates for the sims2 framework, bug fixes, and telemetry improvements.
docs/learn/advanced/09-telemetry.md Updated API endpoint for metrics, example code for metric emission, reformatted metrics section, and added idempotency explanation.
x/bank/go.mod, x/distribution/go.mod, x/staking/go.mod Updated dependency for github.com/hashicorp/go-metrics to indicate it as an indirect dependency.
x/bank/keeper/msg_server.go, x/bank/v2/keeper/handlers.go, x/distribution/keeper/msg_server.go, x/staking/keeper/msg_server.go Removed telemetry-related code from message handling methods, streamlining functionality.
x/distribution/keeper/abci.go, x/epochs/keeper/abci.go, x/evidence/keeper/abci.go, x/gov/keeper/abci.go, x/mint/keeper/abci.go, x/protocolpool/keeper/keeper.go, x/slashing/abci.go, x/upgrade/keeper/abci.go Modified telemetry measurement logic to store the start time in a variable before deferring measurement calls.
x/group/keeper/abci.go Added telemetry functionality to the EndBlocker method, capturing execution time.

Assessment against linked issues

Objective Addressed Explanation
Ensure telemetry in msg handlers does not run multiple times leading to unexpected metric values (#21614)
Remove redundant telemetry tracking in message handlers (#21614)

Possibly related PRs

Suggested labels

C:server/v2, C:collections, C:orm

Suggested reviewers

  • julienrbrt
  • lucaslopezf
  • aaronc
  • kocubinski
  • testinginprod
  • facundomedica
  • sontrinh16
  • tac0turtle

Warning

Review ran into problems

🔥 Problems

Error running LanguageTool: LanguageTool error: Unknown error


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (18)
x/group/keeper/abci.go (1)

14-16: LGTM: Telemetry measurement added correctly.

The addition of telemetry measurement to the EndBlocker function is well-implemented and aligns with the PR objectives. The use of defer ensures that the measurement is always recorded, even if the function returns early.

Consider adding a brief comment explaining the purpose of the telemetry measurement for better code documentation. For example:

// Measure the execution time of EndBlocker for performance monitoring
start := telemetry.Now()
defer telemetry.ModuleMeasureSince(types.ModuleName, start, telemetry.MetricKeyEndBlocker)
x/distribution/keeper/abci.go (1)

Line range hint 23-24: Consider addressing the Tendermint dependency in a future PR

There's a TODO comment mentioning a Tendermint dependency:

// TODO this is Tendermint-dependent
// ref https://github.com/cosmos/cosmos-sdk/issues/3095

While it's not directly related to the current changes, it might be worth addressing this dependency in a future PR to improve the modularity of the code.

Consider creating a new issue to track this TODO and decouple the code from Tendermint-specific implementations.

x/evidence/keeper/abci.go (1)

16-17: Approved: Improved telemetry timing accuracy

The change improves the accuracy of telemetry timing by capturing the start time earlier and storing it in a variable. This aligns with the PR objective of simplifying collected metrics and provides a more precise measurement of the BeginBlocker execution time.

For consistency with other parts of the codebase, consider using metrics instead of telemetry as the package alias:

-import "github.com/cosmos/cosmos-sdk/telemetry"
+import metrics "github.com/cosmos/cosmos-sdk/telemetry"

 // In the function body:
-start := telemetry.Now()
-defer telemetry.ModuleMeasureSince(types.ModuleName, start, telemetry.MetricKeyBeginBlocker)
+start := metrics.Now()
+defer metrics.ModuleMeasureSince(types.ModuleName, start, metrics.MetricKeyBeginBlocker)

This change would make the code more consistent with other parts of the Cosmos SDK that use the metrics alias for the telemetry package.

telemetry/wrapper.go (1)

Line range hint 1-113: Consider addressing the multiple increment issue.

While the simplification of metrics is a step in the right direction, the core issue of metrics being incremented multiple times during different transaction phases (as mentioned in issue #21614) is not directly addressed in this file.

Consider implementing a mechanism to differentiate between checkTx and block finalization phases in the relevant message handler files. This could involve:

  1. Adding a context parameter to indicate the current execution phase.
  2. Implementing conditional metric increments based on the execution phase.

Example pseudo-code:

func handleMessage(ctx sdk.Context, msg sdk.Msg) {
    if !ctx.IsCheckTx() {
        // Only increment metrics during non-checkTx phase
        IncrCounter(1, "new_account")
    }
    // Rest of the message handling logic
}
x/epochs/keeper/abci.go (1)

Line range hint 1-93: Consider addressing the multiple execution issue

While this change improves the accuracy of telemetry measurements for the BeginBlocker function, it doesn't directly address the issue mentioned in the linked issue #21614 regarding multiple executions during checkTx and block finalization.

To fully resolve the telemetry issues:

  1. Consider adding logic to differentiate between checkTx and block finalization executions.
  2. Implement a mechanism to prevent duplicate metric increments for the same logical operation.

This would ensure that metrics like new_account are not incremented multiple times for a single transaction.

x/upgrade/keeper/abci.go (1)

22-23: Approved: Telemetry timing improvement

The change improves the accuracy of telemetry measurements by capturing the start time before the defer statement. This aligns with the PR objective of simplifying collected metrics and potentially addresses the issue of telemetry running multiple times.

For even better clarity, consider adding a brief comment explaining the purpose of capturing the start time separately:

// Capture start time before defer to ensure accurate telemetry
start := telemetry.Now()
defer telemetry.ModuleMeasureSince(types.ModuleName, start, telemetry.MetricKeyBeginBlocker)
docs/learn/advanced/09-telemetry.md (4)

26-27: LGTM: Improved clarity in metric measurement

The use of telemetry.Now() for capturing the start time is a good improvement. It's more idiomatic and consistent with the telemetry package.

Consider adding a brief comment explaining the purpose of this timing measurement, e.g.:

// Measure the duration of the EndBlocker execution
start := telemetry.Now()
defer telemetry.ModuleMeasureSince(types.ModuleName, start, telemetry.MetricKeyEndBlocker)

79-83: LGTM: Valuable information on metric idempotency added

The new section on idempotency is a crucial addition. It directly addresses the issue mentioned in the linked PR objective about metrics being incremented multiple times. The explanation is clear and includes relevant examples.

Consider adding a brief recommendation on how to handle this non-idempotency, e.g.:

To mitigate this, consider implementing checks or flags to ensure metrics are only emitted once per logical operation, regardless of how many times the code is executed.

86-101: LGTM: Improved readability and simplified metrics list

The reformatted table of supported metrics is a significant improvement in terms of readability. The simplification of the metrics list aligns well with the PR objective of simplifying collected metrics while retaining essential ones.

Consider adding a brief introductory sentence before the table to provide context, e.g.:

The following table lists the key metrics supported by the Cosmos SDK telemetry package:

Line range hint 1-101: LGTM: Well-structured and comprehensive documentation

The overall structure and flow of the document have been significantly improved. The logical progression from enabling telemetry to emitting metrics, followed by important considerations and a list of supported metrics, makes the document very informative and easy to follow.

Consider adding a table of contents at the beginning of the document to help readers quickly navigate to specific sections, especially now that new sections like "Idempotency" have been added.

x/gov/keeper/abci.go (1)

Line range hint 22-324: Consider refactoring the EndBlocker function for improved maintainability

While the changes made are good, there are some potential improvements that could be made to the overall structure of the EndBlocker function:

  1. The function is quite long and handles multiple responsibilities. Consider splitting it into smaller, more focused functions. This would improve readability and maintainability.

  2. There are several nested if statements that could be simplified or extracted into separate functions to improve readability. For example, the logic for handling inactive proposals and active proposals could be moved to separate functions.

  3. Error handling is consistent, but there are some places where errors are logged and ignored (e.g., lines 95-97, 220-222). Consider reviewing these instances to ensure they don't hide important issues.

  4. The function contains a large switch statement (lines 186-240) that could potentially be refactored into a separate function or using a strategy pattern to handle different proposal outcomes.

These suggestions are not directly related to the current changes but could improve the overall code quality and maintainability in future iterations.

x/protocolpool/keeper/keeper.go (1)

510-514: LGTM! Consider adding error handling.

The addition of telemetry to measure the execution time of the BeginBlocker method is well-implemented. It uses the correct module name and metric key, and the defer statement ensures the measurement is always recorded.

Consider adding error handling to log any errors that occur during execution:

 func (k Keeper) BeginBlocker(ctx context.Context) error {
 	start := telemetry.Now()
-	defer telemetry.ModuleMeasureSince(types.ModuleName, start, telemetry.MetricKeyBeginBlocker)
+	defer func() {
+		telemetry.ModuleMeasureSince(types.ModuleName, start, telemetry.MetricKeyBeginBlocker)
+	}()
 
-	return k.SetToDistribute(ctx)
+	err := k.SetToDistribute(ctx)
+	if err != nil {
+		k.Logger.Error("error in BeginBlocker", "error", err)
+	}
+	return err
 }

This change will ensure that any errors are logged before being returned, which can be helpful for debugging and monitoring.

CHANGELOG.md (6)

Line range hint 1-12: Overall structure and content of the changelog is well-organized

The changelog provides a comprehensive overview of changes, improvements, and bug fixes for each version of the Cosmos SDK. The organization into sections such as Features, Improvements, Bug Fixes, and API Breaking Changes makes it easy for developers to find relevant information.

However, consider the following suggestions for improvement:

  1. Add a brief summary at the top of each version's changes, highlighting the most significant updates.
  2. Use consistent formatting for version numbers and dates across all entries.
  3. Consider grouping changes by module or component for easier navigation, especially for larger releases.

Line range hint 14-43: Feature descriptions could be more detailed for clarity

While the Features section provides a good overview of new additions, some entries could benefit from more detailed explanations or examples. For instance:

  1. Line 17: The description of the new debug pubkey-raw command could include a brief example of its usage.
  2. Line 22: The "Introduce new protobuf based PubKeys" entry could explain the benefits of this change for developers.
  3. Line 30: The "Add GenericFilteredPaginate to the query package" entry could benefit from a brief explanation of how this improves UX.

Consider adding more context or examples to these and similar entries to help developers better understand the impact and usage of these new features.


Line range hint 45-85: Improvements section could benefit from more context

The Improvements section covers a wide range of changes, but some entries could be more informative:

  1. Consider grouping related improvements together (e.g., all baseapp-related changes).
  2. For performance improvements, it would be helpful to include quantitative data where possible (e.g., "Improve performance of X by Y%").
  3. For technical changes like line 66 ("Use sdk.Dec instead of int64 for Priority field"), briefly explain the rationale or benefits of the change.

Adding this context would help developers better understand the significance of these improvements and how they might affect their applications.


Line range hint 87-103: Bug Fixes section could provide more context on impact

The Bug Fixes section addresses various issues, but could be more informative:

  1. Consider categorizing bugs by severity (e.g., critical, major, minor) to help developers prioritize updates.
  2. For security-related fixes (like line 89), provide a brief explanation of the potential impact of the bug and the importance of updating.
  3. Where applicable, include references to GitHub issues or pull requests for developers who want more detailed information about a specific fix.

These additions would help developers better understand the importance of each fix and make informed decisions about updating their applications.


Line range hint 105-170: API Breaking Changes section could be more developer-friendly

The API Breaking Changes section contains critical information for developers, but its presentation could be improved:

  1. Consider grouping changes by module or component to make it easier for developers to find relevant information.
  2. For each breaking change, provide a brief migration guide or link to more detailed documentation on how to update affected code.
  3. Use a consistent format for each entry, perhaps including "Old behavior", "New behavior", and "Required action" subsections.
  4. Consider adding a severity or impact level for each breaking change to help developers prioritize their migration efforts.

Implementing these suggestions would make the API Breaking Changes section more accessible and actionable for developers updating their applications.


Line range hint 1-170: Overall, the changelog is informative but could be more user-friendly

The CHANGELOG.md file for Cosmos SDK v0.47.0 provides a comprehensive overview of the changes, improvements, and bug fixes in this release. However, there are several ways to enhance its usefulness for developers:

  1. Add a high-level summary at the beginning of each version's changes, highlighting the most significant updates.
  2. Use consistent formatting and structure across all sections.
  3. Provide more context and examples for complex changes or new features.
  4. Group related changes together within each section for easier navigation.
  5. Include severity or impact levels for bug fixes and breaking changes.
  6. Offer brief migration guides or links to documentation for breaking changes.

Implementing these suggestions would make the changelog more accessible and valuable to developers working with the Cosmos SDK.

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 7fe95fc and e57ea4a.

📒 Files selected for processing (20)
  • CHANGELOG.md (1 hunks)
  • docs/learn/advanced/09-telemetry.md (2 hunks)
  • telemetry/wrapper.go (1 hunks)
  • x/bank/go.mod (1 hunks)
  • x/bank/keeper/msg_server.go (0 hunks)
  • x/bank/v2/keeper/handlers.go (0 hunks)
  • x/distribution/go.mod (1 hunks)
  • x/distribution/keeper/abci.go (1 hunks)
  • x/distribution/keeper/msg_server.go (0 hunks)
  • x/epochs/keeper/abci.go (1 hunks)
  • x/evidence/keeper/abci.go (1 hunks)
  • x/gov/keeper/abci.go (1 hunks)
  • x/group/keeper/abci.go (1 hunks)
  • x/mint/keeper/abci.go (1 hunks)
  • x/protocolpool/keeper/keeper.go (2 hunks)
  • x/slashing/abci.go (1 hunks)
  • x/staking/go.mod (1 hunks)
  • x/staking/keeper/abci.go (1 hunks)
  • x/staking/keeper/msg_server.go (0 hunks)
  • x/upgrade/keeper/abci.go (1 hunks)
💤 Files not reviewed due to no reviewable changes (4)
  • x/bank/keeper/msg_server.go
  • x/bank/v2/keeper/handlers.go
  • x/distribution/keeper/msg_server.go
  • x/staking/keeper/msg_server.go
✅ Files skipped from review due to trivial changes (2)
  • x/staking/go.mod
  • x/staking/keeper/abci.go
🧰 Additional context used
📓 Path-based instructions (12)
CHANGELOG.md (1)

Pattern **/*.md: "Assess the documentation for misspellings, grammatical errors, missing documentation and correctness"

docs/learn/advanced/09-telemetry.md (1)

Pattern **/*.md: "Assess the documentation for misspellings, grammatical errors, missing documentation and correctness"

telemetry/wrapper.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/distribution/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/epochs/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/evidence/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/gov/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/group/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/mint/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/protocolpool/keeper/keeper.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/slashing/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/upgrade/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

🔇 Additional comments (14)
x/group/keeper/abci.go (2)

5-8: LGTM: Import statements are correctly added and organized.

The new import statements for gov/types and telemetry are properly placed and follow the Uber Golang style guide recommendations. They are necessary for the new telemetry functionality in the EndBlocker function.


Line range hint 1-23: Overall changes look good, but additional work may be needed.

The addition of telemetry measurement to the EndBlocker function is a positive step towards improving monitoring capabilities. However, it's important to note that this change alone doesn't fully address the issue described in #21614 regarding multiple executions of message handlers during checkTx and block finalization phases.

To ensure we're addressing the core issue, please confirm:

  1. Are there plans to implement similar telemetry in message handlers?
  2. How will this change help differentiate between executions in different transaction phases?

Consider running the following script to check for telemetry usage in message handlers:

This will help verify if telemetry is being consistently applied across relevant parts of the codebase.

x/mint/keeper/abci.go (1)

13-14: Improved accuracy of telemetry measurement

The changes improve the accuracy of the duration measurement for the BeginBlocker function. By capturing the start time in a separate variable before the defer statement, we ensure that the exact start time is used for the duration calculation. This aligns well with the PR objective of simplifying and improving the collected metrics.

The code adheres to the Uber Golang style guide, using clear variable names and proper formatting.

x/slashing/abci.go (1)

16-17: Improved telemetry timing, but further refinement needed.

The change improves the accuracy of telemetry timing by capturing the start time before the defer statement. This is a step in the right direction for addressing the telemetry issues mentioned in the PR objectives.

However, this change doesn't fully address the core issue of telemetry metrics being incremented multiple times due to execution during both checkTx and block finalization phases. Further investigation and changes may be needed to fully resolve this problem.

Consider adding a comment explaining the timing mechanism for better code maintainability:

+	// Capture start time for accurate telemetry measurement
 	start := telemetry.Now()
 	defer telemetry.ModuleMeasureSince(types.ModuleName, start, telemetry.MetricKeyBeginBlocker)

To ensure we're not missing any other telemetry-related code that might need similar adjustments, let's search for other uses of telemetry.ModuleMeasureSince:

This will help us identify if similar changes are needed elsewhere in the codebase.

x/distribution/keeper/abci.go (1)

14-15: Improved telemetry measurement accuracy

The introduction of the start variable and its use in the deferred telemetry.ModuleMeasureSince call is a good improvement. This change ensures that the start time for the telemetry measurement is captured accurately at the beginning of the function execution.

Benefits:

  1. More precise timing measurements for the BeginBlocker function.
  2. Addresses the issue of potential multiple telemetry runs by explicitly defining the start time.
  3. Improves the overall accuracy of the collected metrics.

The change aligns well with the PR objective of simplifying and improving telemetry metrics.

telemetry/wrapper.go (2)

Line range hint 1-113: LGTM: Overall changes improve metric simplicity.

The changes in this file successfully simplify the metric collection by removing unnecessary constants. The code maintains good structure and documentation, adhering to the Golang style guide. These modifications align well with the PR objectives of simplifying collected metrics.


11-13: LGTM: Constant declarations simplified.

The removal of MetricKeyPrepareCheckStater and MetricKeyPrecommiter constants aligns with the PR objective of simplifying collected metrics. The remaining constants are still relevant and used in the file.

To ensure that the removed constants are not used elsewhere in the codebase, please run the following verification script:

If the script returns any results, those occurrences should be addressed.

x/epochs/keeper/abci.go (1)

14-15: Improved telemetry measurement accuracy

This change enhances the accuracy of telemetry measurements by capturing the start time at the beginning of the function execution. It addresses the issue mentioned in the PR objectives by ensuring that the time measurement is more precise.

The use of a descriptive variable name start and the deferred call to telemetry.ModuleMeasureSince align well with Go best practices and the Uber Golang style guide.

x/upgrade/keeper/abci.go (1)

Line range hint 1-114: Verify if this addresses the multiple telemetry executions issue

While the changes improve the telemetry timing accuracy, it's important to verify if this fully addresses the issue described in #21614 regarding multiple executions of telemetry in message handlers.

To ensure this change resolves the issue:

  1. Confirm that this PreBlocker function is called only once per block.
  2. Verify that no other telemetry measurements in this function or called functions could lead to multiple increments.

If additional changes are needed to fully resolve the issue, please update the PR description to clarify the scope of the changes.

✅ Verification successful
# No further actions needed.

Verification Successful: Multiple telemetry executions are not present in the PreBlocker function.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other telemetry calls in this file and related files
rg --type go -i 'telemetry' x/upgrade

Length of output: 709


Script:

#!/bin/bash
# Find all calls to PreBlocker in the codebase
rg --type go 'PreBlocker' .

# Check all telemetry.SetGaugeWithLabels calls for potential multiple executions
rg --type go 'telemetry.SetGaugeWithLabels' .

Length of output: 10606

docs/learn/advanced/09-telemetry.md (2)

16-16: LGTM: Clear and helpful version information added

The addition of the v2 endpoint information (port 1318) is a valuable update. It helps users of different versions to correctly access the metrics endpoint.


Line range hint 1-101: LGTM: Well-written and error-free documentation

The document is well-written and free of obvious spelling or grammatical errors. The language used is clear, concise, and appropriate for technical documentation.

x/bank/go.mod (1)

22-22: LGTM. Verify removal of direct usage of go-metrics in the bank module.

The change to mark github.com/hashicorp/go-metrics as an indirect dependency aligns with the PR objective of simplifying collected metrics. This suggests that direct usage of this metrics library has been removed from the bank module, which is a positive step.

To ensure this change is correct and complete, please run the following command to verify that there are no direct imports of go-metrics in the bank module:

If this command returns no results, it confirms that all direct usage has been removed as expected.

x/distribution/go.mod (1)

94-94: LGTM. Verify the impact on metrics collection.

The change from a direct to an indirect dependency for github.com/hashicorp/go-metrics aligns with the PR objective of simplifying collected metrics. This suggests that the module no longer directly uses this package, which is a step towards simplification.

To ensure this change doesn't negatively impact the module's functionality, please run the following verification script:

x/gov/keeper/abci.go (1)

23-24: Improved telemetry measurement implementation

The changes to the telemetry measurement are well-implemented and address the issue of multiple measurements. By capturing the start time once at the beginning of the function, we ensure more accurate duration measurements for the EndBlocker execution.

This change aligns with the PR objectives to simplify and correct the collected metrics, specifically addressing the issue of telemetry in message handlers running multiple times (as mentioned in issue #21614).

The implementation follows the Uber Golang style guide and is clean and concise.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (13)
x/group/keeper/abci.go (1)

13-16: LGTM: Telemetry measurement added correctly.

The addition of telemetry measurement to the EndBlocker method aligns well with the PR objectives to simplify collected metrics. The use of defer ensures that the duration is always measured, which is a good practice.

However, consider wrapping the return statement in a named return value to ensure the telemetry is recorded even when an error occurs:

func (k Keeper) EndBlocker(ctx context.Context) (err error) {
	start := telemetry.Now()
	defer telemetry.ModuleMeasureSince(types.ModuleName, start, telemetry.MetricKeyEndBlocker)

	if err = k.TallyProposalsAtVPEnd(ctx); err != nil {
		return
	}

	return k.PruneProposals(ctx)
}

This change ensures that the telemetry is always recorded, even when an error occurs in TallyProposalsAtVPEnd or PruneProposals.

x/slashing/abci.go (1)

16-17: Approve change with a minor suggestion for improvement

The modification improves the accuracy of telemetry measurements by capturing the start time before setting up the deferred function. This ensures consistent timing even if there's a delay in executing the deferred function.

Consider renaming the start variable to startTime for improved clarity:

-	start := telemetry.Now()
-	defer telemetry.ModuleMeasureSince(types.ModuleName, start, telemetry.MetricKeyBeginBlocker)
+	startTime := telemetry.Now()
+	defer telemetry.ModuleMeasureSince(types.ModuleName, startTime, telemetry.MetricKeyBeginBlocker)

This small change enhances code readability by making the variable's purpose more explicit.

x/distribution/keeper/abci.go (1)

Line range hint 1-43: Consider refactoring for improved modularity and address TODO

While the changes look good, I have a few suggestions for future improvements:

  1. The function seems to have multiple responsibilities (proposer setting, reward distribution, community pool updates). Consider refactoring it into smaller, more focused functions for better separation of concerns.

  2. There's a TODO comment mentioning Tendermint dependency. It would be helpful to clarify the plan for addressing this dependency and potentially create an issue to track this if it hasn't been done already.

  3. The community pool update occurs every 1000 blocks. Consider making this interval configurable or explaining why this specific interval was chosen.

These suggestions are not directly related to the current changes but could improve the overall code quality and maintainability in the future.

telemetry/wrapper.go (1)

Line range hint 16-106: Consider refactoring telemetry functions for improved accuracy.

While the functions in this file remain unchanged, considering the PR objectives and the linked issue #21614, it might be beneficial to review and potentially refactor these functions to address the problem of multiple executions during different transaction phases.

Consider the following improvements:

  1. Add a parameter to distinguish between checkTx and block finalization phases in relevant functions.
  2. Implement a mechanism to track and prevent duplicate metric increments across different transaction simulation modes.

For example, you could modify the ModuleMeasureSince function like this:

func ModuleMeasureSince(module string, start time.Time, phase string, keys ...string) {
    if !IsTelemetryEnabled() {
        return
    }

    labels := append(
        []metrics.Label{
            NewLabel(MetricLabelNameModule, module),
            NewLabel("phase", phase),
        },
        globalLabels...,
    )

    metrics.MeasureSinceWithLabels(keys, start.UTC(), labels)
}

This change would allow you to distinguish between different execution phases and potentially filter out duplicate measurements.

x/upgrade/keeper/abci.go (1)

Line range hint 1-124: Consider refactoring for improved readability and maintainability

While the changes made are good, the PreBlocker function is quite complex and long. Consider the following suggestions to improve its overall structure:

  1. Break down the function into smaller, more focused helper functions. This could improve readability and make the code easier to maintain.
  2. Use error wrapping consistently throughout the function to provide more context when returning errors.
  3. Consider using a switch statement instead of multiple if-else blocks for handling different scenarios.

These changes could make the code more aligned with the Uber Golang style guide's recommendations for function length and complexity.

docs/learn/advanced/09-telemetry.md (4)

16-16: Approved with a suggestion for clarity

The addition of information about the v2 port is helpful. To improve clarity, consider specifying which version uses which port.

Consider rephrasing as:

To query active metrics (see retention note above) you have to enable API server (`api.enabled = true` in the app.toml). Single API endpoint is exposed: `http://localhost:1317/metrics?format={text|prometheus}` for v1, or `http://localhost:1318/metrics?format={text|prometheus}` for v2, the default format being `text`.

86-101: Improved presentation of supported metrics

The reformatting of the supported metrics table significantly enhances readability. The clear presentation of metrics, their descriptions, units, and types is very helpful for users.

For consistency, consider capitalizing the first letter of each description in the table.


79-83: Valuable addition on metric idempotency

The new section on idempotency is a crucial addition that addresses the issue mentioned in the PR objectives. It clearly explains that metrics are not idempotent and will be counted multiple times if emitted more than once.

To enhance clarity, consider adding an example:

For example, if a transaction is processed in both `CheckTx` and `DeliverTx`, any metrics emitted during these processes will be counted twice.

Line range hint 1-101: Overall documentation improvements

The updates to this documentation significantly enhance its quality and accuracy. The changes address the issues mentioned in the PR objectives, particularly regarding metric accuracy and multiple counting. The new sections and reformatted content improve readability and provide crucial information for users.

To further improve the documentation, consider adding a section on best practices for implementing telemetry in custom modules. This could include guidelines on choosing appropriate metrics, avoiding high cardinality, and ensuring accurate metric emission in different transaction processing stages.

CHANGELOG.md (4)

Line range hint 1-3: Consider adding a table of contents

For improved navigation, especially in such a long changelog, consider adding a table of contents at the beginning of the file. This would allow readers to quickly jump to specific versions or sections of interest.


Line range hint 7-27: Consider consistent formatting for feature descriptions

For improved readability, consider using consistent formatting for all feature descriptions. For example, start each description with a verb (e.g., "Add", "Implement", "Introduce") and end with a period.


Line range hint 29-64: Consider adding pull request references consistently

For consistency and to provide more context, consider adding pull request references (e.g., #XXXX) for all improvements, not just some. This would allow readers to easily find more details about each change if needed.


Line range hint 66-79: Consider consistent formatting for bug fix descriptions

For improved readability and consistency, consider starting each bug fix description with a verb in the past tense (e.g., "Fixed", "Resolved", "Addressed") to clearly indicate that the issue has been resolved.

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 7fe95fc and e57ea4a.

📒 Files selected for processing (20)
  • CHANGELOG.md (1 hunks)
  • docs/learn/advanced/09-telemetry.md (2 hunks)
  • telemetry/wrapper.go (1 hunks)
  • x/bank/go.mod (1 hunks)
  • x/bank/keeper/msg_server.go (0 hunks)
  • x/bank/v2/keeper/handlers.go (0 hunks)
  • x/distribution/go.mod (1 hunks)
  • x/distribution/keeper/abci.go (1 hunks)
  • x/distribution/keeper/msg_server.go (0 hunks)
  • x/epochs/keeper/abci.go (1 hunks)
  • x/evidence/keeper/abci.go (1 hunks)
  • x/gov/keeper/abci.go (1 hunks)
  • x/group/keeper/abci.go (1 hunks)
  • x/mint/keeper/abci.go (1 hunks)
  • x/protocolpool/keeper/keeper.go (2 hunks)
  • x/slashing/abci.go (1 hunks)
  • x/staking/go.mod (1 hunks)
  • x/staking/keeper/abci.go (1 hunks)
  • x/staking/keeper/msg_server.go (0 hunks)
  • x/upgrade/keeper/abci.go (1 hunks)
💤 Files not reviewed due to no reviewable changes (4)
  • x/bank/keeper/msg_server.go
  • x/bank/v2/keeper/handlers.go
  • x/distribution/keeper/msg_server.go
  • x/staking/keeper/msg_server.go
✅ Files skipped from review due to trivial changes (3)
  • x/bank/go.mod
  • x/staking/go.mod
  • x/staking/keeper/abci.go
🧰 Additional context used
📓 Path-based instructions (12)
CHANGELOG.md (1)

Pattern **/*.md: "Assess the documentation for misspellings, grammatical errors, missing documentation and correctness"

docs/learn/advanced/09-telemetry.md (1)

Pattern **/*.md: "Assess the documentation for misspellings, grammatical errors, missing documentation and correctness"

telemetry/wrapper.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/distribution/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/epochs/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/evidence/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/gov/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/group/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/mint/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/protocolpool/keeper/keeper.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/slashing/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

x/upgrade/keeper/abci.go (1)

Pattern **/*.go: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

🔇 Additional comments (18)
x/group/keeper/abci.go (1)

5-8: LGTM: Import statements are correctly added and formatted.

The new import statements for types and telemetry packages are necessary for the added telemetry functionality. They follow the Uber Golang style guide recommendations and are grouped correctly.

x/mint/keeper/abci.go (1)

13-14: Improved accuracy of telemetry measurement

The changes improve the accuracy of the telemetry measurement by capturing the start time before the defer statement. This ensures that the duration measurement includes the time taken to set up the defer statement itself, resulting in more precise timing data.

The code adheres to the Uber Golang style guide with clear variable naming and appropriate use of the defer statement.

x/distribution/keeper/abci.go (1)

14-15: Approved: Improved telemetry measurement accuracy

The changes in these lines address the telemetry issue mentioned in the PR objectives. By capturing the start time before the deferred telemetry measurement, we ensure more accurate timing of the BeginBlocker execution. This modification aligns with the goal of simplifying and improving the collected metrics.

The change is minimal, doesn't affect the overall logic of the function, and follows the Uber Golang style guide by improving clarity.

x/evidence/keeper/abci.go (1)

16-17: Improved telemetry measurement accuracy

This change enhances the accuracy of telemetry measurement by capturing the start time at the beginning of the function. It addresses the issue of inaccurate metric values mentioned in #21614 by ensuring consistent timing regardless of how many times the message logic is processed.

Benefits:

  1. More precise timing measurement for the BeginBlocker function.
  2. Simplified and more readable defer statement.
  3. Aligns with the PR objective of simplifying collected metrics.

The change follows the Uber Golang style guide with clear variable naming and straightforward logic.

telemetry/wrapper.go (1)

11-13: Verify the impact of removed constants.

The constants MetricKeyPrepareCheckStater and MetricKeyPrecommiter have been removed. This change aligns with the PR objective of simplifying collected metrics. However, we should ensure that these constants are not used elsewhere in the codebase to avoid any potential issues.

Let's verify that the removed constants are not used elsewhere in the codebase:

If the script returns no results, it confirms that the removed constants are not used elsewhere in the codebase.

✅ Verification successful

Re-verifying the usage of removed constants to ensure they are not used elsewhere in the codebase.


Verification Complete: Removed Constants Are Unused

The constants MetricKeyPrepareCheckStater and MetricKeyPrecommiter are not used elsewhere in the codebase. The changes align with the PR objective of simplifying collected metrics without introducing any issues.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for usage of removed constants in the codebase

# Test: Search for usage of MetricKeyPrepareCheckStater
echo "Searching for MetricKeyPrepareCheckStater usage:"
rg --type go "MetricKeyPrepareCheckStater"

# Test: Search for usage of MetricKeyPrecommiter
echo "Searching for MetricKeyPrecommiter usage:"
rg --type go "MetricKeyPrecommiter"

Length of output: 278


Script:

#!/bin/bash
# Description: Comprehensive check for usage of removed constants in the codebase

# Search for MetricKeyPrepareCheckStater with word boundaries to avoid partial matches
echo "Searching for MetricKeyPrepareCheckStater usage:"
rg --type go --word "MetricKeyPrepareCheckStater"

# Search for MetricKeyPrecommiter with word boundaries to avoid partial matches
echo "Searching for MetricKeyPrecommiter usage:"
rg --type go --word "MetricKeyPrecommiter"

# Alternative search using grep for cross-verification
echo "Alternative search using grep for MetricKeyPrepareCheckStater:"
grep -r --include="*.go" -w "MetricKeyPrepareCheckStater" .

echo "Alternative search using grep for MetricKeyPrecommiter:"
grep -r --include="*.go" -w "MetricKeyPrecommiter" .

Length of output: 1537

x/epochs/keeper/abci.go (1)

14-15: Improved telemetry measurement accuracy

The change enhances the accuracy of the telemetry measurement by capturing the start time before the defer statement. This ensures that the time spent setting up the deferred function is not included in the measurement, providing a more precise duration for the BeginBlocker execution.

This modification aligns well with the PR objectives of simplifying and improving the accuracy of collected metrics. It also adheres to the Uber Golang style guide by using short variable declarations and maintaining clear, simple logic.

x/upgrade/keeper/abci.go (1)

22-23: Improved telemetry measurement accuracy

The change in telemetry measurement logic is an improvement. By capturing the start time immediately before the defer statement, we ensure more accurate timing of the PreBlocker execution. This aligns with the PR objective of simplifying and improving collected metrics.

docs/learn/advanced/09-telemetry.md (1)

26-27: Excellent improvement for metric accuracy

The use of telemetry.Now() for capturing the start time is a significant improvement. This change enhances the accuracy of metric measurement and aligns with the PR objective of simplifying and improving metric collection.

x/distribution/go.mod (1)

94-94: LGTM: Dependency moved to indirect.

The change to move github.com/hashicorp/go-metrics from direct to indirect dependencies aligns with the PR objective of simplifying collected metrics. This suggests that the package is no longer directly imported in the module's code, which is a step towards simplification.

To ensure this change doesn't introduce any issues, please run the following command to verify that there are no remaining direct imports of this package in the module's code:

✅ Verification successful

Verified: Dependency moved to indirect.

No direct imports of github.com/hashicorp/go-metrics found in the x/distribution module. This aligns with the PR objective of simplifying collected metrics.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining imports of github.com/hashicorp/go-metrics in the x/distribution module

# Search for import statements of github.com/hashicorp/go-metrics
rg --type go 'import.*github.com/hashicorp/go-metrics' x/distribution

# If no results are found, the command will exit with a non-zero status code, indicating success
if [ $? -ne 0 ]; then
    echo "No direct imports of github.com/hashicorp/go-metrics found in x/distribution module."
else
    echo "Warning: Found direct imports of github.com/hashicorp/go-metrics. Please review and remove them."
fi

Length of output: 268

x/gov/keeper/abci.go (1)

23-24: Improved telemetry measurement accuracy

The change improves the accuracy of telemetry measurement by capturing the start time at the beginning of the function execution. This ensures that the entire duration of the EndBlocker function is measured correctly.

The modification aligns well with the PR objective of simplifying collected metrics and adheres to the Uber Go Style Guide.

x/protocolpool/keeper/keeper.go (4)

17-17: LGTM: Telemetry import added correctly

The addition of the telemetry import is appropriate for the new functionality being introduced in the BeginBlocker method.


511-511: LGTM: Start time capture added correctly

The addition of start := telemetry.Now() at the beginning of the BeginBlocker method is correct for capturing the start time of the method's execution.


512-512: LGTM: Telemetry measurement added correctly

The deferred call to telemetry.ModuleMeasureSince is correctly implemented to measure the execution time of the BeginBlocker method. The use of defer ensures that the measurement will be taken regardless of how the method exits.


511-513: LGTM: Telemetry successfully added to BeginBlocker

The changes successfully add telemetry to the BeginBlocker method, improving the observability of its performance. The implementation is correct and follows best practices for adding telemetry in Go code.

CHANGELOG.md (4)

Line range hint 1-3: LGTM: Well-structured changelog

The changelog is well-organized and provides detailed information about changes in each version. This is helpful for users and developers to understand the evolution of the software.


Line range hint 7-27: LGTM: Comprehensive feature list

The features section provides a good overview of the new additions to the system. The grouping by components makes it easy to understand which areas of the software have been improved.


Line range hint 29-64: LGTM: Detailed improvements section

The improvements section provides a comprehensive list of enhancements across various components of the system. The inclusion of pull request references for some improvements is particularly helpful.


Line range hint 66-79: LGTM: Clear and informative bug fixes section

The bug fixes section provides a good overview of the issues that have been resolved. The grouping by components and brief descriptions help users understand the nature and impact of each fix.

x/slashing/abci.go Show resolved Hide resolved
@julienrbrt julienrbrt added this pull request to the merge queue Sep 28, 2024
Merged via the queue into main with commit 787ee69 Sep 28, 2024
90 of 91 checks passed
@julienrbrt julienrbrt deleted the julien/simplify-metrics branch September 28, 2024 11:26
mergify bot pushed a commit that referenced this pull request Sep 28, 2024
(cherry picked from commit 787ee69)

# Conflicts:
#	CHANGELOG.md
#	x/bank/v2/keeper/handlers.go
#	x/mint/keeper/abci.go
julienrbrt added a commit that referenced this pull request Sep 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: telemetry in msg handlers runs multiple times leading to unexpected metric values