Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(db/trans-cache): optimize for bloomFilter initialization #5394

Merged

Conversation

halibobo1205
Copy link
Contributor

close #5355

@codecov-commenter
Copy link

Codecov Report

Merging #5394 (eab8d2c) into develop (10223cc) will increase coverage by 0.04%.
The diff coverage is 83.03%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@              Coverage Diff              @@
##             develop    #5394      +/-   ##
=============================================
+ Coverage      60.89%   60.94%   +0.04%     
- Complexity      9231     9246      +15     
=============================================
  Files            839      839              
  Lines          50029    50140     +111     
  Branches        5574     5575       +1     
=============================================
+ Hits           30467    30558      +91     
- Misses         17176    17194      +18     
- Partials        2386     2388       +2     
Files Changed Coverage Δ
.../main/java/org/tron/core/db2/common/TxCacheDB.java 79.90% <83.03%> (-3.59%) ⬇️

... and 3 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@halibobo1205
Copy link
Contributor Author

@wangzichichi PTAL, https://buildkite.com/tronprotocol/build-on-debian-9-dot-8/builds/4848

throw new RuntimeException(e);
}
try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(file,
StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that OS cache flushing to the disk is not immediate, is it possible that the property file flushed successfully but partly of bloomFilters is still in the OS cache(then corrupt when OS crashed)?

Add another option StandardOpenOption.SYNC? Or write all data into one file and make sure the metadata at the end also can achieve the atomic, but this will introduce complexity

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that OS cache flushing to the disk is not immediate, is it possible that the property file flushed successfully but partly of bloomFilters is still in the OS cache(then corrupt when OS crashed)?

Yeah, it's gonna happen. But it doesn't seem to be a problem, it will rollback to previous mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add another option StandardOpenOption.SYNC? Or write all data into one file and make sure the metadata at the end also can achieve the atomic, but this will introduce complexity

This PR is a helper feature. StandardOpenOption.SYNC is good , But it may slow down the closing procedure if you're so concerned about closing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How long does StandardOpenOption.SYNC take?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

300ms -> 1.2 s for ssd

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

700ms -> +15s for non-ssd, StandardOpenOption.SYNC is not a good choice!


private boolean recovery(int index, Path file) {
try (InputStream in = new BufferedInputStream(Files.newInputStream(file,
StandardOpenOption.READ, StandardOpenOption.DELETE_ON_CLOSE))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DELETE_ON_CLOSE means the bloomFilters files only work once, the situation that node recovery finished but shutdown immediately is ignored, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kill -9 or kill -15 ? if kill -15 , bloomFilters will be dump again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the situation that node recovery finished but shutdown immediately is ignored, right?

Sorry, i've resubmitted it to avoid dump wrong cache, please recheck it.

@@ -42,6 +56,7 @@ public class TxCacheDB implements DB<byte[], byte[]>, Flusher {
private BloomFilter<byte[]>[] bloomFilters = new BloomFilter[2];
// filterStartBlock record the start block of the active filter
private volatile long filterStartBlock = INVALID_BLOCK;
private volatile long currentBlockNum = INVALID_BLOCK;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A check for currentBlockNum with block header is needed?

Node shutdown using Kill -9 may result in DBS being inconsistent, so this PR only works with a graceful shutdown? Just for a confirm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A check for currentBlockNum with block header is needed?

Good question, but how might it be inconsistent? Yes ,a graceful shutdown is required

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two questions, with no relation. A check just for a double check, seems it introduces no more benefit.

how might it be inconsistent

The whole shutdown process is complicated, still thinking

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the empty blocks will make different, a check is not required.

@halibobo1205 halibobo1205 self-assigned this Aug 10, 2023
@@ -110,6 +125,9 @@ private void initCache() {
}

public void init() {
if (recovery()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Recovery failure will affect the initialization logic
  2. Some exceptions are caught, and some are not. Why is this happening
  3. Logically speaking, as long as there is an exception, it is a failure, so there is no need to handle those exceptions, so you only need to handle the exception at the outermost layer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!

"txCache.properties");
Map<String, String> properties = readProperties(txCacheProperties);

if (properties == null || properties.size() != 3) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If properties are read successfully but bloom fails, will there be any problem?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If new fields are added to the properties database in the future, it will have an impact

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If properties are read successfully but bloom fails, will there be any problem?

It will fail and fallback to the original loading mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If new fields are added to the properties database in the future, it will have an impact

Wouldn't consider it for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!

@halibobo1205 halibobo1205 requested a review from jwrct August 14, 2023 11:55
@halibobo1205 halibobo1205 force-pushed the feat/bloom_flter_init_opt branch from 45840fc to 8566f5a Compare August 14, 2023 17:38
@halibobo1205 halibobo1205 force-pushed the feat/bloom_flter_init_opt branch from 8566f5a to 7d1eac9 Compare August 14, 2023 17:40
@halibobo1205 halibobo1205 merged commit f20e11b into tronprotocol:develop Aug 21, 2023
@halibobo1205 halibobo1205 changed the title feat(db): optimize for bloomFilter initialization feat(db/trans-cache): optimize for bloomFilter initialization Oct 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make the transaction-cache initialization faster
5 participants