-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(db/trans-cache): optimize for bloomFilter initialization #5394
feat(db/trans-cache): optimize for bloomFilter initialization #5394
Conversation
Codecov Report
❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more. @@ Coverage Diff @@
## develop #5394 +/- ##
=============================================
+ Coverage 60.89% 60.94% +0.04%
- Complexity 9231 9246 +15
=============================================
Files 839 839
Lines 50029 50140 +111
Branches 5574 5575 +1
=============================================
+ Hits 30467 30558 +91
- Misses 17176 17194 +18
- Partials 2386 2388 +2
... and 3 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
throw new RuntimeException(e); | ||
} | ||
try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(file, | ||
StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that OS cache flushing to the disk is not immediate, is it possible that the property file flushed successfully but partly of bloomFilters is still in the OS cache(then corrupt when OS crashed)?
Add another option StandardOpenOption.SYNC
? Or write all data into one file and make sure the metadata at the end also can achieve the atomic, but this will introduce complexity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that OS cache flushing to the disk is not immediate, is it possible that the property file flushed successfully but partly of bloomFilters is still in the OS cache(then corrupt when OS crashed)?
Yeah, it's gonna happen. But it doesn't seem to be a problem, it will rollback to previous mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add another option StandardOpenOption.SYNC? Or write all data into one file and make sure the metadata at the end also can achieve the atomic, but this will introduce complexity
This PR is a helper feature. StandardOpenOption.SYNC
is good , But it may slow down the closing procedure if you're so concerned about closing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long does StandardOpenOption.SYNC
take?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
300ms -> 1.2 s for ssd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
700ms -> +15s for non-ssd, StandardOpenOption.SYNC
is not a good choice!
|
||
private boolean recovery(int index, Path file) { | ||
try (InputStream in = new BufferedInputStream(Files.newInputStream(file, | ||
StandardOpenOption.READ, StandardOpenOption.DELETE_ON_CLOSE))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DELETE_ON_CLOSE
means the bloomFilters files only work once, the situation that node recovery finished but shutdown immediately is ignored, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kill -9
or kill -15
? if kill -15
, bloomFilters will be dump again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the situation that node recovery finished but shutdown immediately is ignored, right?
Sorry, i've resubmitted it to avoid dump wrong cache, please recheck it.
@@ -42,6 +56,7 @@ public class TxCacheDB implements DB<byte[], byte[]>, Flusher { | |||
private BloomFilter<byte[]>[] bloomFilters = new BloomFilter[2]; | |||
// filterStartBlock record the start block of the active filter | |||
private volatile long filterStartBlock = INVALID_BLOCK; | |||
private volatile long currentBlockNum = INVALID_BLOCK; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A check for currentBlockNum
with block header
is needed?
Node shutdown using Kill -9
may result in DBS being inconsistent, so this PR only works with a graceful shutdown? Just for a confirm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A check for currentBlockNum with block header is needed?
Good question, but how might it be inconsistent? Yes ,a graceful shutdown is required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two questions, with no relation. A check
just for a double check, seems it introduces no more benefit.
how might it be inconsistent
The whole shutdown process is complicated, still thinking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the empty blocks will make different, a check is not required.
chainbase/src/main/java/org/tron/core/db2/common/TxCacheDB.java
Outdated
Show resolved
Hide resolved
@@ -110,6 +125,9 @@ private void initCache() { | |||
} | |||
|
|||
public void init() { | |||
if (recovery()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Recovery failure will affect the initialization logic
- Some exceptions are caught, and some are not. Why is this happening
- Logically speaking, as long as there is an exception, it is a failure, so there is no need to handle those exceptions, so you only need to handle the exception at the outermost layer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated!
"txCache.properties"); | ||
Map<String, String> properties = readProperties(txCacheProperties); | ||
|
||
if (properties == null || properties.size() != 3) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If properties are read successfully but bloom fails, will there be any problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If new fields are added to the properties database in the future, it will have an impact
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If properties are read successfully but bloom fails, will there be any problem?
It will fail and fallback to the original loading mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If new fields are added to the properties database in the future, it will have an impact
Wouldn't consider it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated!
45840fc
to
8566f5a
Compare
8566f5a
to
7d1eac9
Compare
close #5355