-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MINOR] Fix usages of orElse #10435
[MINOR] Fix usages of orElse #10435
Conversation
@@ -1016,7 +1016,7 @@ private List<String> getInstantsToRollbackForLazyCleanPolicy(HoodieTableMetaClie | |||
@Deprecated | |||
public boolean rollback(final String commitInstantTime, Option<HoodiePendingRollbackInfo> pendingRollbackInfo, boolean skipLocking) throws HoodieRollbackException { | |||
final String rollbackInstantTime = pendingRollbackInfo.map(entry -> entry.getRollbackInstant().getTimestamp()) | |||
.orElse(createNewInstantTime(!skipLocking)); | |||
.orElseGet(() -> createNewInstantTime(!skipLocking)); | |||
return rollback(commitInstantTime, pendingRollbackInfo, rollbackInstantTime, skipLocking); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so #orElseGet
is always preferrable than #orElse
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, you do not want to execute methods or create objects you will not use. Therefore you can use orElse
when returning a constant but otherwise you should avoid it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
@@ -1016,7 +1016,7 @@ private List<String> getInstantsToRollbackForLazyCleanPolicy(HoodieTableMetaClie | |||
@Deprecated | |||
public boolean rollback(final String commitInstantTime, Option<HoodiePendingRollbackInfo> pendingRollbackInfo, boolean skipLocking) throws HoodieRollbackException { | |||
final String rollbackInstantTime = pendingRollbackInfo.map(entry -> entry.getRollbackInstant().getTimestamp()) | |||
.orElse(createNewInstantTime(!skipLocking)); | |||
.orElseGet(() -> createNewInstantTime(!skipLocking)); | |||
return rollback(commitInstantTime, pendingRollbackInfo, rollbackInstantTime, skipLocking); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
@@ -107,23 +107,19 @@ object HoodieSparkUtils extends SparkAdapterSupport with SparkVersionsSupport wi | |||
// injecting [[SQLConf]], which by default isn't propagated by Spark to the executor(s). | |||
// [[SQLConf]] is required by [[AvroSerializer]] | |||
injectSQLConf(df.queryExecution.toRdd.mapPartitions { rows => | |||
if (rows.isEmpty) { | |||
Iterator.empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does removal of this provide any benefit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this will trigger the dag to see if it is in fact returning an empty set of rows. You can see this on the spark UI when running your jobs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking at the wrong place. This is the spot that needed to be updated: https://github.com/apache/hudi/pull/10435/files#diff-21ccee08cebace9de801e104f356bf4333c017d5d5c0cac4e3de63c7861f3c13R100
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
InputBatch inputBatch = readFromSource(instantTime, metaClient); | ||
LOG.error("Time to read from source : " + (System.currentTimeMillis() - startInput)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use HoodieTimer
to track execution time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was from my own debugging of slow tests. I will revert
@@ -448,7 +450,9 @@ public Pair<Option<String>, JavaRDD<WriteStatus>> syncOnce() throws IOException | |||
} | |||
} | |||
|
|||
long startWrite = System.currentTimeMillis(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar here on using HoodieTimer
and below.
hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Change Logs
There are places in the codebase where the code uses
orElse
when it should beorElseGet
to avoid running some computation or creation of unnecessary objects.Impact
Reduce amount of objects created and computation done
Risk level (write none, low medium or high below)
none
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change
ticket number here and follow the instruction to make
changes to the website.
Contributor's checklist