Fix mysql sink may be blocked by network io wait #825
/run-integration-tests
Codecov Report
@@               Coverage Diff                @@
##             master       #825        +/-   ##
================================================
- Coverage   32.4760%   32.4663%   -0.0098%
================================================
  Files            97         97
  Lines         10848      10765        -83
================================================
- Hits           3523       3495        -28
+ Misses         6991       6934        -57
- Partials        334        336         +2
/run-integration-tests
cdc/sink/mysql.go
Outdated
for _, w := range s.workers {
	w.waitAllTxnsExecuted()
}
done <- struct{}{}
Suggested change:
- done <- struct{}{}
+ close(done)
cdc/sink/mysql.go
Outdated
s.notifier.Notify()
for _, w := range s.workers {
	w.waitAllTxnsExecuted()
done := make(chan struct{}, 1)
Suggested change:
- done := make(chan struct{}, 1)
+ done := make(chan struct{})
/merge
@leoppro Oops! This PR requires at least 2 LGTMs to merge. The current number of
/run-integration-tests
/lgtm
/merge
/merge
/run-all-tests
/merge
/merge
/run-all-tests
What problem does this PR solve?
In a test where the TiDB cluster was abnormal, TiCDC executed DMLs with some errors, but one of the goroutines hung and blocked the whole sink.
What is changed and how it works?
Use a workaround so that a goroutine stuck in a network I/O wait cannot
prevent the other goroutines from exiting. Because the network I/O wait is
blocked in kernel code, the goroutine is effectively in a D-state
(uninterruptible), so it cannot be stopped even by canceling the context.
If this scenario happens, the blocked goroutine will leak.
Check List
Tests
Release note