-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added checks for checking test containers running or not to avoid test flakeness #2303
Conversation
3644ead
to
fca5843
Compare
} | ||
|
||
// Store the DB connection for reuse | ||
ms.db = db |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we removing the db
field? Can we still keep it and get it from pingDatabase
so that we don't need to open it again?
Also, I think pingDatabase
with #retries in the Start()
function might not be useful that much because if the db
is connected in the starting there is still some chance of failure in further Functions/Queries. So, maybe we need a retry at that level?
what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think pingDatabase with #retries in the Start() function might not be useful
Yes thats why if you check ExecuteSqls()
there i am creating a new connection/db everytime.
The failures we saw were like - unable to connect...
So i thought we create a new connection at that moment then it will helpful.
Previously we were creating and storing the connection pool sql.DB
which might go bad after some usage.
Btw this PR changes i have rerun the tests numerous times, didn't get the errors we used to. but still not 100% sure if it can reoccur or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand, and that's my point. Since this pingDatabase
is not required in Start(),
we can remove it and instead have similar logic in the other functions of each container. For example, when we run queries on the database using a connection, we should have some retry logic so that if there is a failure after initial queries, we can retry, and it won't fail directly.
what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added step to ping database
in Start()
just to ensure that the database container is not just created or up, it is also running properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the retry logic - we can have that.
But i think for that we should do that based on certain error messages, not for every error. So will need to figure out the list of error then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, okay. For now, I think we can go ahead with this change and revisit the retry logic in the container functions if we start facing intermittent issues in that path.
// It retries for a few times with a delay before giving up. | ||
func pingDatabase(driver string, connStr string) error { | ||
var err error | ||
maxRetries := 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we increase the retries to 5?
utils.ErrExit("mysql container is not started: nil") | ||
} | ||
|
||
db, err := sql.Open("mysql", ms.GetConnectionString()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you keep the DB in the Container struct, you can try it here first. If db Exec is working, then retry the connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
} | ||
|
||
// Store the DB connection for reuse | ||
ms.db = db |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, okay. For now, I think we can go ahead with this change and revisit the retry logic in the container functions if we start facing intermittent issues in that path.
Describe the changes in this pull request
Describe if there are any user-facing changes
How was this pull request tested?
Does your PR have changes that can cause upgrade issues?