All commands timeout after a couple hours #16
Comments
More details from today. I had an app running overnight which successfully renewed leases all evening. This morning I tried to shut the app down, which revokes the user. This was not successful. Here is the end of the log.
From this point forward I am no longer able to create new credentials. |
We have what appears to be the same issue. Additionally, we cannot revoke the lease no matter what we try (force doesn't help). |
I've been working with @gerrat on this problem. We updated to Vault 1.1.2 and version 0.1.5 of the plugin, and it still did not help with revoking the lease. At this point we think we've pinned the bug down as a permissions issue for the backend connection's user: we were able to select sessions but not revoke them. The connection user needs to be able to execute both of these:
It's the second one we don't seem to have. |
I'm going to hazard the following guess at what happens in the plugin, though I don't have the specific setup needed to reproduce. I see that in the After that it attempts to disconnect the session. With our permissions setup it happily selects the sessions, and then gets a permission denied when trying to revoke. At that point it either never returns, or throws an error without releasing the lock. I'm betting the second is the case. If I'm correct, this error only presents when you don't have the So the workaround would be to:
|
The defer statement means it releases the lock whenever the function returns, whether success or not. |
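The point about defer can be illustrated with a minimal sketch. The `backend` type and `revoke` method below are hypothetical stand-ins, not the plugin's actual code; the sketch only shows that a deferred `Unlock` runs even when the function returns an error.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// backend is a hypothetical stand-in for a plugin type guarded by a mutex.
type backend struct{ mu sync.Mutex }

// revoke takes the lock and defers the unlock, so the lock is released
// on every return path, including the error path.
func (b *backend) revoke(fail bool) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	if fail {
		return errors.New("permission denied")
	}
	return nil
}

func main() {
	b := &backend{}
	fmt.Println("revoke error:", b.revoke(true))
	// TryLock (Go 1.18+) succeeds only if the failed call released the mutex.
	fmt.Println("lock free afterwards:", b.mu.TryLock())
}
```

So a lock taken this way can only stay held if the function never returns at all, for example because a call inside it blocks indefinitely.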
I've been learning my way through Go syntax and capabilities over the past few days and I saw that pretty quickly, though I wasn't sure exactly how it behaved in this context. Thanks for the confirmation. We've still got this problem, even with what appear to be correct permissions at this point. We definitely need to restart Vault after disconnecting sessions, so the lock is not being released. If the only way it doesn't release is if the function doesn't exit, then I'm led to believe that the Oracle client is never returning. Perhaps this is a mismatch in the Oracle client version... I realize we've been using the pre-built Linux binary (with version 12.2?) and our Vault server has the 12.1 client installed. @jefferai I can't find anything that definitively says what client the pre-built binary uses. The README suggests that it is 11.2, and the build script looks like it is 12.2. Can you confirm that it is the 12.2 Oracle client (at least for v0.1.5 of the plugin)? @jjathman have you been using the pre-built binary, or did you build the plugin against the exact version your Vault server is using? |
What makes you so sure that the issue is the lock? We've seen similar types of errors with configurations either on a firewall or the third party server (in this case Oracle) being configured to drop a connection that seems idle. Especially in the case of a firewall, dropping often means black-holing traffic rather than rejecting a connection outright (as once it stops tracking the connection it treats it as any other unauthorized connection, which often means black-holing). This can lead the client to keep sending packets that will never go anywhere and keep retrying when it gets no response, which causes the client to essentially hang. You could also reload the plugin rather than restarting Vault entirely. |
I've been thinking the lock makes sense as an explanation for why no other transaction is possible after the initial attempt, even though we see vault re-trying the revocation. We enabled some auditing on the database, and the only action done by the connection user is the initial Your explanation could make sense as well, though there shouldn't be an idle timeout between the I don't actually think the lock is the problem, just a symptom of the problem. I'm continuing to dig into this, and I'll let you know how it goes, but it's slow going since I'm not familiar with go... Your help is much appreciated. Thanks! |
I've pinned our version of this down. I added a bunch of logging in to see how far through I get, and I can see it claim the lock, execute the session I hacked together a single run-through which assumed exactly one session existed, and instead of deferring So it appears the client won't accept another call until the rows have been closed. I've verified this with 12.1 and 12.2. |
After reading up on the database package, I found an explanation here that shows how connections should work during a loop like the one we're using:
rows, err := db.Query("select * from tbl1") // Uses connection 1
for rows.Next() {
    err = rows.Scan(&myvariable)
    // The following line will NOT use connection 1, which is already in-use
    db.Query("select * from tbl2 where id = ?", myvariable)
}
However, in this case, prior to entering But, the default value for max_open_connections is 2 Updating our configuration to use Is there a place that would be good to document that this plugin requires a higher value to be set in order to revoke active sessions? Is the README here sufficient? |
If the default |
I've opened a ticket upstream. The sdk changes will need to be pulled in here once that is addressed. Documentation here will still help in the meantime. |
Just checking back in, is this issue fixed now? I see there was a release in November. Sorry I haven't had an opportunity to try this out myself yet. |
Provided you are running a version of vault >= 1.2.0, you should no longer need to set Assuming that is the root cause of your timeouts, I believe this to be fixed upstream. |
OK great, that matches my expectations as well. I will just close this for now then. I can always reopen it if I see there is an issue. |
I'm just starting to use Vault with the Oracle DB plugin. I'm able to get it configured and working correctly. Vault can create new users, I'm able to renew the leases for those users, and I am able to revoke them.
However, after a couple hours all commands that interact with the Oracle plugin start timing out and failing with RPC errors. The Oracle plugin process is still running. I'm not sure if the issue is within the vault plugin process, or vault itself.
Hoping for some help troubleshooting what the issue could be. I don't see anything in the logs to indicate there is a problem.
Restarting vault (and the plugin process) always immediately solves the problem.
Vault version 1.0.1
Plugin version 0.1.4