-
Notifications
You must be signed in to change notification settings - Fork 342
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
External Authentication Providers Support for Cody #6526
Conversation
068277d
to
5bf3cd4
Compare
9fc4bcf
to
129682b
Compare
@MaedahBatool FYI this PR description and test plan are gold for the documentation we'll need to write for this feature. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will send more feedback but here's a first packet so you don't have to wait any longer. Sorry for the slow reply on this! Looks great but I am keen to read every detail.
agent/src/agent.ts
Outdated
@@ -1500,7 +1499,7 @@ export class Agent extends MessageHandler implements ExtensionClient { | |||
}, | |||
auth: { | |||
serverEndpoint: config.serverEndpoint, | |||
accessToken: config.accessToken ?? null, | |||
accessTokenOrHeaders: config.accessToken || null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This treats empty strings differently, is that relevant? (Why did we ??
in the first place?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I does not really matter, I'm not sure why I changed it, reverted now.
lib/shared/src/configuration.ts
Outdated
tokenSource?: TokenSource | undefined | ||
accessTokenOrHeaders: string | AuthHeaders | null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can structure this better.
Does having headers versus a token imply something about the tokenSource
?
If we only have token sources for tokens, let's make a sub object with { accessToken: string, tokenSource: TokenSource }
which gets rid of the subtle duplication of accessToken
's | null
and tokenSource
's | undefined
(...do we ever have a token without a source?)
If we can have TokenSource for headers, then let's consider renaming tokenSource
/TokenSource
, maybe something like source: CredentialSource
.
This is AuthCredentials
, so can we simply call the important field credentials
?
So maybe something like:
export interface AuthCredentials {
serverEndpoint: string
credentials: HeaderCredential | TokenCredential | undefined
}
export interface HeaderCredential {
headers: Record<string, string>,
}
export interface TokenCredential {
token: string,
source: TokenSource,
}
Then the call sites will read something like:
if (auth.credentials) {
... ok we have some creds of some kind ...
}
and when we get down to the business, then auth.credentials.token
and so on...
We could even add type: 'headers'
, type: 'token'
if there's a lot of dispatch on the credential type so you can use switch
if that makes things readable.
⭐ Do header credentials need an expiry? IIRC the script was going to give us a time to use the credentials for...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea regarding interface! It will be cleaner indeed.
As for credentials expiry... that is a good question.
I can add required expiration field to the JSON which should be returned by a command, but I'm not sure what would be best way to invalidate it. Just spawn some worker which will wait until exp time and then force auth refresh? That could work, although does not sound too elegant.
Alternative would be to automatically refresh auth on 403
error from the server, but it would also require some exponential backoff and it would be not up to the spec anyway.
If you don't mind I created https://linear.app/sourcegraph/issue/CODY-4663/add-support-for-the-token-expiration-date to address it in a followup to not dump everything in one huge PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored interface to the one you suggested.
lib/shared/src/configuration.ts
Outdated
@@ -166,6 +182,9 @@ interface RawClientConfiguration { | |||
*/ | |||
overrideServerEndpoint?: string | undefined | |||
overrideAuthToken?: string | undefined | |||
|
|||
// External auth providers | |||
authExternalProviders?: ExternalAuthProvider[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need to be optional if we could just make it []
by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to be just []
lib/shared/src/configuration.ts
Outdated
@@ -166,6 +182,9 @@ interface RawClientConfiguration { | |||
*/ | |||
overrideServerEndpoint?: string | undefined | |||
overrideAuthToken?: string | undefined | |||
|
|||
// External auth providers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think this comment is adding anything that isn't in the field and type name, so remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
env: cmd.environment ? { ...process.env, ...cmd.environment } : process.env, | ||
} | ||
|
||
const { stdout, stderr } = await execAsync(command, options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test plan talks about timeouts, where is that happening?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here:
const options = {
...cmd,
It works because my exec is matching node exec API, including option names. But it is hard to spot, if you think it makes sense I can just reassign all options directly so it will be clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made it explicit.
vscode/package.json
Outdated
}, | ||
"timeout": { | ||
"type": "number", | ||
"description": "Timeout for executing the command in miliseconds." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"description": "Timeout for executing the command in miliseconds." | |
"description": "Timeout for executing the command in milliseconds." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I've read the whole thing:
- We should not store these creds on disk, and it would be great if the types, and runtime made it really clear that won't happen.
- We can make that AuthCredentials type much cleaner although you don't need to hew to my exact suggestion of course. There's probably something more elegant you can come up with.
- Let's think about future directions. Like what if the script wants to return instructions to us about credential handling? Could we put the headers in a field,
headers
, so there's room to dump other things in the JSON output in future? - What about credential expiries?
- In practice, if the proxy kicks you out, how good are we at echoing error message status text from the proxy? Fine if we look at that in a separate PR but I'm curious what you found so far.
} | ||
|
||
const { stdout, stderr } = await execAsync(command, options) | ||
if (stderr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use exit codes, not error output, to measure success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
clientConfiguration.overrideServerEndpoint || endpoint | ||
) | ||
|
||
// We must not throw here, because that would result in the `resolvedConfig` observable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you're just moving this, but the situation is a bit 😱
The comment says we must not throw, but we will throw if we end up on line 136 via line 150. No need to address it here but I'm interested in what you think about this.
accessTokenOrHeaders = authHeaders | ||
tokenSource = 'custom-auth-provider' | ||
} else { | ||
accessTokenOrHeaders = (await loadTokenFn()) || null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Client does not want us to ever persist these tokens, by the way. Anything we can do at the type level to prove that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, but I will add some tests to prove it works.
@@ -39,10 +40,9 @@ export class SourcegraphBrowserCompletionsClient extends SourcegraphCompletionsC | |||
...requestParams.customHeaders, | |||
} as HeadersInit) | |||
addCodyClientIdentificationHeaders(headersInstance) | |||
addAuthHeaders(config.auth, headersInstance, url) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! 💯
await secretStorage.storeToken( | ||
serverEndpoint, | ||
credentials.accessToken, | ||
credentials.accessTokenOrHeaders, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The particular customer for the first iteration of this feature wants us to never store these creds on disk.
It would be great to use types somehow to make a credible claim about that.
Maybe we could put something that isn't JSON serializable, like a Symbol, as the field value and store a WeakMap from Symbol -> actual creds which we look up when we're writing headers. That would be a pretty compelling runtime proof that we don't store the credential in the access point history, because the credential simply isn't in this structure and even its placeholder can't be written to disk. (...although it wouldn't say anything about whether the network stack is being logged to disk or whatever...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added test for a serialization.
I wan't able to prevent it using constructs with a symbol, but I ended up with:
export interface HeaderCredential {
// We use function instead of property to prevent accidential top level serialization - we never want to store this data
getHeaders(): Record<string, string>
expiration: number | undefined
}
This way if someone will try to serialise HeaderCredential
it will only serialise function name, but never the content.
68dd847
to
00022d1
Compare
00022d1
to
1c4e54a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions inline.
agent/scripts/reverse-proxy.py
Outdated
# Use value of 'Authorization: Bearer' to fill 'X-Forwarded-User' and remove 'Authorization' header | ||
if 'Authorization' in headers: | ||
values = headers['Authorization'].split() | ||
if values and values[0] == 'Bearer': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can still fail if the header is just "Bearer", and it truncates the value if it contains spaces. Simpler to just use a Regex:
m = re.match('Bearer (.*)', headers['Authorization'])
if m:
headers['X-Forwarded-User'] = m.group(1)
|
||
# Reset the Host header to use target server host instead of the proxy host | ||
if 'Host' in headers: | ||
headers['Host'] = urlparse(target_url).netloc.split(':')[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ancient URL archaeology but with credentials the netloc is user:pass@host:port
, but I guess we don't care about that...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, since this is solely for testing purposes I do not think we care too much
agent/scripts/reverse-proxy.py
Outdated
|
||
|
||
if __name__ == '__main__': | ||
print('Usage: python reverse_proxy.py [target_url] [proxy_port]') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth printing here that target_url
should (probably) not end in a slash?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using argparse, it is only slightly longer and more readable:
import argparse
...
parser = argparse.ArgumentParser(description='External auth provider test proxy server')
parser.add_argument('target_url', help='Target Sourcegraph instance URL to proxy to')
parser.add_argument('proxy_port', type=int, nargs='?', default=8000,
help='Port for the proxy server (default: %(default)s)')
args = parser.parse_args()
target_url = args.target_url
port = args.proxy_port
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great suggestion regarding argparse, it looks much better with it!
As for target_url
I will just trim it instead of forcing user to provide it in a narrowly defined format (but you are right it would cause an issue if someone would try with '/' at the end...)
@@ -0,0 +1,68 @@ | |||
from aiohttp import web, ClientSession |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a shebang and make it executable.
Add a brief comment or module doc comment explaining that this demonstrates using external authentication providers. Link to the docs for the X-Forwarded-User configuration would be great. And gentle disclaimer that it is for testing or demoing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done :)
@@ -0,0 +1,97 @@ | |||
//@ts-nocheck |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
{ | ||
authExternalProviders: [ | ||
{ | ||
endpoint: 'my-server.com', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this realistic? Endpoints aren't hosts, but they're URLs like https://sourcegraph.com, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point it's just a string, but I will change it to something more realistic :)
endpoint: 'my-server.com', | ||
executable: { | ||
commandLine: [ | ||
'echo \'{ "headers": { "Authorization": "token X" }, "expiration": 1337 }\'', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very neat lightweight external auth provider :)
|
||
const command = cmd.commandLine.join(' ') | ||
|
||
// No need to check error code, promisify causes exec to throw in case of errors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth cody logging stderr to help debugging problems with the provider? Or does execAsync
do anything helpful here like bundle up stderr into the rejection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it does contain the error:
█ resolveConfiguration Error resolving configuration: Error: Failed to execute external auth command: Error: Command failed: eco '{ "headers": { "Authorization": "Bearer some_user" }, "expiration": 1736851996 }'
/bin/bash: eco: command not found
shell: cmd.shell, | ||
timeout: cmd.timeout, | ||
windowsHide: cmd.windowsHide, | ||
env: cmd.environment ? { ...process.env, ...cmd.environment } : process.env, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can splice from undefined, null, etc. without problems (at least in JavaScript... if TypeScript complains I guess you could (cmd.environment || {})
.)
env: cmd.environment ? { ...process.env, ...cmd.environment } : process.env, | |
env: { ...process.env, ...cmd.environment }, |
d7862ea
to
1964f05
Compare
1964f05
to
92c47ec
Compare
Fixes https://linear.app/sourcegraph/issue/CODY-4648 ## Changes This PR improves error reporting if auth configuration is broken. ![image](https://github.com/user-attachments/assets/96766d33-6070-4bf0-95ef-82a350a9a16d) ## Test plan 1. Configure custom auth provider as described in #6526 2. Change config to be invalid, e.g. in the `commandLine` change `echo` to `echox` and save a config. 3. You should get logged out and see a detailed error message describing what is wrong. 4. Fix the issue and save the config, error should disappear and you should be logged in back.
) ## Changes That PR eliminates annoying chat reload upon an token refresh if external auth provider is used. It does so by splitting external auth provider evaluation into two steps which happens at different stages: config resolution and http request building. At config resolution we only evaluate config itself and prepare a function which will be later used to generate auth headers later. That function (`getHeaders`) is then used to obtain auth headers every time we are doing an authorised http request. Normally headers are cached, but if they expire `getHeaders` should internally refresh the cache and return an updated headers. If updated headers are still expired error will be shown to the user. In opposition to the previous solution errors in executing external provider command, or token expiration, does not lead to the config invalidation - config always remains the same unless changed by the user. The only thing which changes is result of `getHeaders` functions. Errors from `getHeaders` are processed and handled in the same way as e.g. `Network Error` or `Invalid Access Token`, so we do not need custom code to handle them. ![image](https://github.com/user-attachments/assets/86298a35-3aa6-4b87-9fd2-edb1689e08d2) ![image](https://github.com/user-attachments/assets/3b853a14-a6f7-44af-a50e-2d3e6db6f24f) ## Test plan **Setup** Set `cody.auth.externalProviders` config to use provider which support credentials expiration. You can use configuration shown bellow. Please also make sure you have proxy running and your sourcegraph server have http auth proxy enabled, as described in #6526. ```json { "cody.auth.externalProviders": [ { "endpoint": "http://localhost:7777", "executable": { "commandLine": ["/Users/pkukielka/Work/sourcegraph/cody2/agent/scripts/simple-external-auth-provider.py"], "shell": "/bin/bash", } } ], "cody.override.serverEndpoint": "http://localhost:7777", } ``` **Scenario 1:** 1. Start cody, if you used `simple-external-auth-provider.py` should be successfully signed-in as `someuser` 2. Test chat and autocompletions - everything should work 3. Open `simple-external-auth-provider.py` and change `current_epoch = int(time.time()) + 30` to `current_epoch = int(time.time()) - 30` 4. Wait 30 sec and do some Cody action, e.g. ask a chat question 5. You should get an error, but your conversation should be preserved 6. Revert your changes to `simple-external-auth-provider.py` 7. Ask the question again - chat should be working again Scenario 2: 1. Start cody, if you used `simple-external-auth-provider.py` should be successfully signed-in as `someuser` 2. Test chat and autocompletions - everything should work 3. Open `simple-external-auth-provider.py` and change `current_epoch = int(time.time()) + 30` to `current_epoch = int(time.time()) - 30` 4. Quickly do some Cody actions before 30 sec will pass. They should be successful, but when 30 sec will pas since last action before the `simple-external-auth-provider.py` edit, you should get an credential expiration error 5. Revert your changes to `simple-external-auth-provider.py` 6. Make sure Cody works fine again Scenario 3: 1. Open `simple-external-auth-provider.py` and change `current_epoch = int(time.time()) + 30` to `current_epoch = int(time.time()) - 30` 2. Start Cody 4. You should see a credential expiration error 5. Click 'Sign In' - it should try to sign you in but eventually fail 6. Revert changes to `simple-external-auth-provider.py` 7. Click 'Sign In' - it should succeed 8. Make sure Cody chat and autocompletions works <!-- Required. See https://docs-legacy.sourcegraph.com/dev/background-information/testing_principles. -->
This reverts commit 1ed8392.
This reverts commit 1ed8392.
Fixes https://linear.app/sourcegraph/issue/CODY-4642
Fixes https://linear.app/sourcegraph/issue/CODY-4663
External Authentication Provider Support for Cody
This PR introduces support for external authentication providers in Cody, allowing users to integrate with custom authentication proxies and handle complex authentication scenarios.
Feature Overview
This feature requires clients to have reverse proxy and custom sourcegraph instance configured to use HTTP authentication.
The external authentication provider feature allows clients to generate, for a specified endpoint, custom auth headers. Those headers will be attached to every authenticated http request instead of the normal
"Authorization": "token sgp_SOME_TOKEN"
auth header.To generate those custom headers client need to specify command that generates authentication headers for specific endpoints. The command must output a JSON object containing header key-value pairs on stdout. Those endpoints URLs needs to point to proxies configured by client which redirects requests to the custom sourcegraph instance.
Whole flow looks like this:
X-Forwarded-User
and/orX-Forwarded-Email
headers as specified in the documentationConfiguration
Users can configure custom authentication providers in their vscode settings.json using the following structure:
It can also be configured in IntelliJ using settings editor:
User can define as many external providers as needed.
If only one provider is needed and login using this provider should be forced, it will be possible to accomplish using
cody.override.serverEndpoint
.Configuration Options
Expected Output
Script or executable specified in the configuration have to return valid JSON object which adheres to the schema:
Where:
headers
(Required) [Map with string keys and values] - headers which will be attached to authenticated requests to the http proxyexpiration
(Optional) [a number] - Epoch Unix Timestamp (UTC) of the headers expiration date; after expiration date headers will be re-generated automatically using configured commandTesting Locally
Start the provided reverse proxy:
python agent/scripts/reverse-proxy.py https://your-sourcegraph-instance.com 5555
You can choose different port or start a few different proxies for different endpoints.
Add the proxy configuration to your settings:
http://localhost:5555
endpointSecurity Considerations
Missing features
Test plan
Testing Locally
section.Changelog