[please test] BYOK with ollama #342

Open
olegklimov opened this issue Oct 2, 2024 · 6 comments

Comments

@olegklimov
Contributor

With the ollama project it's easy to host our own AI models.

You can set up bring-your-own-key (BYOK) to connect to an ollama server, and see if you can use StarCoder2 for code completion and llama models for chat.

Does it work at all? What do we need to fix to make it better?
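
A minimal bring-your-own-key sketch for a local ollama server could look roughly like this (the completion_* key names and endpoint below are an assumption, so check https://docs.refact.ai/byok/ for the exact field names):

chat_endpoint: "http://localhost:11434/v1/chat/completions"   # ollama's OpenAI-compatible chat API
chat_model: "llama3.2:1b-instruct-q8_0"                       # a llama chat model pulled into ollama
completion_endpoint: "http://localhost:11434/v1/completions"  # assumed key name and endpoint for code completion
completion_model: "starcoder2:latest"                         # StarCoder2 for code completion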

@pardeep-singh

@olegklimov I would like to take this up. Could you please share some docs or an example of how this can be done? Do we need to just test the integration here, or also make changes to get it working?

@olegklimov
Contributor Author

Oh, here: https://docs.refact.ai/byok/ you can also test whether our documentation is any good :D

@avie66
Collaborator

avie66 commented Oct 21, 2024

Hi @pardeep-singh
Did you have a look at Oleg's approach?

@ukrolelo

ukrolelo commented Feb 5, 2025

chat_endpoint: "http://localhost:11434/v1/chat/completions"
chat_model: "starcoder2:latest"

mainThreadExtensionService.ts:79 Error: write after end
	at _write (node:internal/streams/writable:489:11)
	at Writable.write (node:internal/streams/writable:510:10)
	at c:\Users\ukro\.vscode\extensions\smallcloud.codify-6.0.4-win32-x64\node_modules\vscode-jsonrpc\lib\node\ril.js:90:29
	at new Promise (<anonymous>)
	at WritableStreamWrapper.write (c:\Users\ukro\.vscode\extensions\smallcloud.codify-6.0.4-win32-x64\node_modules\vscode-jsonrpc\lib\node\ril.js:80:16)
	at StreamMessageWriter.doWrite (c:\Users\ukro\.vscode\extensions\smallcloud.codify-6.0.4-win32-x64\node_modules\vscode-jsonrpc\lib\common\messageWriter.js:100:33)
	at c:\Users\ukro\.vscode\extensions\smallcloud.codify-6.0.4-win32-x64\node_modules\vscode-jsonrpc\lib\common\messageWriter.js:91:29
$onExtensionRuntimeError @ mainThreadExtensionService.ts:79
127.0.0.1:9084/v1/at-command-preview:1  Failed to load resource: the server responded with a status of 417 (Expectation Failed)
127.0.0.1:9084/v1/chat:1  Failed to load resource: the server responded with a status of 400 (Bad Request)
127.0.0.1:9084/v1/at-command-preview:1  Failed to load resource: the server responded with a status of 417 (Expectation Failed)
127.0.0.1:9084/v1/at-command-preview:1  Failed to load resource: the server responded with a status of 417 (Expectation Failed)
127.0.0.1:9084/v1/at-command-preview:1  Failed to load resource: the server responded with a status of 417 (Expectation Failed)
127.0.0.1:9084/v1/chat:1  Failed to load resource: the server responded with a status of 400 (Bad Request)

@ukrolelo

ukrolelo commented Feb 5, 2025

chat_endpoint: "http://localhost:11434/v1/chat/completions"
chat_model: "llama3.2:1b-instruct-q8_0"

Error: Bad Request
Click to retry

index.umd.cjs:1  POST http://127.0.0.1:9099/v1/at-command-preview 417 (Expectation Failed)
index.umd.cjs:1  POST http://127.0.0.1:9099/v1/at-command-preview 417 (Expectation Failed)
index.umd.cjs:1  POST http://127.0.0.1:9099/v1/at-command-preview 417 (Expectation Failed)
index.umd.cjs:1  POST http://127.0.0.1:9099/v1/at-command-preview 417 (Expectation Failed)
index.umd.cjs:1  POST http://127.0.0.1:9099/v1/at-command-preview 417 (Expectation Failed)
index.umd.cjs:1  POST http://127.0.0.1:9099/v1/at-command-preview 417 (Expectation Failed)
index.umd.cjs:1  POST http://127.0.0.1:9099/v1/at-command-preview 417 (Expectation Failed)
index.umd.cjs:17  POST http://127.0.0.1:9099/v1/chat 400 (Bad Request)



Why is it sending to a different port?

@olegklimov
Contributor Author

Why is it sending to a different port?

The VSCode extension talks to refact-lsp, which in turn talks to the inference server.

I think the fastest way we can fix this is to reproduce your setup. So you have Windows and ollama with llama3.2:1b-instruct-q8_0, right? We'll try it.
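
To make the two ports concrete, here is your chat config again with comments on who calls what (9099 is just the port from your console output; refact-lsp can use a different local port):

chat_endpoint: "http://localhost:11434/v1/chat/completions"  # ollama; refact-lsp forwards chat requests here
chat_model: "llama3.2:1b-instruct-q8_0"
# The extension never calls ollama directly: it POSTs to refact-lsp on a local
# port (127.0.0.1:9099/v1/chat and /v1/at-command-preview in your log), and
# refact-lsp then proxies the chat to chat_endpoint above.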
