
Create Rust-based openai api proxy server in node hub #678

Merged — 4 commits merged into main on Oct 7, 2024

Conversation

phil-opp (Collaborator) commented Oct 7, 2024

This is a Rust-based version of #676 based on https://github.com/LlamaEdge/LlamaEdge/tree/main/llama-api-server . It includes serde structs to fully deserialize chat completion requests.

The server replies to requests asynchronously, which makes it possible to serve multiple clients at the same time. Note that replies are currently matched to requests purely by order: the first reply is assigned to the first request, and so on. If replies can arrive out of order (e.g. because some completions take longer than others), some form of ID needs to be added to the code to enable a proper request<->reply mapping.
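The ID-based mapping described above could be sketched as a table of pending requests keyed by a unique ID, so a reply carrying that ID routes to the right client even when replies arrive out of order. This is a minimal std-only illustration; the `PendingRequests` type and its methods are hypothetical, not from the PR (in the real server the value would be a reply channel rather than a client name).

```rust
use std::collections::HashMap;

/// Hypothetical pending-request table: each outgoing completion request is
/// tagged with a unique ID, and the matching reply carries the same ID back.
struct PendingRequests {
    next_id: u64,
    // ID -> client identifier (stand-in for a per-request reply channel).
    waiting: HashMap<u64, String>,
}

impl PendingRequests {
    fn new() -> Self {
        Self { next_id: 0, waiting: HashMap::new() }
    }

    /// Register a new request and return the ID to attach to it.
    fn register(&mut self, client: String) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.waiting.insert(id, client);
        id
    }

    /// Route a reply back by its ID, regardless of arrival order.
    fn resolve(&mut self, id: u64) -> Option<String> {
        self.waiting.remove(&id)
    }
}

fn main() {
    let mut pending = PendingRequests::new();
    let a = pending.register("client-a".into());
    let b = pending.register("client-b".into());
    // The reply for the second request arrives first; the ID still routes it.
    assert_eq!(pending.resolve(b).as_deref(), Some("client-b"));
    assert_eq!(pending.resolve(a).as_deref(), Some("client-a"));
    println!("out-of-order replies routed correctly");
}
```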

Like #676, this PR only implements basic chat completion requests. You can try it through these steps:

cd examples/openai-server
dora build dataflow-rust.yml
dora start dataflow-rust.yml

# In a separate terminal
python openai_api_client.py

I didn't implement the /v1/models endpoint, so the "Testing API endpoints..." part of openai_api_client.py is expected to fail with "Error listing models: 404".

phil-opp (Collaborator, Author) commented Oct 7, 2024

We discussed that we want to merge this directly to enable preliminary testing.

phil-opp merged commit c8a890a into main on Oct 7, 2024
40 checks passed
phil-opp deleted the openai-api-proxy branch on October 7, 2024 at 18:16