provider needs to pass the GPU model to the bid price script _before_ it Bids so it can price it (when the GPU model is not set in SDL) #166

Closed
andy108369 opened this issue Dec 18, 2023 · 3 comments · Fixed by akash-network/provider#242 or akash-network/helm-charts#280
Assignees
Labels
repo/provider Akash provider-services repo issues sev2

Comments

@andy108369
Contributor

andy108369 commented Dec 18, 2023

akash node/network 0.30.0, and 0.32.2
provider-services version 0.4.8, and 0.5.13

Update (Apr 30, 2024):
TL;DR The issue is basically this:

  • When model: * is set in the SDL, the wildcard passes through the bid engine unchanged and is handed to the bid price script (or whatever reads stdin to process the resources). What is expected instead is that the script receives the GPU model that is actually available and was selected by the K8s engine.

The current version of the bid price script automatically sets the highest price when no specific GPU model is designated in the SDL. The script cannot determine which GPU is being requested or which one is available, so it defaults to the highest price. This is a temporary workaround that keeps the provider from inadvertently offering a high-end GPU at the price of a lower-end model.
However, it causes a problem when only lower-end models are available and the client does not explicitly specify the model in the SDL: the client is then offered a lower-end model at the price of the highest-end model.
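
To make the overpricing concrete, here is a minimal sketch of the fallback described above. It is not the actual bid price script: the GPU_PRICES table, its numbers, and the price_gpu function are hypothetical and only illustrate the behavior.

```python
# Hypothetical per-GPU hourly prices; the numbers are illustrative only.
GPU_PRICES = {"t4": 0.10, "rtx4090": 0.50, "a100": 1.20}


def price_gpu(requested_model: str) -> float:
    """Price one GPU the way the issue describes the current behavior."""
    if requested_model == "*":
        # The script cannot tell which GPU will actually be used, so it
        # falls back to the most expensive model in its table.
        return max(GPU_PRICES.values())
    return GPU_PRICES[requested_model]


# A provider that only has t4 cards free receives an SDL with model: *
wildcard_price = price_gpu("*")   # priced as if it were the top-end GPU
actual_price = price_gpu("t4")    # what the selected GPU should cost
```

In this sketch the wildcard request is quoted the a100 price even though a t4 would be scheduled, which is the mismatch reported here.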

The provider needs to pass the GPU model it picked to the bid price script before it bids, so the script can price that model correctly when the SDL deployment manifest does not name one.
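
A rough sketch of the requested behavior, assuming the provider knows which model was selected before it calls the script. The JSON shape, the function names, and the price table are all hypothetical; the real stdin format of the bid price script is not shown in this issue.

```python
import json

# Hypothetical per-GPU hourly prices; the numbers are illustrative only.
GPU_PRICES = {"t4": 0.10, "rtx4090": 0.50, "a100": 1.20}


def resolve_gpu_model(requested: str, selected: str) -> str:
    """Return the model to price: the selected one if the SDL used a wildcard."""
    return selected if requested == "*" else requested


def build_price_request(requested: str, selected: str, units: int) -> str:
    """Build the payload (assumed shape) the provider would write to stdin."""
    model = resolve_gpu_model(requested, selected)
    return json.dumps({"gpu": {"model": model, "units": units}})


def price_from_stdin(payload: str) -> float:
    """Price the request as a script reading the payload from stdin might."""
    gpu = json.loads(payload)["gpu"]
    # Keep the highest-price fallback only for models missing from the table.
    unit_price = GPU_PRICES.get(gpu["model"], max(GPU_PRICES.values()))
    return unit_price * gpu["units"]


# SDL says model: *, the provider selected a t4, two units requested.
payload = build_price_request("*", "t4", 2)
price = price_from_stdin(payload)
```

With the wildcard resolved up front, the script prices the t4 that was actually picked, and the highest-price fallback only applies to models it does not recognize.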

@andy108369 andy108369 added repo/provider Akash provider-services repo issues awaiting-triage labels Dec 18, 2023
@chainzero chainzero added the sev2 label Dec 20, 2023
@andy108369
Contributor Author

The best provider to test this on right now is provider.medc1.com (akash1ffpcy473xqs37yvv4jhhh2hsuv786nzs4xt0dj), which has t4, a100 and rtx4090 GPUs (a40 probably soon).

@anilmurty

Thanks Andrey - I ran into this yesterday and wasn't aware of this issue, so it's appreciated! Going to add this to the backlog.

@anilmurty anilmurty moved this to Backlog (not prioritized) in Core Product and Engineering Roadmap Apr 30, 2024
@brewsterdrinkwater brewsterdrinkwater moved this from Backlog (not prioritized) to Up Next (prioritized) in Core Product and Engineering Roadmap Apr 30, 2024
@andrewhare andrewhare moved this from Up Next (prioritized) to In Progress (prioritized) in Core Product and Engineering Roadmap May 6, 2024
@brewsterdrinkwater
Collaborator

May 14th, 2024:

  • In internal testing with core team

@brewsterdrinkwater brewsterdrinkwater moved this from In Progress (prioritized) to In Test (or staging) in Core Product and Engineering Roadmap May 14, 2024
@github-project-automation github-project-automation bot moved this from In Test (or staging) to Released (in Prod) in Core Product and Engineering Roadmap May 15, 2024
andy108369 added a commit to andy108369/helm-charts that referenced this issue May 15, 2024
andy108369 added a commit to akash-network/helm-charts that referenced this issue May 15, 2024
Projects
Status: Released (in Prod)
6 participants