add new inference paradigm #473

Open
wants to merge 1 commit into
base: main
from

Conversation

tangming1996
Collaborator

What type of PR is this?
/kind documentation

What this PR does / why we need it:
The current joint inference paradigm supports only traditional discriminative models and cannot support large language models (LLMs). This proposal aims to introduce a new inference paradigm that supports joint inference with LLMs.
Which issue(s) this PR fixes:

Fixes #

@kubeedge-bot kubeedge-bot added the kind/documentation Categorizes issue or PR as related to documentation. label Feb 25, 2025
@kubeedge-bot kubeedge-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Feb 25, 2025
@tangming1996 tangming1996 force-pushed the docs/inference-enhance branch 2 times, most recently from 273fcd9 to 03e0ca7 Compare February 25, 2025 09:19
@kubeedge-bot kubeedge-bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 25, 2025
@tangming1996 tangming1996 force-pushed the docs/inference-enhance branch from 03e0ca7 to f48d1b1 Compare February 26, 2025 05:46
@kubeedge-bot kubeedge-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 26, 2025
@tangming1996 tangming1996 force-pushed the docs/inference-enhance branch 4 times, most recently from e0641e2 to 5c6fd44 Compare February 27, 2025 06:20
@tangming1996 tangming1996 force-pushed the docs/inference-enhance branch 4 times, most recently from 2a45262 to 48b6adc Compare March 6, 2025 06:34
Collaborator

@MooreZheng MooreZheng left a comment

In this era, it is critical and urgent to introduce LLMs in Sedna. The new paradigm shown in this proposal is thus important.

As discussed, since the Sedna lib is removed from the paradigm, we need several serious discussions on this proposal.

  1. Sedna and Ianvs paradigms for LLM joint inference should be consistent; this might be discussed with @FuryMartin and @hsj576.

  2. How will this new paradigm improve the integration of LLM models by removing the Sedna lib from cloud and edge inference?

  • Will that raise concerns about the data interfaces? When switching to a new model, people need to switch to the new data format, which leads to an additional cost.
  • Will this paradigm support a non-LLM model on the edge? It seems that this paradigm only works in the case where both cloud and edge deploy LLMs, instead of cloud-only LLM and edge-only LLM, as claimed in the proposal.

  3. How will this new paradigm improve the integration of hard example mining by removing the Sedna lib from hard example mining (sidecar)?

  • What hard example mining algorithm is used now in the new paradigm? How can hard example mining algorithms be switched?

  4. With the above considerations, we wonder whether this paradigm fully supports LLM joint inference. The new paradigm seems more like a new paradigm of LLM single-task learning, doesn't it? That is, a trained LLM is deployed either on the edge or on the cloud.
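
To make the questions about switching hard example mining algorithms and about joint versus single-task inference concrete, here is a minimal sketch of edge-cloud joint inference with a pluggable hard example filter. Every name in it (`JointInference`, `HARD_EXAMPLE_FILTERS`, `confidence_threshold`, the stub models) is hypothetical and is not taken from the proposal or from the Sedna or Ianvs APIs; it only assumes that the edge model returns an answer with a confidence score.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# All names here are hypothetical illustrations, not Sedna or Ianvs APIs.
Prediction = Tuple[str, float]  # (answer, confidence in [0, 1])
Model = Callable[[str], Prediction]
HardExampleFilter = Callable[[str, Prediction], bool]


def confidence_threshold(threshold: float) -> HardExampleFilter:
    """A query is hard when the edge confidence is below the threshold."""
    def is_hard(query: str, pred: Prediction) -> bool:
        return pred[1] < threshold
    return is_hard


def length_threshold(max_words: int) -> HardExampleFilter:
    """A query is hard when it has more than max_words words."""
    def is_hard(query: str, pred: Prediction) -> bool:
        return len(query.split()) > max_words
    return is_hard


# Switching the mining algorithm means picking another registry entry.
HARD_EXAMPLE_FILTERS: Dict[str, Callable[..., HardExampleFilter]] = {
    "confidence": confidence_threshold,
    "length": length_threshold,
}


@dataclass
class JointInference:
    edge_model: Model
    cloud_model: Model
    is_hard: HardExampleFilter

    def infer(self, query: str) -> Tuple[str, str]:
        """Answer on the edge first; forward hard examples to the cloud."""
        edge_pred = self.edge_model(query)
        if self.is_hard(query, edge_pred):
            return self.cloud_model(query)[0], "cloud"
        return edge_pred[0], "edge"


def edge_stub(query: str) -> Prediction:
    # Stand-in for a small edge model: confident only on short queries.
    confidence = 0.9 if len(query.split()) <= 3 else 0.4
    return f"edge:{query}", confidence


def cloud_stub(query: str) -> Prediction:
    # Stand-in for a large cloud LLM.
    return f"cloud:{query}", 0.99
```

In this sketch the edge model can be any callable, LLM or not, and both models take part in a single request path, which is the property that distinguishes joint inference from deploying one trained model on either side.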

@tangming1996 tangming1996 force-pushed the docs/inference-enhance branch from 48b6adc to 1fc1629 Compare March 7, 2025 01:36
@kubeedge-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign moorezheng after the PR has been reviewed.
You can assign the PR to them by writing /assign @moorezheng in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels
kind/documentation Categorizes issue or PR as related to documentation. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

3 participants