forked from mlc-ai/mlc-llm
[Serving][Refactor] Major codebase refactor
This PR is a major refactor of the serving framework. It contains the following changes:

* Changing Model and Sampler from `runtime.Module` to Object in TVM. The public interface is now exposed through public member functions (rather than through `GetFunction`, which returns a PackedFunc). The definition and implementation of the Model/Sampler classes are separated: the base Model/Sampler class definitions are in their respective header files, and the implementations are in .cc files.
* Removing the TokenizerModule class and directly using the Tokenizer class in `tokenizer_cpp`, to reduce indirection.
* Introducing a unique string `id` to Request. This id is passed in from the frontend as a Request constructor parameter and is the unique identifier of a request.
* Reducing the uses of `ShapeTuple` after the Model/Sampler/Tokenizer interface changes.
* Moving some former member functions of Engine (such as "getting data length" and "getting data embedding") to the corresponding data structures.
* Introducing the struct `EngineStats` to hold all the runtime statistics of the engine.
* Classifying and reordering the member functions in Engine.
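To make the shape of these changes concrete, here is a minimal sketch in plain C++ with no TVM dependency. Every name in it (`TokenData`, `GetLength`, `GetTotalInputLength`, the `EngineStats` fields, `ToyModel`, `PrefillOne`) is a hypothetical stand-in chosen for illustration and is not taken from this commit. It only illustrates the pattern: a base class with typed public member functions whose implementation is kept separate, a request identified by a frontend-supplied string id, a length query that lives on the data structure, and a single struct collecting runtime statistics.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for tokenized input data. The length query lives
// on the data structure itself rather than on the engine.
struct TokenData {
  std::vector<int32_t> token_ids;
  int GetLength() const { return static_cast<int>(token_ids.size()); }
};

// Hypothetical request: the string id is supplied by the frontend through
// the constructor and identifies the request.
struct Request {
  std::string id;
  std::vector<TokenData> inputs;

  Request(std::string id, std::vector<TokenData> inputs)
      : id(std::move(id)), inputs(std::move(inputs)) {
    assert(!this->id.empty() && "request id must be non-empty");
  }

  // Sum of the lengths of all inputs, delegated to each data item.
  int GetTotalInputLength() const {
    int total = 0;
    for (const TokenData& data : inputs) total += data.GetLength();
    return total;
  }
};

// Hypothetical aggregate of runtime statistics (field names are assumptions).
struct EngineStats {
  int64_t total_prefill_length = 0;
  int64_t finished_requests = 0;
};

// Base class definition, as it might appear in a header: callers use typed
// public member functions instead of looking functions up by name.
class Model {
 public:
  virtual ~Model() = default;
  // Returns the number of tokens processed for the request.
  virtual int Prefill(const Request& request) = 0;
  static std::unique_ptr<Model> Create();
};

// Implementation, as it might appear in a separate .cc file.
namespace {
class ToyModel : public Model {
 public:
  int Prefill(const Request& request) override {
    return request.GetTotalInputLength();
  }
};
}  // namespace

std::unique_ptr<Model> Model::Create() { return std::make_unique<ToyModel>(); }

// One engine step: run prefill and record the result in the stats struct.
void PrefillOne(Model& model, const Request& request, EngineStats& stats) {
  stats.total_prefill_length += model.Prefill(request);
}
```

A caller would construct a `Request` with an id such as `"req-0"`, obtain a model through `Model::Create()`, and pass both to `PrefillOne` together with an `EngineStats` instance. Compared with a name-based function lookup, the typed member functions let the compiler check argument and return types at each call site.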
1 parent d609545, commit c0d233d
Showing 19 changed files with 1,132 additions and 1,262 deletions.