Commit
Integrate distributed inference with chat/server (#1381)
* Integrate distributed inference without introducing abstraction
* Clean up old distributed inference integration
* Read distribution from model_config
* Declare distribution_path if args.model is not given
* Address some nits from PR review
* Added comment on model size all reduce + type hint
* Apply suggestions from code review

Co-authored-by: Jack-Khuu <[email protected]>

* Make sure speculative decoding is disabled for pp > 1 and remark this in the comments as well
* Refactor conditions in pp
* Rename and alter signature of setup_env to reflect that it also runs the target
* Rename setup_env in server + fix condition
* Update generate.py
* Add default value to add_generation_prompt to preserve bc

---------

Co-authored-by: Jack-Khuu <[email protected]>
Showing 8 changed files with 596 additions and 956 deletions.