-
Notifications
You must be signed in to change notification settings - Fork 220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Followup PR for adding generation-server #339
Conversation
super, do you want to keep this one open and continue pushing fixes into it? or merge if this feels complete in a bit and then iterate if more is needed? |
I would like to keep this open. At least till the above 3 things are fixed. Maybe we can add newer things in subsequent PRs. |
Sounds good to me, @mayank31398 I'm also tweaking all 3 scripts to support all the different options: #340 while running many benchmarks, including int8 support. |
We are working on a blog huggingface/blog#432 and I'd like to have really simple one-script contains all demos for it. That's what the scripts for. Easy quick read w/o multiple imports. Your abstraction re-work is fantastic for production and extension, but doesn't lend for quick understanding. So in my mind both sets are needed as they serve different needs.
Honestly, I'm not quite sure. We can move them to the research projects folder under wrt blog - I'm not yet sure if it's best to have 2 separate posts - one for scripts and another for server, as Nicolas has developed a whole other solution using Rust (see the blog). Perhaps your and his solutions would go well together as a post dedicated to the server solutions - i.e. you would co-author it. I'm just brainstorming here, so please make suggestions / throw ideas and let's land these somewhere where they would be most useful to the user. But I also need to move on with other projects so I would like to finish this in the next few days if possible. (I mean the scripts, please take your time with the server code as much as you need - no rush) |
Yeah, I think I understand :) |
Ready for merge @stas00 |
…n-DeepSpeed into mayank/generation-server
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to merge, @mayank31398
@mayank31398, FYI I'm in the process of moving these 2 code sets to a dedicated repo: https://github.com/huggingface/transformers-bloom-inference/ as these have nothing to do with this current repo. All moved. The problem is that it appears that your github user setup is somewhat broken, as you can see e.g. here and here: github only shows your name and no link to your username. I think it's because of your email setup when you commit - github doesn't know your email address and can't associate it with your github username. I tried to import your code while giving you full credits with:
and ended up with the same issue: huggingface/transformers-bloom-inference@b8a64d9 your name is there, but no link to your username in the commit description. I took the info from your commits. If you don't care it's fine with me. But if you fix this then we can fix the commit so that it's clear you authored that code. Otherwise currently anybody reading your contribution is unlikely to find you... I will shortly add you to the repo so that you could push there directly. If you have further questions please don't hesitate to ask. I also edited your section's README to add a support sub-section - please feel free to further edit it to your liking. |
I have fixed the username problem @stas00 ^^ |
I didn't even realize my git was setup incorrectly. |
Thanks for modifying |
I'm not sure I did it in the cleanest way, but it seemed to work. You'd need to I will try a different approach in the future as this is definitely not the best way for a repo with many devs/users. https://stackoverflow.com/questions/3042437/how-to-change-the-commit-author-for-a-single-commit towards the end there are simpler solutions that amend the author globally |
* fix grpc * use functools.partial * fix ds-inference server * add support for int8 * update README * fix bugs Co-authored-by: Mayank Mishra <[email protected]>
@stas00
This will fix:
Let me know if more functionality is desired.
Followup for #328