This is a utility that makes it easier to run batched inference against a TGI (Text Generation Inference) server.
Run `python ./tgi-client/runner.py --input inputs.jsonl --output outputs.jsonl --model <HF Model ID> --endpoint http://127.0.0.1:8080` to batch process the prompts in inputs.jsonl.
Prompts should not include the model preamble/template. Each line of inputs.jsonl should be a JSON object like {"prompt": "What is the meaning of life?"}.
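For example, an inputs.jsonl file can be generated with plain Python, writing one JSON object per line with a "prompt" key (the example prompts below are just placeholders):

```python
import json

# Example prompts: raw questions only, no chat template or system preamble.
prompts = [
    "What is the meaning of life?",
    "Summarize the plot of Hamlet in two sentences.",
    "Explain what a JSONL file is.",
]

# Write one JSON object per line, matching the format the runner expects.
with open("inputs.jsonl", "w") as f:
    for p in prompts:
        f.write(json.dumps({"prompt": p}) + "\n")
```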
The runner will ping the TGI instance until it's awake, then start working through the prompts.
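The actual logic lives in runner.py; as a rough sketch (not the real implementation), the ready-check plus processing loop might look like the following. It assumes TGI's standard /health and /generate HTTP routes; the output record fields ("prompt", "completion") and the max_new_tokens value are illustrative, and the real runner may issue requests concurrently to take advantage of TGI's continuous batching rather than one at a time as shown here.

```python
import json
import time

import requests

ENDPOINT = "http://127.0.0.1:8080"  # value passed via --endpoint


def wait_until_ready(endpoint: str, interval: float = 5.0) -> None:
    """Poll TGI's /health route until the server reports it is ready."""
    while True:
        try:
            if requests.get(f"{endpoint}/health", timeout=5).status_code == 200:
                return
        except requests.ConnectionError:
            pass  # server not up yet, keep waiting
        time.sleep(interval)


def run(input_path: str, output_path: str, endpoint: str) -> None:
    wait_until_ready(endpoint)
    with open(input_path) as fin, open(output_path, "w") as fout:
        for line in fin:
            prompt = json.loads(line)["prompt"]
            # TGI's /generate route takes the raw prompt under "inputs".
            resp = requests.post(
                f"{endpoint}/generate",
                json={"inputs": prompt, "parameters": {"max_new_tokens": 256}},
                timeout=120,
            )
            resp.raise_for_status()
            # Output field names here are illustrative, not the runner's exact schema.
            fout.write(json.dumps({
                "prompt": prompt,
                "completion": resp.json()["generated_text"],
            }) + "\n")


if __name__ == "__main__":
    run("inputs.jsonl", "outputs.jsonl", ENDPOINT)
```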