# [DOCS] README typo fix and readability improvements #118

Merged · 1 commit · May 6, 2023
README.md: 12 changes (6 additions, 6 deletions)
@@ -34,7 +34,7 @@ Also check out [H2O LLM Studio](https://github.com/h2oai/h2o-llmstudio) for our
- Integration of code and resulting LLMs with downstream applications and low/no-code platforms
- Complement h2oGPT chatbot with search and other APIs
- High-performance distributed training of larger models on trillion tokens
-- Improve code completion, reasoning, mathematics, factual correctness, hallucinations and avoid repetitions
+- Enhance the model's code completion, reasoning, and mathematical capabilities, ensure factual correctness, minimize hallucinations, and avoid repetitive output

### Chat with h2oGPT

@@ -52,7 +52,7 @@ You can also use [Docker](INSTALL-DOCKER.md#containerized-installation-for-infer

#### Larger models require more GPU memory

-Depending on available GPU memory, you can load differently sized models. For multiple GPUs, automatic sharding can be enabled with `--infer_devices=False`, but that is disabled by default since cuda:x cuda:y mismatches can occur.
+Depending on available GPU memory, you can load differently sized models. For multiple GPUs, automatic sharding can be enabled with `--infer_devices=False`, but this is disabled by default since cuda:x cuda:y mismatches can occur.
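
Put together, a multi-GPU launch might look like the following sketch. The model name is the 20B one recommended below, and `CUDA_VISIBLE_DEVICES` is standard CUDA tooling rather than an h2oGPT option:

```bash
# Expose two GPUs and let --infer_devices=False shard the model across them.
# Watch the logs for the cuda:x/cuda:y device-mismatch errors noted above.
CUDA_VISIBLE_DEVICES=0,1 python generate.py \
    --base_model=h2oai/h2ogpt-oasst1-512-20b \
    --infer_devices=False
```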

For GPUs with at least 24GB of memory, we recommend:
@@ -62,15 +62,15 @@ For GPUs with at least 48GB of memory, we recommend:
```bash
python generate.py --base_model=h2oai/h2ogpt-oasst1-512-20b
```
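
To see which of these recommendations applies to your machine, you can query per-GPU memory with `nvidia-smi` (standard NVIDIA tooling, not part of h2oGPT):

```bash
# Print index, name, and total memory for each visible GPU;
# roughly 24 GiB fits the smaller model, 48 GiB or more the 20B model.
nvidia-smi --query-gpu=index,name,memory.total --format=csv
```
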
-The number `512` in the model names indicate the cutoff lengths (in tokens) used for fine-tuning. Shorter values generally result in faster training and more focus on the last part of the provided input text (consisting of prompt and answer).
+The number `512` in the model names indicates the cutoff lengths (in tokens) used for fine-tuning. Shorter values generally result in faster training and more focus on the last part of the provided input text (consisting of prompt and answer).
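
As a hypothetical illustration of the cutoff, the sketch below uses an alpaca-lora-style flag; both the script name `finetune.py` and the `--cutoff_len` flag are assumptions not confirmed by this README, so defer to [FINETUNE.md](FINETUNE.md) for the actual parameters:

```bash
# Sketch only: fine-tune with a 512-token cutoff, so each prompt+answer
# pair is truncated and training emphasizes the tail of the text.
python finetune.py --base_model=h2oai/h2ogpt-oasst1-512-20b --cutoff_len=512
```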

More information about the models can be found on [H2O.ai's Hugging Face page](https://huggingface.co/h2oai/).

### Development

-- Follow the [installation instructions](INSTALL.md) to create a development environment for training and generation.
-- Follow the [fine-tuning instructions](FINETUNE.md) to fine-tune any LLM models on your data.
-- Follow the [Docker instructions](INSTALL-DOCKER.md) to create a container for deployment.
+- To create a development environment for training and generation, follow the [installation instructions](INSTALL.md).
+- To fine-tune any LLM models on your data, follow the [fine-tuning instructions](FINETUNE.md).
+- To create a container for deployment, follow the [Docker instructions](INSTALL-DOCKER.md).
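
As a rough sketch of the first step before following those guides, you would clone the repository; the `requirements.txt` filename is an assumption, so treat INSTALL.md as the authoritative setup reference:

```bash
# Clone the h2oGPT repository and install dependencies.
git clone https://github.com/h2oai/h2ogpt.git
cd h2ogpt
pip install -r requirements.txt  # assumed filename; see INSTALL.md
```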

### Help
