Running Llama v2 with llama.cpp on a GTX 1650 with 4 GB of VRAM.
To expose your NVIDIA GPU and drivers to a Docker container, you need to install the NVIDIA Container Toolkit.
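On Ubuntu, the toolkit install typically looks like the following sketch; it assumes the NVIDIA apt repository is already configured on the host, and the CUDA image tag in the verification step is only an example:

```shell
# Install the toolkit and register the NVIDIA runtime with Docker
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Verify the GPU is visible from inside a container
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```

If `nvidia-smi` prints your GTX 1650 from inside the container, the runtime is wired up correctly.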
N_GPU_LAYERS=35
N_BATCH=4096
N_THREADS=4
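These three values tune llama.cpp for the 4 GB card: `N_GPU_LAYERS` controls how many transformer layers are offloaded to VRAM, `N_BATCH` the prompt batch size, and `N_THREADS` the CPU threads used for the remaining layers. A minimal sketch of how an app might read them from the environment (the variable names come from the settings above; the fallback defaults are illustrative assumptions):

```python
import os

# Read llama.cpp tuning knobs from the environment.
# Defaults below mirror the values documented above.
n_gpu_layers = int(os.environ.get("N_GPU_LAYERS", "35"))  # layers offloaded to the GPU
n_batch = int(os.environ.get("N_BATCH", "4096"))          # prompt batch size
n_threads = int(os.environ.get("N_THREADS", "4"))         # CPU threads for non-offloaded work

print(n_gpu_layers, n_batch, n_threads)
```

With only 4 GB of VRAM, `N_GPU_LAYERS` is the value to lower first if the model fails to load.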
Demo video: gradio+llama_cpp-streaming.mp4
Build the image, then (re)start the stack in the background:
docker compose build
docker compose down && docker compose up -d
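For the container to see the GPU, the Compose service also needs a device reservation. A minimal sketch assuming Compose v2; the service name `app` is a placeholder, the actual service name in this project may differ:

```yaml
services:
  app:
    build: .
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```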
Visit http://localhost:7861/ to access the Gradio Chatbot UI.
Pre-commit is already part of this project's dependencies. If you would like to install it as a standalone tool, run:
pip install pre-commit
To activate pre-commit, run the following commands:
- Install Git hooks:
pre-commit install
- Update current hooks:
pre-commit autoupdate
To test your pre-commit installation, run:
pre-commit run --all-files