
# Deployment using NVIDIA NIM

NVIDIA NIM is a set of easy-to-use microservices designed to accelerate the deployment of generative AI models across the cloud, data center, and workstations. NIMs are packaged by model family and on a per-model basis. For example, NVIDIA NIM for large language models (LLMs) brings the power of state-of-the-art LLMs to enterprise applications, providing strong natural language processing and understanding capabilities.

The LoRA adapters generated by the RTX AI Toolkit model customization workflow can be deployed as NIM customizations.
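As a rough illustration, a NIM container can be pointed at a directory of LoRA adapters via the `NIM_PEFT_SOURCE` environment variable described in the NIM LoRA setup guide. This is a minimal sketch, not an authoritative recipe: the local paths, image tag, and mount point below are placeholders, and the exact flags may differ by NIM version.

```shell
# Sketch of serving LoRA adapters with a NIM container (paths and image tag
# are placeholders; consult the NIM LoRA setup guide for authoritative steps).
export NGC_API_KEY=<your-ngc-api-key>
# Local directory containing one subdirectory per LoRA adapter.
export LOCAL_PEFT_DIRECTORY=/path/to/loras

docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -e NIM_PEFT_SOURCE=/home/nvs/loras \
  -v "$LOCAL_PEFT_DIRECTORY:/home/nvs/loras" \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama3-8b-instruct:latest
```

Each adapter subdirectory name becomes the model name used to select that LoRA at inference time.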

## Prerequisites

https://docs.nvidia.com/nim/large-language-models/latest/getting-started.html

## NIM Deployment

https://docs.nvidia.com/nim/large-language-models/latest/peft.html#lora-setup-overview
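Once the NIM is serving, a deployed LoRA adapter is selected by passing its name in the `model` field of the OpenAI-compatible completions request. A minimal sketch of building such a request body follows; the adapter name and port are assumptions for illustration:

```python
import json

# Hypothetical adapter name: the subdirectory name under NIM_PEFT_SOURCE.
ADAPTER_NAME = "llama3-8b-instruct-my-lora"

# Request body for the NIM's OpenAI-compatible /v1/completions endpoint;
# setting "model" to the adapter name routes inference through that LoRA.
payload = {
    "model": ADAPTER_NAME,
    "prompt": "Summarize the RTX AI Toolkit in one sentence.",
    "max_tokens": 64,
}
body = json.dumps(payload)
print(body)
# POST this body to http://localhost:8000/v1/completions (assumed port)
# with header 'Content-Type: application/json'.
```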