
# Deployment using NVIDIA NIM

NVIDIA NIM is a set of easy-to-use microservices designed to accelerate the deployment of generative AI models across the cloud, data center, and workstations. NIMs are packaged by model family and on a per-model basis. For example, NVIDIA NIM for large language models (LLMs) brings the power of state-of-the-art LLMs to enterprise applications, providing strong natural language processing and understanding capabilities.

The LoRA adapters generated by the RTX AI Toolkit model customization workflow can be deployed as NIM customizations.
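As a rough illustration, a NIM container can be pointed at a directory of LoRA adapters via the `NIM_PEFT_SOURCE` environment variable described in the NIM LoRA setup guide. This is a minimal sketch, not an authoritative recipe: the local paths, image tag, and mount point below are placeholders, and the exact flags may differ by NIM version.

```shell
# Sketch of serving LoRA adapters with a NIM container (paths and image tag
# are placeholders; consult the NIM LoRA setup guide for authoritative steps).
export NGC_API_KEY=<your-ngc-api-key>
# Local directory containing one subdirectory per LoRA adapter.
export LOCAL_PEFT_DIRECTORY=/path/to/loras

docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -e NIM_PEFT_SOURCE=/home/nvs/loras \
  -v "$LOCAL_PEFT_DIRECTORY:/home/nvs/loras" \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama3-8b-instruct:latest
```

Each adapter subdirectory name becomes the model name used to select that LoRA at inference time.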

## Prerequisites

https://docs.nvidia.com/nim/large-language-models/latest/getting-started.html

## NIM Deployment

https://docs.nvidia.com/nim/large-language-models/latest/peft.html#lora-setup-overview
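Once the NIM is serving, a deployed LoRA adapter is selected by passing its name in the `model` field of the OpenAI-compatible completions request. A minimal sketch of building such a request body follows; the adapter name and port are assumptions for illustration:

```python
import json

# Hypothetical adapter name: the subdirectory name under NIM_PEFT_SOURCE.
ADAPTER_NAME = "llama3-8b-instruct-my-lora"

# Request body for the NIM's OpenAI-compatible /v1/completions endpoint;
# setting "model" to the adapter name routes inference through that LoRA.
payload = {
    "model": ADAPTER_NAME,
    "prompt": "Summarize the RTX AI Toolkit in one sentence.",
    "max_tokens": 64,
}
body = json.dumps(payload)
print(body)
# POST this body to http://localhost:8000/v1/completions (assumed port)
# with header 'Content-Type: application/json'.
```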