Support exporting Nemotron-340B for TensorRT-LLM #11015

jinyangyuan-nvidia · 2024-10-24T02:46:45Z

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Collection: [Note which collection this PR will affect]

Changelog

Add specific line by line info of high level changes in this PR.

Usage

You can potentially add a usage example below

# Add a code snippet demonstrating how to use this

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove the label and add it again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

- Make sure you read and followed Contributor guidelines
- Did you write any new necessary tests?
- Did you add or update any necessary documentation?
- Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
  - Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

- New Feature
- Bugfix
- Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

Related to # (issue)

juney-nvidia · 2024-10-24T07:53:25Z

@oyilmaz-nvidia Hi Onur, can you help review this PR?
For the TRT-LLM 0.14 release, we are preparing documentation for multi-node deployment and want to use Nemotron as the example. Without this fix, we cannot run Nemotron on multiple nodes.

meatybobby · 2024-10-28T17:10:08Z

It seems the changes in nemo/export/trt_llm/converter/model_converter.py are not necessary. I tested that parallel_embedding still works without them. Could we remove the changes to this file?

jinyangyuan-nvidia force-pushed the fix_convert_nemotron_340b branch from 73d1a52 to 83299cc October 24, 2024 02:48

oyilmaz-nvidia requested review from meatybobby and oyilmaz-nvidia October 24, 2024 18:36

jinyangyuan-nvidia force-pushed the fix_convert_nemotron_340b branch from cefc996 to 468c99e October 29, 2024 15:01

jinyangyuan-nvidia requested review from pablo-garay and ko3n1g as code owners October 29, 2024 15:01

github-actions bot added the core, ASR, NLP, CI, common, Multi Modal, and audio labels Oct 29, 2024

jinyangyuan-nvidia force-pushed the fix_convert_nemotron_340b branch from 468c99e to 7905f4a October 29, 2024 15:09

github-actions bot removed the core, ASR, NLP, CI, common, and Multi Modal labels Oct 29, 2024

Support exporting Nemotron-340B for TensorRT-LLM

24355f8

Signed-off-by: Jinyang Yuan <[email protected]>

jinyangyuan-nvidia force-pushed the fix_convert_nemotron_340b branch from 7905f4a to 24355f8 October 29, 2024 15:10

meatybobby added the Run CICD label and removed the audio label Oct 29, 2024

meatybobby approved these changes Oct 29, 2024

meatybobby enabled auto-merge (squash) October 29, 2024 16:58

meatybobby removed the Run CICD label Oct 29, 2024

Merge branch 'main' into fix_convert_nemotron_340b

fd17af2

meatybobby added the Run CICD label Oct 29, 2024

oyilmaz-nvidia disabled auto-merge October 29, 2024 20:01

oyilmaz-nvidia enabled auto-merge (squash) October 29, 2024 20:01

Merge branch 'main' into fix_convert_nemotron_340b

2152507

meatybobby added Run CICD and removed Run CICD labels Oct 30, 2024

oyilmaz-nvidia merged commit bad4bfe into NVIDIA:main Oct 31, 2024
154 of 155 checks passed
