Commit
add support for new mcore ds features (NVIDIA#9388)
* add validation_drop_last and add_extra_token params support for mcore ds
  Signed-off-by: dimapihtar <[email protected]>
* pad samples with dummy tokens only
  Signed-off-by: dimapihtar <[email protected]>
* Apply isort and black reformatting
  Signed-off-by: dimapihtar <[email protected]>
* use no_seqlen_plus_one_input_tokens as mcore's add_extra_token
  Signed-off-by: dimapihtar <[email protected]>
* revert config
  Signed-off-by: dimapihtar <[email protected]>
* revert config
  Signed-off-by: dimapihtar <[email protected]>
* set train_valid_test_num_samples[1] to None
  Signed-off-by: dimapihtar <[email protected]>
* add test case when validation_drop_last is False
  Signed-off-by: dimapihtar <[email protected]>
* revert config
  Signed-off-by: dimapihtar <[email protected]>
* set validation_drop_last as True by default
  Signed-off-by: dimapihtar <[email protected]>
* Update nemo/collections/nlp/data/language_modeling/megatron/data_samplers.py
  Co-authored-by: jbaczek <[email protected]>
  Signed-off-by: Dmytro Pykhtar <[email protected]>
* Update nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py
  Co-authored-by: jbaczek <[email protected]>
  Signed-off-by: Dmytro Pykhtar <[email protected]>

---------

Signed-off-by: dimapihtar <[email protected]>
Signed-off-by: dimapihtar <[email protected]>
Signed-off-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: dimapihtar <[email protected]>
Co-authored-by: jbaczek <[email protected]>