Add end_strings to SamplingParams (#6986)
* Add end_strings to SamplingParams

Signed-off-by: Gerald Shen <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Gerald Shen <[email protected]>

* Add end_strings to megatron_gpt_inference.yaml

Signed-off-by: Gerald Shen <[email protected]>

* Add end_strings to sampling params

Signed-off-by: Gerald Shen <[email protected]>

* Remove extra_id_1 from default end_strings

Signed-off-by: Gerald Shen <[email protected]>

* Fix require_grad typos (#6930)

Signed-off-by: Sergii Dymchenko <[email protected]>
Signed-off-by: Gerald Shen <[email protected]>

* fix syntax error

Signed-off-by: Gerald Shen <[email protected]>

* fix the mpt chatbot (#6957) (#6968)

Signed-off-by: Yi Dong <[email protected]>
Co-authored-by: Yi Dong <[email protected]>
Signed-off-by: Gerald Shen <[email protected]>

* add support for max_total_length=4096 for 43b (#6763)

* add support for max_total_length=4096 for 43b

Signed-off-by: Zhilin Wang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Zhilin Wang <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Gerald Shen <[email protected]>

* rnnt_greedy_decoding.py: typos? auto-repressively -> auto-regressively (#6989)

Signed-off-by: Vadim Kantorov <[email protected]>
Signed-off-by: Gerald Shen <[email protected]>

* Cache handling without input tensors mutation (#6980) (#6996)

* Cache handling without input tensors mutation



* Cleanup



* Cleanup#2



* Cleanup#3



---------

Signed-off-by: Boris Fomitchev <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: Gerald Shen <[email protected]>

* Hybrid conformer export (#6983) (#6995)

* Implemented generic kv-pair setting of export_config from args



* Hybrid conformer export



* Hybrid decoder export



* Cleanup



* Changed from **kwargs



* Docstring



* Docs added



* Stringify args



* Added docs for ASR export configs



* lowercase ctc



---------

Signed-off-by: Boris Fomitchev <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Signed-off-by: Gerald Shen <[email protected]>

* Fixing an issue with confidence ensembles (#6987) (#7004)

* Bug fix for the confidence ensembles



* Relax constraints for the test



---------

Signed-off-by: Igor Gitman <[email protected]>
Co-authored-by: Igor Gitman <[email protected]>
Signed-off-by: Gerald Shen <[email protected]>

* [TTS] Add cosine distance option to TTS aligner (#6806)

* [TTS] Add cosine distance option to TTS aligner

Signed-off-by: Ryan <[email protected]>

* [TTS] Update aligner comments

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>
Signed-off-by: Gerald Shen <[email protected]>

* Minor MPT-7B fixes and creation script update (#6982)

* Initial commit of minor MPT-7B fixes

Signed-off-by: Daniel Egert <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Daniel Egert <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Gerald Shen <[email protected]>

* Change Jenkins timeout (#6997)

* change timeout

Signed-off-by: ericharper <[email protected]>

* change to 8 hours

Signed-off-by: ericharper <[email protected]>

---------

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Gerald Shen <[email protected]>

* remove hard coded input and output fields (#7008)

* remove hard coded input and output fields

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: arendu <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Gerald Shen <[email protected]>

* RoPE length extrapolation with interpolation (#7005)

* Push changes

Signed-off-by: MaximumEntropy <[email protected]>

* Fixes

Signed-off-by: MaximumEntropy <[email protected]>

* add continue training script

Signed-off-by: MaximumEntropy <[email protected]>

* [WIP] nonlinear interp

Signed-off-by: MaximumEntropy <[email protected]>

* Fix

Signed-off-by: MaximumEntropy <[email protected]>

* override encoder_seq_len

Signed-off-by: MaximumEntropy <[email protected]>

* Remove nonlinear

Signed-off-by: MaximumEntropy <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* sft with pi (#7006)

* sft with pi

Signed-off-by: Evelina <[email protected]>

* update values only if not None

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>

* Address comments

Signed-off-by: MaximumEntropy <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add info

Signed-off-by: MaximumEntropy <[email protected]>

* Empty

Signed-off-by: MaximumEntropy <[email protected]>

---------

Signed-off-by: MaximumEntropy <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Gerald Shen <[email protected]>

* use proper config

Signed-off-by: Gerald Shen <[email protected]>

* Add end_strings to SamplingParams

Signed-off-by: Gerald Shen <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Gerald Shen <[email protected]>

* Add end_strings to megatron_gpt_inference.yaml

Signed-off-by: Gerald Shen <[email protected]>

* Add end_strings to sampling params

Signed-off-by: Gerald Shen <[email protected]>

* Remove extra_id_1 from default end_strings

Signed-off-by: Gerald Shen <[email protected]>

* fix syntax error

Signed-off-by: Gerald Shen <[email protected]>

* use proper config

Signed-off-by: Gerald Shen <[email protected]>

---------

Signed-off-by: Gerald Shen <[email protected]>
Signed-off-by: Sergii Dymchenko <[email protected]>
Signed-off-by: Yi Dong <[email protected]>
Signed-off-by: Zhilin Wang <[email protected]>
Signed-off-by: Vadim Kantorov <[email protected]>
Signed-off-by: Boris Fomitchev <[email protected]>
Signed-off-by: Igor Gitman <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Daniel Egert <[email protected]>
Signed-off-by: ericharper <[email protected]>
Signed-off-by: arendu <[email protected]>
Signed-off-by: MaximumEntropy <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sergii Dymchenko <[email protected]>
Co-authored-by: Gerald Shen <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Yi Dong <[email protected]>
Co-authored-by: Zhilin Wang <[email protected]>
Co-authored-by: Vadim Kantorov <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Igor Gitman <[email protected]>
Co-authored-by: Ryan Langman <[email protected]>
Co-authored-by: trias702 <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Co-authored-by: Adi Renduchintala <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
Co-authored-by: Evelina <[email protected]>
17 people authored Jul 13, 2023
1 parent d44127e commit f7e33fc
Showing 8 changed files with 23 additions and 13 deletions.
megatron_gpt_inference.yaml
@@ -9,7 +9,7 @@ inference:
repetition_penalty: 1.2 # The parameter for repetition penalty. 1.0 means no penalty.
min_tokens_to_generate: 0 # The minimum length of the sequence to be generated.
compute_logprob: False # a flag used to compute logprob of all the input text, a very special case of running inference, default False

end_strings: ["<|endoftext|>"] # generation will stop when one of these tokens is generated

trainer:
devices: 1
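The new key gives the inference config an explicit stop list: generation halts as soon as one of the listed strings appears in the decoded output. The list can be extended for checkpoints that use a different delimiter, e.g. (a hypothetical override; "<extra_id_1>" is the chat-turn marker mentioned elsewhere in this commit):

end_strings: ["<|endoftext|>", "<extra_id_1>"]  # stop on either string
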
1 change: 1 addition & 0 deletions examples/nlp/language_modeling/megatron_gpt_eval.py
@@ -267,6 +267,7 @@ def main(cfg) -> None:
"add_BOS": cfg.inference.add_BOS,
"all_probs": cfg.inference.all_probs,
"compute_logprob": cfg.inference.compute_logprob,
"end_strings": cfg.inference.end_strings,
}

fp8_enabled = hasattr(model.cfg, "fp8") and (model.cfg.fp8 == True)
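megatron_gpt_eval.py simply forwards cfg.inference.end_strings into the sampling-parameter dict, so the stop list can be set per run from the command line. A hypothetical invocation, assuming the script's usual Hydra-style overrides (the model path is a placeholder):

python examples/nlp/language_modeling/megatron_gpt_eval.py \
    gpt_model_file=/path/to/model.nemo \
    inference.end_strings='["<|endoftext|>"]'
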
@@ -217,6 +217,7 @@ def init_model(self, cfg: DictConfig, trainer: Trainer):
"add_BOS": True,
"all_probs": False,
"compute_logprob": False,
"end_strings": self.cfg.inference.get('end_strings', ["<|endoftext|>"]),
}
elif self.cfg.get("report_validation_metric", False) and not hasattr(self.cfg, 'inference'):
raise ValueError("Must provide inference parameters for reporting validation metric!")
@@ -754,6 +755,7 @@ def predict_step(self, batch: Any, batch_idx: int, dataloader_idx: Optional[int]
"all_probs": inference_config["all_probs"],
"compute_logprob": inference_config["compute_logprob"],
"compute_attention_mask": inference_config.get("compute_attention_mask", True),
"end_strings": inference_config.get('end_strings', ["<|endoftext|>"]),
}

task_ids, processed_inputs = batch
@@ -390,6 +390,7 @@ def inference_step(self, dataloader_iter, batch_idx, mode, dataloader_idx=0):
"add_BOS": False,
"all_probs": False,
"compute_logprob": False,
"end_strings": ["<|endoftext|>"],
}
result = megatron_gpt_generate(
model=self,
18 changes: 9 additions & 9 deletions nemo/collections/nlp/modules/common/text_generation_server.py
@@ -141,6 +141,14 @@ def put(self):
if not (1.0 <= repetition_penalty):
return "repetition_penalty must be a positive number no less than 1.0"

end_strings = ['<|endoftext|>']
if 'end_strings' in request.get_json():
end_strings = request.get_json()['end_strings']
if not isinstance(end_strings, list):
return "expect end_strings to be a list of strings"
if not all([isinstance(s, str) for s in end_strings]):
return "expect end_strings to be a list of strings"

min_tokens_to_generate = 0
if "min_tokens_to_generate" in request.get_json():
min_tokens_to_generate = request.get_json()["min_tokens_to_generate"]
@@ -157,14 +165,6 @@ def put(self):
if neighbors < 0:
return "num of neighbors must be an integer no less than 0"

end_strings = ['<|endoftext|>']
if 'end_strings' in request.get_json():
end_strings = request.get_json()['end_strings']
if not isinstance(end_strings, list):
return "expect end_strings to be a list of strings"
if not all([isinstance(s, str) for s in end_strings]):
return "expect end_strings to be a list of strings"

with lock: # Need to get lock to keep multiple threads from hitting code
MegatronGenerate.send_do_generate() # Tell other ranks we're doing generate
extra = {}
@@ -190,8 +190,8 @@
top_p,
greedy,
repetition_penalty,
min_tokens_to_generate,
end_strings=end_strings,
min_tokens_to_generate=min_tokens_to_generate,
**extra,
)
for k in output:
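With the validation above, a client may supply its own stop strings in the request body; if the field is omitted, the server falls back to ['<|endoftext|>']. A minimal client sketch, assuming the server is reachable locally and exposes its usual PUT /generate route (the port and route are assumptions, not shown in this diff):

import requests

# Hypothetical client for the text generation server; port and route are assumptions.
payload = {
    "sentences": ["Deep learning is"],
    "tokens_to_generate": 32,
    "end_strings": ["<|endoftext|>", "\n\n"],  # must be a list of strings or the request is rejected
}
resp = requests.put("http://localhost:5555/generate", json=payload)
print(resp.json()["sentences"])
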
9 changes: 6 additions & 3 deletions nemo/collections/nlp/modules/common/text_generation_utils.py
@@ -69,6 +69,7 @@ def get_default_sampling_params():
"add_BOS": True,
"all_probs": False,
"compute_logprob": False,
"end_strings": ["<|endoftext|>", "<extra_id_1>"],
}

return sampling_params
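
Note that this library-level default keeps both the end-of-text token and the "<extra_id_1>" delimiter. Callers that want plain GPT behaviour can start from these defaults and override just that key (a small sketch using the function edited in this hunk):

from nemo.collections.nlp.modules.common.text_generation_utils import get_default_sampling_params

params = get_default_sampling_params()
params["end_strings"] = ["<|endoftext|>"]  # keep only the end-of-text marker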
@@ -104,6 +105,7 @@ def megatron_gpt_generate(model, inputs, tokenizer, length_params, sampling_para
top_p=sampling_params['top_p'],
greedy=sampling_params['use_greedy'],
repetition_penalty=sampling_params['repetition_penalty'],
end_strings=sampling_params['end_strings'],
min_tokens_to_generate=length_params['min_length'],
compute_attention_mask=sampling_params.get("compute_attention_mask", True),
**strategy_args,
@@ -125,6 +127,7 @@ def megatron_gpt_generate(model, inputs, tokenizer, length_params, sampling_para
top_p=sampling_params['top_p'],
greedy=sampling_params['use_greedy'],
repetition_penalty=sampling_params['repetition_penalty'],
end_strings=sampling_params['end_strings'],
min_tokens_to_generate=length_params['min_length'],
**strategy_args,
)
@@ -380,8 +383,8 @@ def synced_generate(
compute_attention_mask=True,
compute_logprob=False,
repetition_penalty=1.2,
min_tokens_to_generate=0,
end_strings=[],
min_tokens_to_generate=0,
):
context_length = context_length_tensor.min().item()
tokenizer = model.tokenizer
@@ -475,8 +478,8 @@ def generate(
compute_attention_mask=True,
compute_logprob=False,
repetition_penalty=1.0,
min_tokens_to_generate=0,
end_strings=['<|endoftext|>'],
min_tokens_to_generate=0,
**strategy_args,
) -> OutputType:
"""
@@ -560,8 +563,8 @@
top_p=top_p,
greedy=greedy,
repetition_penalty=repetition_penalty,
min_tokens_to_generate=min_tokens_to_generate,
end_strings=end_strings,
min_tokens_to_generate=min_tokens_to_generate,
)
special_tokens = set()
if hasattr(tokenizer, 'pad_token') and tokenizer.pad_token is not None:
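The hunks above only thread end_strings through the call chain; the check that actually terminates a sequence lives in the generation strategy, which is not part of this diff. The idea is roughly the following (an illustrative sketch, not code from this commit):

def hit_end_string(generated_text: str, end_strings: list) -> bool:
    # A sequence is considered finished once its decoded continuation
    # ends with any of the configured stop strings.
    return any(generated_text.endswith(s) for s in end_strings)

assert hit_end_string("the capital of France.<|endoftext|>", ["<|endoftext|>"])
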
@@ -37,6 +37,7 @@ class SamplingParam(TypedDict):
add_BOS: bool # add the bos token at the begining of the prompt
all_probs: bool # whether return the log prob for all the tokens in vocab
compute_logprob: bool # a flag used to compute logprob of all the input text, a very special case of running inference, default False
end_strings: List[str] # generation will stop when one of these tokens is generated


class OutputType(TypedDict):
@@ -88,6 +89,7 @@ def generate(
add_BOS: bool, Whether add the bos token at the begining of the prompt
all_probs: bool # whether return the log prob for all the tokens in vocab
compute_logprob: bool # a flag used to compute logprob of all the input text, a very special case of running inference, default False
end_strings: List[str] # generation will stop when one of these tokens is generated
Default None, If it is None, use_greedy will be "True".
Returns:
OutputType: It generates the output in a dictionary type. It has the following keys:
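With the new key in place, a complete sampling configuration can be expressed as a type-checked dict and passed to the model's generate API. A sketch, assuming the TypedDicts live in NeMo's text_generation module and that model is a loaded GPT model:

from nemo.collections.nlp.modules.common.transformer.text_generation import LengthParam, SamplingParam

sampling_params: SamplingParam = {
    "use_greedy": True,
    "temperature": 1.0,
    "top_k": 0,
    "top_p": 0.9,
    "repetition_penalty": 1.2,
    "add_BOS": True,
    "all_probs": False,
    "compute_logprob": False,
    "end_strings": ["<|endoftext|>"],
}
length_params: LengthParam = {"min_length": 0, "max_length": 30}
# response = model.generate(["Deep learning is"], length_params, sampling_params)
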
1 change: 1 addition & 0 deletions tests/collections/nlp/test_gpt_eval.py
@@ -78,6 +78,7 @@ def test_gpt_eval(self):
"add_BOS": True,
"all_probs": False,
"compute_logprob": False,
"end_strings": ["<|endoftext|>"],
}

# test logprob
