The status of NLLB support has been unclear since the switch from OpenNMT to Eole.
This PR facilitates the conversion of official HF NLLB models (e.g. https://huggingface.co/facebook/nllb-200-distilled-1.3B).
This should also facilitate re-enabling support of pre-trained seq2seq models such as BART or T5.
That being said, the current structure of `convert_HF` needs to be reviewed to better support the encoder/decoder duality. #156 was a first major step in making `convert_HF` more modular, and #153 introduced support for encoder keys, but we now need to meld all of this into a more robust logic (see the key-remapping sketch below).
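To make the duality concrete, here is a minimal sketch of the kind of regex-based key remapping this involves. The left-hand patterns follow the HF M2M100/NLLB checkpoint layout; the right-hand Eole-side names are hypothetical placeholders, not the actual naming used in `convert_HF`:

```python
import re

KEY_MAP = [
    # (HF pattern, Eole-style replacement) -- right-hand names are assumptions
    (r"^model\.encoder\.layers\.(\d+)\.self_attn\.(q|k|v)_proj\.",
     r"encoder.transformer_layers.\1.self_attn.linear_\2."),
    (r"^model\.decoder\.layers\.(\d+)\.encoder_attn\.(q|k|v)_proj\.",
     r"decoder.transformer_layers.\1.context_attn.linear_\2."),
]

def remap(hf_key: str) -> str:
    """Map one HF checkpoint key to its (assumed) Eole counterpart."""
    for pattern, repl in KEY_MAP:
        if re.match(pattern, hf_key):
            return re.sub(pattern, repl, hf_key)
    return hf_key  # fall through: key kept as-is

print(remap("model.encoder.layers.0.self_attn.q_proj.weight"))
# -> encoder.transformer_layers.0.self_attn.linear_q.weight (assumed target name)
```

The point is that decoder-only conversion could ignore the `model.encoder.*` and `encoder_attn` namespaces entirely; a robust seq2seq path has to handle all three attention flavors (encoder self-attn, decoder self-attn, cross-attn) explicitly.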
Also, we should probably define a better "HF settings deduction waterfall" in `build_config_dict` (see the sketch below), but I'm not sure there is any centralized repository of all the possible values, as they are defined per model.
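A minimal sketch of what such a centralized waterfall could look like. The candidate key lists are illustrative: HF configs do name the same hyperparameter differently per architecture (`hidden_size` for BERT-likes, `d_model` for BART/T5/NLLB, `n_embd` for GPT-2), and the toy config dict below is not the actual NLLB `config.json`:

```python
def deduce(hf_config: dict, candidates: list[str], default=None):
    """Return the value of the first candidate key present in an HF config dict."""
    for key in candidates:
        if key in hf_config:
            return hf_config[key]
    return default

# Toy NLLB-style config (d_model rather than hidden_size)
hf_config = {"d_model": 1024, "encoder_attention_heads": 16}

hidden_size = deduce(hf_config, ["hidden_size", "d_model", "n_embd"])  # -> 1024
heads = deduce(
    hf_config, ["num_attention_heads", "encoder_attention_heads", "n_head"]
)  # -> 16
```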
Note: HF tokenization was not tested yet; there might be some edge cases to tackle with the prefix transform.

EDIT: HF tokenization patched in 3fd59e6.
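For reference, a quick sanity-check snippet (HF `transformers`, not Eole code) showing the prefix behavior the prefix transform has to reproduce: the NLLB tokenizer emits the source language code as a prefix token and closes the sequence with `</s>`. The exact subword split shown in the comment is approximate:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(
    "facebook/nllb-200-distilled-1.3B", src_lang="eng_Latn"
)
ids = tok("Hello world")["input_ids"]
print(tok.convert_ids_to_tokens(ids))
# expected shape: ['eng_Latn', '▁Hello', '▁world', '</s>']
```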