Self-Modifying State Modeling for Simultaneous Machine Translation

Source Code for ACL 2024 main conference paper "Self-Modifying State Modeling for Simultaneous Machine Translation".

Our model is implemented on top of the open-source toolkit Fairseq and the open-source code of ITST.

Requirements and Installation

  • Python >= 3.7.10

  • torch >= 1.13.0

  • sacrebleu == 1.5.0

  • Install Fairseq with the following commands:

    git clone https://github.com/EurekaForNLP/SM2.git
    cd SM2
    pip install --editable ./

Quick Start

Data Processing
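
The training and inference scripts expect binarized Fairseq data. As a minimal sketch (the language pair, file paths, and worker count below are assumptions; adapt them to your dataset), standard Fairseq preprocessing looks like:

    # Binarize a tokenized parallel corpus (paths and languages are placeholders)
    fairseq-preprocess \
        --source-lang de --target-lang en \
        --trainpref data/train --validpref data/valid --testpref data/test \
        --destdir data-bin/deen \
        --workers 8

If you plan to train with --share-all-embeddings, add --joined-dictionary here so that the source and target sides share a single vocabulary.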

Training

Use train_sm2.sh to train SM$^2$; a sketch of a training command follows the list. Note that:

  • Use --arch transformer_with_sm2_unidirectional to train SM$^2$ with a unidirectional encoder.
  • If your device supports bf16, adding --bf16 is suggested.
  • If the source and target languages share embeddings, use --share-all-embeddings.
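
train_sm2.sh is the authoritative recipe; the sketch below only illustrates how the flags above combine in a Fairseq-style training command. The data path, save directory, and optimization settings are assumptions, not the paper's hyperparameters:

    # Hedged sketch; see train_sm2.sh for the actual settings
    fairseq-train data-bin/deen \
        --arch transformer_with_sm2_unidirectional \
        --share-all-embeddings \
        --bf16 \
        --optimizer adam --adam-betas '(0.9, 0.98)' --lr 5e-4 \
        --lr-scheduler inverse_sqrt --warmup-updates 4000 \
        --max-tokens 8192 \
        --save-dir checkpoints/sm2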

Inference

Use test_sm2.sh to run simultaneous translation inference with --batch-size=1 and --beam=1; a sketch of the underlying generation call follows.
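
Assuming the script wraps a standard Fairseq generation call (the checkpoint and data paths below are placeholders; test_sm2.sh is authoritative), the decoding settings above correspond to:

    # Hedged sketch: simultaneous decoding is run sentence by sentence
    # (--batch-size 1) with greedy search (--beam 1)
    fairseq-generate data-bin/deen \
        --path checkpoints/sm2/checkpoint_best.pt \
        --batch-size 1 --beam 1 \
        --remove-bpe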
