v1.1 (25/01/2018)

  • New experimental Multi30kDataset and ImageFolderDataset classes
  • torchvision dependency added for CNN support
  • nmtpy-coco-metrics now computes a single METEOR score, without norm=True
  • The main loop mechanism is completely refactored, with
    backward-incompatible configuration option changes in the [train]
    section (an example section follows this list):
    • patience_delta option is removed
    • Added eval_batch_size to define batch size for GPU beam-search during training
    • The eval_freq default is now 3000, i.e. one evaluation every 3000 minibatches
    • eval_metrics now defaults to loss. As before, you can provide a list
      of metrics such as bleu,meteor,loss to compute all of them and
      early-stop based on the first one
    • Added eval_zero (default: False) which evaluates the model once on
      the dev set right before training starts. Useful as a sanity check
      when fine-tuning a model initialized with pre-trained weights
    • Removed save_best_n: we no longer save the N best models on the dev
      set w.r.t. the early-stopping metric
    • Added save_best_metrics (default: True) which saves the best model
      on the dev set w.r.t. each metric given in eval_metrics. This partly
      compensates for the removal of save_best_n
    • checkpoint_freq now defaults to 5000, i.e. a checkpoint every 5000
      minibatches
    • Added n_checkpoints (default: 5) to define how many of the most
      recent checkpoints are kept when checkpoint_freq > 0, i.e. when
      checkpointing is enabled
  • Added ExtendedInterpolation support to configuration files (see the
    configparser snippet after this list):
    • You can now define intermediate variables in .conf files to avoid
      typing the same paths over and over. A variable can be referenced
      from within its own section using the tensorboard_dir: ${save_path}/tb
      notation. Cross-section references are also possible: ${data:root}
      will be replaced by the value of the root variable defined in the
      [data] section.
  • Added -p/--pretrained to nmtpy train to initialize the model's
    weights from another .ckpt checkpoint.
  • Improved input/output handling for nmtpy translate (usage examples
    follow this list):
    • -s accepts a comma-separated list of test sets defined in the
      experiment's configuration file and translates them all at once.
      Example: -s val,newstest2016,newstest2017
    • The mutually exclusive counterpart of -s is -S, which receives a
      single input file of source sentences.
    • In both cases, an output prefix should now be provided with -o.
      With multiple test sets, the name of the test set and the beam size
      are appended to the output prefix. If you provide a single file
      with -S, the final output name only reflects the beam size.
  • Two new arguments for nmtpy-build-vocab:
    • -f: Also stores frequency counts inside the final JSON vocabulary
    • -x: Does not add special markers <eos>,<bos>,<unk>,<pad> into the vocabulary
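
A hypothetical [train] section putting the refactored options together
could look as follows. Values mirror the defaults stated above, except
eval_metrics and eval_batch_size which are purely illustrative (no
default is documented for eval_batch_size):

```ini
[train]
# evaluate every 3000 minibatches, early-stopping on bleu (the first metric)
eval_freq: 3000
eval_metrics: bleu,meteor,loss
# illustrative value: batch size for GPU beam-search during training
eval_batch_size: 32
# flip to True to evaluate once on the dev set before training starts
eval_zero: False
# save the best model w.r.t. each metric listed in eval_metrics
save_best_metrics: True
# checkpoint every 5000 minibatches, keeping only the last 5
checkpoint_freq: 5000
n_checkpoints: 5
```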
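
ExtendedInterpolation itself is a feature of Python's standard
configparser module, so the notation can be checked directly. The
snippet below, with made-up paths and option names, shows how both
reference styles resolve:

```python
from configparser import ConfigParser, ExtendedInterpolation

conf = """
[data]
root: /data/multi30k

[train]
save_path: /tmp/experiments/nmt
# same-section reference, expands to /tmp/experiments/nmt/tb
tensorboard_dir: ${save_path}/tb
# cross-section reference, expands to /data/multi30k/train.en
src: ${data:root}/train.en
"""

parser = ConfigParser(interpolation=ExtendedInterpolation())
parser.read_string(conf)
print(parser["train"]["tensorboard_dir"])  # /tmp/experiments/nmt/tb
print(parser["train"]["src"])              # /data/multi30k/train.en
```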
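
The two translate modes would be invoked roughly as follows. Only the
-s/-S/-o flags are documented in these notes; the remaining arguments
are placeholders, since their exact form is an assumption here:

```sh
# translate several test sets defined in the experiment's configuration
nmtpy translate -s val,newstest2016,newstest2017 -o hyps <experiment arguments>

# translate a single plain-text file of source sentences
nmtpy translate -S newstest2017.en -o hyps <experiment arguments>
```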

Layers/Architectures

  • Added a Fusion() layer to concatenate, sum or multiply an arbitrary
    number of inputs (an illustrative sketch follows this list)
  • Added an experimental ImageEncoder() layer to seamlessly plug a VGG
    or ResNet CNN using torchvision pretrained models (see the second
    sketch below)
  • Attention layer arguments are improved. You can now select the
    bottleneck dimensionality for MLP attention with att_bottleneck.
    Dot attention is still untested and probably broken.
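
The actual Fusion() signature is not given in these notes; the PyTorch
sketch below only illustrates the idea of a layer that fuses any number
of same-shaped inputs by concatenation, summation or elementwise
multiplication:

```python
import torch

class Fusion(torch.nn.Module):
    """Illustrative re-implementation, not the library's actual class."""

    def __init__(self, fusion_type="concat"):
        super().__init__()
        assert fusion_type in ("concat", "sum", "mul")
        self.fusion_type = fusion_type

    def forward(self, *inputs):
        if self.fusion_type == "concat":
            # concatenate along the feature (last) dimension
            return torch.cat(inputs, dim=-1)
        if self.fusion_type == "sum":
            return torch.stack(inputs).sum(dim=0)
        # elementwise product of all inputs
        out = inputs[0]
        for x in inputs[1:]:
            out = out * x
        return out

# fuse two (batch, dim) feature tensors by summation
fused = Fusion("sum")(torch.randn(8, 256), torch.randn(8, 256))
```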
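
Likewise, ImageEncoder() internals are not documented here, but wrapping
a pretrained torchvision CNN as a feature extractor typically amounts to
truncating its classifier head, as in this sketch:

```python
import torch
from torchvision import models

# drop the final avgpool + fc layers of a pretrained ResNet-18,
# keeping the convolutional feature maps
resnet = models.resnet18(pretrained=True)
features = torch.nn.Sequential(*list(resnet.children())[:-2])
features.eval()

with torch.no_grad():
    # (batch, 3, 224, 224) -> (batch, 512, 7, 7)
    feats = features(torch.randn(1, 3, 224, 224))
print(feats.shape)
```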

Changes in NMT

  • dec_init defaults to mean_ctx, i.e. the decoder is initialized with
    the mean context computed from the source encoder states (a sketch
    follows this list)
  • enc_lnorm, which was just a placeholder, is now removed since we do
    not provide layer normalization for now
  • Beam search now runs entirely on the GPU
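
A minimal sketch of what mean_ctx initialization amounts to, assuming
encoder states shaped (seq_len, batch, dim) and a 0/1 padding mask (the
function name and shapes are illustrative, not the library's API):

```python
import torch

def mean_ctx_init(ctx, mask):
    """Average encoder states over time, ignoring padded positions.

    ctx:  (seq_len, batch, dim) encoder hidden states
    mask: (seq_len, batch), 1 for real tokens, 0 for padding
    """
    mask = mask.unsqueeze(-1)                     # (seq_len, batch, 1)
    summed = (ctx * mask).sum(dim=0)              # (batch, dim)
    return summed / mask.sum(dim=0).clamp(min=1)  # mean over valid steps

h0 = mean_ctx_init(torch.randn(20, 8, 512), torch.ones(20, 8))
```

A real implementation would typically pass this mean through a small
feed-forward layer with a nonlinearity to produce the decoder's initial
hidden state.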