libowen2121/VI-dependency-syntax

Code for 'Dependency Grammar Induction with a Neural Variational Transition-based Parser' (AAAI2019)

Preprocessing:

Brown Clustering
After clustering, add two extra fields (the cluster index and the token's index within its cluster) to the UD/WSJ dataset
Customized TorchText 0.2.3
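
The two extra fields described above can be derived from a clustering output roughly as follows. This is a minimal sketch, not the repository's preprocessing code: it assumes the clustering tool writes one `bitstring<TAB>word<TAB>count` line per word (a common "paths" layout for Brown clustering tools), that the word form sits in column 1 of each dataset row, and that cluster indices are assigned in order of first appearance. The names `build_cluster_fields` and `annotate_row` are hypothetical.

```python
from collections import defaultdict


def build_cluster_fields(paths_lines):
    """Map each word to (cluster index, token index inside that cluster).

    Assumes each line is 'bitstring<TAB>word<TAB>count' (an assumed layout).
    """
    cluster_ids = {}                  # bit string -> cluster index
    cluster_sizes = defaultdict(int)  # cluster index -> words seen so far
    fields = {}                       # word -> (cluster index, token index)
    for line in paths_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        bits, word, _count = line.split("\t")
        # A new bit string gets the next free cluster index.
        cid = cluster_ids.setdefault(bits, len(cluster_ids))
        if word not in fields:
            fields[word] = (cid, cluster_sizes[cid])
            cluster_sizes[cid] += 1
    return fields


def annotate_row(row, fields, form_column=1):
    """Return a copy of a dataset row with the two extra fields appended.

    form_column is an assumption about where the word form is stored.
    Words missing from the clustering raise KeyError in this sketch.
    """
    cid, tid = fields[row[form_column]]
    return row + [str(cid), str(tid)]


# Toy input in the assumed format.
paths = ["0\tthe\t10", "0\ta\t7", "10\tdog\t3", "10\tcat\t2", "11\truns\t1"]
fields = build_cluster_fields(paths)
annotated = annotate_row(["1", "dog", "dog", "NOUN"], fields)
```

How out-of-vocabulary words and casing are handled is left open here; the actual preprocessing may treat them differently.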

Since the WSJ corpus is not publicly available, only the training and evaluation scripts for UD are described below.

Supervised training (for UD)

Train the encoder: ./ud_scripts/ud_train_encoder.sh
Train the decoder: ./ud_scripts/ud_train_decoder.sh
Note:
    Do not set a length limit during preprocessing, so that the full vocabulary is kept;
    Set the random seed to -1

Weakly-/Un-supervised training (for UD)

Rule settings:
    Universal Rules: --pr_fname "./data/pr_rules/ud_c/"$LANGUAGE"_0.5.txt"
    Weakly Supervised: --pr_fname "./data/pr_rules/ud_c/"$LANGUAGE"_10_gt.txt"

Pretrain: cd ud_scripts && ./ud_pre.sh
Finetune: cd ud_scripts && ./ud_ft.sh

Evaluation (for UD)

cd ud_scripts && ./ud_test.sh
