DFA extractor

Extracting DFAs from RNNs using learning and state merging techniques.

Git LFS

This project uses Git LFS to track the .th model files. See the Git LFS documentation for more info.

To train an RNN language model:

python train_rnn.py

Tomita 6 and Tomita 7 should both reach 100% dev accuracy within 2 epochs with the default seed:

python train_rnn.py --lang=Tom6 --n_train=100000
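For reference, the two target languages can be written as short membership tests. This is a minimal sketch based on the standard definitions of the Tomita grammars over the alphabet {0, 1}; the function names `tomita6` and `tomita7` are hypothetical and are not taken from languages.py, which may encode the languages differently.

```python
import re

def tomita6(s: str) -> bool:
    """Tomita 6: (number of 0s - number of 1s) is a multiple of 3."""
    return (s.count("0") - s.count("1")) % 3 == 0

# Tomita 7: strings of the form 0*1*0*1*.
_TOMITA7 = re.compile(r"0*1*0*1*")

def tomita7(s: str) -> bool:
    """Tomita 7: the whole string matches 0*1*0*1*."""
    return _TOMITA7.fullmatch(s) is not None

if __name__ == "__main__":
    for w in ["", "0", "01", "000", "0101", "01010"]:
        print(repr(w), tomita6(w), tomita7(w))
```

Both languages are regular, so each has an exact DFA against which an extracted automaton can be compared.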

To extract a DFA from an RNN and plot accuracy against the number of training examples in {1, ..., 20} (does not require openfst):

python extract_dfa.py --lang=Tom6 --n_train_low=2 --n_train_high=20
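State-merging learners usually start from a prefix tree built over labeled strings and then merge compatible states. The sketch below shows only that first step, under stated assumptions: the `Node` class and the helpers `build_prefix_tree`, `lookup`, and `count_states` are hypothetical names for illustration, not the API of trie.py or extract_dfa.py, and the labels are placeholder values.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Tuple

@dataclass
class Node:
    # True/False if a sample ends here, None if this prefix is unlabeled.
    label: Optional[bool] = None
    children: Dict[str, "Node"] = field(default_factory=dict)

def build_prefix_tree(samples: Iterable[Tuple[str, bool]]) -> Node:
    """Insert each (string, accepted) pair, creating one state per prefix."""
    root = Node()
    for string, accepted in samples:
        node = root
        for symbol in string:
            if symbol not in node.children:
                node.children[symbol] = Node()
            node = node.children[symbol]
        node.label = accepted
    return root

def lookup(root: Node, string: str) -> Optional[bool]:
    """Return the stored label, or None if the string was never labeled."""
    node: Optional[Node] = root
    for symbol in string:
        node = node.children.get(symbol)
        if node is None:
            return None
    return node.label

def count_states(node: Node) -> int:
    """Number of states in the tree rooted at node."""
    return 1 + sum(count_states(child) for child in node.children.values())

if __name__ == "__main__":
    samples = [("", True), ("01", True), ("0", False), ("011", False)]
    tree = build_prefix_tree(samples)
    print(lookup(tree, "01"), lookup(tree, "1"), count_states(tree))
```

A merging step would then collapse states whose labels and outgoing transitions do not conflict, which shrinks the tree into a smaller automaton that generalizes beyond the samples.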

To train an RNN for 100 epochs and save all checkpoints:

python train_rnn.py --save_name=Tom7-100 --lang=Tom7 --save_all --n_epochs=100 --stop_threshold=1000
