This repository is the official implementation of the paper "Language Models as Inductive Reasoners".
[Arxiv version].
In general, with this repository you can:
(1) generate hypotheses with the CoLM framework, and
(2) display the results reported in the paper.
This repository will be updated soon.
Automatic evaluation (part of Table 4, full Table 5, and full Table 6):
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_gptj_12_5gene/ --generator_model_type gptj --if_long_or_short_facts 1 --cnt_facts_as_input 3 --if_full_or_missing_facts 0 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0

Human evaluation (part of Table 4):
python final_human_eval_result.py --output_dir ./Checkpoints/gptj_analysis_100test_newdata_newprompt_10 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0

Automatic evaluation (part of Table 4):
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_gptj_12_5gene/ --generator_model_type gptj --if_long_or_short_facts 1 --cnt_facts_as_input 3 --if_full_or_missing_facts 0 --setting_selection_M1_forM2M3 1 --setting_selection 3 --if_already_fintuned_for_test 1

Human evaluation (part of Table 4):
python final_human_eval_result.py --output_dir ./Checkpoints/gptj_analysis_100test_newdata_newprompt_10 --setting_selection_M1_forM2M3 1 --setting_selection 3 --if_already_fintuned_for_test 1
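For orientation, the sketch below shows the standard sentence-level BLEU computation with `nltk` that an automatic evaluation like the one above builds on. It is only an illustration, not the logic of `bleu_green_calculator_analysis.py` (the GREEN metric, checkpoint loading, and prompt handling are not reproduced), and the file names `generated_rules.txt` and `gold_rules.txt` are hypothetical placeholders.

```python
# Minimal illustration of sentence-level BLEU with NLTK.
# NOT the actual logic of bleu_green_calculator_analysis.py; the file names
# below are hypothetical placeholders.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction


def avg_bleu(hypotheses, references):
    """Average sentence-level BLEU of generated rules against gold rules."""
    smooth = SmoothingFunction().method1
    scores = [
        sentence_bleu([ref.split()], hyp.split(), smoothing_function=smooth)
        for hyp, ref in zip(hypotheses, references)
    ]
    return sum(scores) / max(len(scores), 1)


if __name__ == "__main__":
    with open("generated_rules.txt") as f_hyp, open("gold_rules.txt") as f_ref:
        hyps = [line.strip() for line in f_hyp]
        refs = [line.strip() for line in f_ref]
    print(f"avg BLEU: {avg_bleu(hyps, refs):.4f}")
```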
Long fact, 1 full fact:
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_gptj_12_5gene_1fact_long/ --generator_model_type gptj --if_long_or_short_facts 0 --cnt_facts_as_input 1 --if_full_or_missing_facts 0 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0

Short fact, 1 full fact:
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_gptj_12_5gene_1fact/ --generator_model_type gptj --if_long_or_short_facts 1 --cnt_facts_as_input 1 --if_full_or_missing_facts 0 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0

Short fact, 2 full facts:
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_gptj_12_5gene_2fact/ --generator_model_type gptj --if_long_or_short_facts 1 --cnt_facts_as_input 2 --if_full_or_missing_facts 0 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0

Short fact, 3 missing facts:
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_gptj_12_5gene_missingfacts/ --generator_model_type gptj --if_long_or_short_facts 1 --cnt_facts_as_input 3 --if_full_or_missing_facts 1 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0
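If you want to run all four fact ablations above in one go, a minimal runner like the following works; it only re-issues the commands listed above via `subprocess`, with every flag value copied verbatim from this README (the loop itself is the only addition).

```python
# Convenience runner for the four fact-ablation evaluations listed above.
# Flag values are copied verbatim from the commands in this README.
import subprocess

ABLATIONS = [
    # (output_dir, if_long_or_short_facts, cnt_facts_as_input, if_full_or_missing_facts)
    ("./Checkpoints/new_data_gptj_12_5gene_1fact_long/", 0, 1, 0),    # long fact, 1 full fact
    ("./Checkpoints/new_data_gptj_12_5gene_1fact/", 1, 1, 0),         # short fact, 1 full fact
    ("./Checkpoints/new_data_gptj_12_5gene_2fact/", 1, 2, 0),         # short fact, 2 full facts
    ("./Checkpoints/new_data_gptj_12_5gene_missingfacts/", 1, 3, 1),  # short fact, 3 missing facts
]

for output_dir, long_or_short, cnt_facts, full_or_missing in ABLATIONS:
    cmd = [
        "python", "bleu_green_calculator_analysis.py",
        "--output_dir", output_dir,
        "--generator_model_type", "gptj",
        "--if_long_or_short_facts", str(long_or_short),
        "--cnt_facts_as_input", str(cnt_facts),
        "--if_full_or_missing_facts", str(full_or_missing),
        "--setting_selection_M1_forM2M3", "1",
        "--setting_selection", "2",
        "--if_already_fintuned_for_test", "0",
    ]
    subprocess.run(cmd, check=True)
```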
Automatic evaluation with LLaMA as the generator (short facts, 3 full facts):
python bleu_green_calculator_analysis.py --output_dir ./Checkpoints/new_data_llama_12_5gene_capitalYesNo/ --generator_model_type llama --if_long_or_short_facts 1 --cnt_facts_as_input 3 --if_full_or_missing_facts 0 --setting_selection_M1_forM2M3 1 --setting_selection 2 --if_already_fintuned_for_test 0
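For completeness, here is a rough guess at the command-line interface shared by these calls, reconstructed only from the flags shown in this README using Python's standard `argparse`. It is an assumption, not the actual parser in the repository; the `choices` and help strings are inferred from the command labels above.

```python
# Assumed shape of the CLI used by the commands above, inferred from this README.
# This is a guess, not the actual argument parser of bleu_green_calculator_analysis.py.
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Automatic evaluation of generated rules (assumed interface)."
    )
    parser.add_argument("--output_dir", type=str, required=True,
                        help="Checkpoint / result directory to evaluate.")
    parser.add_argument("--generator_model_type", type=str, choices=["gptj", "llama"],
                        help="Generator used to produce the rules (values seen in this README).")
    parser.add_argument("--if_long_or_short_facts", type=int, choices=[0, 1],
                        help="Inferred meaning: 0 = long facts, 1 = short facts.")
    parser.add_argument("--cnt_facts_as_input", type=int,
                        help="Number of facts given as input (1, 2, or 3 in this README).")
    parser.add_argument("--if_full_or_missing_facts", type=int, choices=[0, 1],
                        help="Inferred meaning: 0 = full facts, 1 = missing facts.")
    parser.add_argument("--setting_selection_M1_forM2M3", type=int)
    parser.add_argument("--setting_selection", type=int)
    parser.add_argument("--if_already_fintuned_for_test", type=int, choices=[0, 1])
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```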