This document explains how to reproduce the results of the methods listed in our table under a unified setting. For the specific settings and an explanation of each method, please refer to the implementation details. We recommend gaining a basic understanding of the repository first, which is covered in the introduction for beginners.
- Install FlashRAG and dependencies
- Download Llama3-8B-instruct, E5-base-v2
- Download datasets (you can download from our repo: here)
- Download retrieval corpus (from here)
- Build index for retrieval, using E5 for embedding (see how to build index?)
All the code used is based on the repository's example/methods. We have set appropriate hyperparameters for various methods. If you need to adjust them yourself, you can refer to the config dictionary provided for each method and the original papers of each method.
First, fill in the paths of the downloaded files in my_config.yaml. Specifically, you need to fill in the following four fields:
- model2path: Replace the paths of E5 and Llama3-8B-instruct models with your own paths
- method2index: Fill in the path of the index file built using E5
- corpus_path: Fill in the path of the Wikipedia corpus file in jsonl format
- data_dir: Change to the download path of your own dataset
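Before launching an experiment, it can help to confirm that the four fields above are filled in and point to existing files or directories. The following is a minimal sketch, not part of the repository: it assumes the YAML file has already been loaded into a plain dictionary, that model2path and method2index are mappings from a name to a path, and that the key "e5" used in the example is only a placeholder. The function name is hypothetical.

```python
from pathlib import Path

# The four fields named in the text above; everything else here is hypothetical.
REQUIRED_FIELDS = ("model2path", "method2index", "corpus_path", "data_dir")


def find_config_problems(config: dict) -> list:
    """Return human-readable problems found in a loaded config dictionary."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = config.get(field)
        if not value:
            problems.append(f"{field}: missing or empty")
            continue
        # Assumption: model2path and method2index map a name to a path,
        # while corpus_path and data_dir are plain path strings.
        paths = value.values() if isinstance(value, dict) else [value]
        for path in paths:
            # Unset entries inside a mapping are skipped; they may simply be unused.
            if path and not Path(str(path)).exists():
                problems.append(f"{field}: path does not exist: {path}")
    return problems


if __name__ == "__main__":
    example = {
        "model2path": {"e5": "/path/to/e5-base-v2"},  # placeholder key and path
        "method2index": {"e5": "/path/to/e5_index"},  # placeholder key and path
        "corpus_path": "/path/to/corpus.jsonl",
        # data_dir is intentionally left out to show the "missing" report.
    }
    for problem in find_config_problems(example):
        print(problem)
```

An empty result list means all four fields are present and every configured path exists on disk.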
For some methods that require the use of additional models, extra steps are required. We will introduce the methods that need extra steps below. If you know that the method you want to run does not need these steps, you can skip directly to the third section.
AAR requires a new retriever, so you need to download the retriever and build its index.
- Additional Step1: Download AAR-Contriever (from here)
- Additional Step2: Build the index for AAR-Contriever (note that the pooling method should be 'mean')
- Additional Step3: Modify the index_path and model2path in the AAR function in run_exp.py.
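As background for the pooling note in Additional Step2: mean pooling turns the per-token embeddings of a passage into a single vector by averaging over the non-padding tokens. The sketch below shows only that idea in plain Python. It is not the repository's index-building code, and the function name is hypothetical.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the embeddings of non-padding tokens for one sequence.

    token_embeddings: list of equal-length float vectors, one per token.
    attention_mask:   list of 0/1 flags, 1 for real tokens and 0 for padding.
    """
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vector):
                totals[i] += x
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in totals]


if __name__ == "__main__":
    # Two real tokens followed by one padding token that must be ignored.
    embeddings = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
    mask = [1, 1, 0]
    print(mean_pool(embeddings, mask))
```

If the index is built with a different pooling method than the one the retriever expects, the passage vectors and query vectors are no longer comparable, which is why the note matters.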
LLMLingua requires downloading Llama2-7B.
- Additional Step1: Download Llama2-7B (from here)
- Additional Step2: Modify the refiner_model_path in the llmlingua function in run_exp.py
RECOMP requires downloading three checkpoints trained by the authors (trained on NQ, TQA, and HotpotQA respectively).
- Additional Step1: Download the authors' checkpoints (NQ Model, TQA Model, HotpotQA Model)
- Additional Step2: Fill in the downloaded model paths in the model_dict of the recomp function
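The model_dict mentioned in Additional Step2 can be pictured as a mapping from dataset to checkpoint path. The sketch below is an illustration under assumptions, not the code in run_exp.py: the keys "nq", "tqa", and "hotpotqa", the placeholder paths, and the helper name are all hypothetical, and the real function may handle unknown datasets differently.

```python
# Hypothetical keys and placeholder paths; replace the paths with the
# locations of the three downloaded checkpoints.
model_dict = {
    "nq": "/path/to/recomp_nq_checkpoint",
    "tqa": "/path/to/recomp_tqa_checkpoint",
    "hotpotqa": "/path/to/recomp_hotpotqa_checkpoint",
}


def select_checkpoint(dataset_name, checkpoints):
    """Return the checkpoint path configured for a dataset, or fail loudly."""
    try:
        return checkpoints[dataset_name]
    except KeyError:
        available = ", ".join(sorted(checkpoints))
        raise KeyError(
            f"no checkpoint configured for dataset {dataset_name!r}; "
            f"available: {available}"
        ) from None


if __name__ == "__main__":
    print(select_checkpoint("nq", model_dict))
```

Failing on an unknown dataset, rather than silently falling back to another checkpoint, makes it obvious when a run would otherwise use a model trained on different data.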
Selective-Context requires downloading GPT2.
- Additional Step1: Download GPT2 (from here)
- Additional Step2: Modify the refiner_model_path in the sc function in run_exp.py
Ret-Robust requires downloading the LoRA weights trained by the authors, as well as the Llama2-13B model that the LoRA weights are loaded onto.
- Additional Step1: Download Llama2-13B (from here)
- Additional Step2: Download the authors' LoRA weights, trained on NQ (from here) and on 2WikiMultihopQA (from here)
- Additional Step3: Modify the corresponding LoRA paths in the model_dict of the retrobust function and the Llama2-13B path in my_config.yaml
We recommend adjusting the single_hop parameter of the SelfAskPipeline for each dataset; it controls whether the query is decomposed. For NQ, TQA, PopQA, and WebQ, we set single_hop to True.
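The per-dataset choice described above can be captured in a small helper. This is a sketch under assumptions: the lowercase dataset identifiers and the function name are hypothetical, and returning False for every other dataset is an inference from the text rather than something it states.

```python
# Datasets for which the text above sets single_hop to True.
# The lowercase identifiers are assumptions; match them to your own dataset names.
SINGLE_HOP_DATASETS = {"nq", "tqa", "popqa", "webq"}


def use_single_hop(dataset_name: str) -> bool:
    """Return the single_hop value suggested for a dataset.

    Assumption: datasets not listed above get False.
    """
    return dataset_name.strip().lower() in SINGLE_HOP_DATASETS


if __name__ == "__main__":
    for name in ("nq", "hotpotqa"):
        print(name, use_single_hop(name))
```

The returned value would then be passed as the single_hop argument when the SelfAskPipeline is constructed.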
SKR requires an embedding model and training data that are used during the inference stage. We provide the training data given by the authors. If you wish to use your own training data, you can generate it by following the format of the provided data and the original paper.
- Additional Step1: Download the embedding model (from here)
- Additional Step2: Download the training data (from here)
- Additional Step3: Fill in the embedding model path in the model_path of the skr function
- Additional Step4: Fill in the training data path in the training_data_path of the skr function
Self-RAG requires a trained model and currently only supports running in the vllm framework.
- Additional Step1: Download the Self-RAG model (from 7B model, 13B model)
- Additional Step2: Modify the generator_model_path in the selfrag function.
Spring requires a virtual token embedding file and currently only supports running in the hf framework.
- Additional Step1: Download virtual token embedding file from official repo
- Additional Step2: Modify the token_embedding_path in the spring function.
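Two of the methods described above are tied to a specific generation framework: the selfrag function supports only vllm, and the spring function supports only hf. A guard like the following can catch a mismatch before any model is loaded. It is a sketch, not repository code: the mapping keys reuse the function names from the text, while the helper name and the idea of passing the framework as a string are assumptions.

```python
# Framework constraints stated in the text above, keyed by function name.
REQUIRED_FRAMEWORK = {
    "selfrag": "vllm",
    "spring": "hf",
}


def check_framework(method_name: str, framework: str) -> str:
    """Return the framework unchanged, or raise if the method cannot use it."""
    required = REQUIRED_FRAMEWORK.get(method_name)
    if required is not None and framework != required:
        raise ValueError(
            f"{method_name} currently only supports the {required} framework, "
            f"got {framework!r}"
        )
    return framework


if __name__ == "__main__":
    print(check_framework("selfrag", "vllm"))
    print(check_framework("naive", "hf"))  # methods without a constraint pass through
```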
Adaptive-RAG requires a classifier to classify the query. Since the authors did not provide an official checkpoint, we used a checkpoint trained by others and published on Huggingface for the experiment, which may cause results to differ from those in the original paper.
If an official open-source checkpoint is released in the future, we will update the experimental results.
- Additional Step1: Download classifier model from huggingface repo (not official): illuminoplanet/combined_flan_t5_xl_classifier
- Additional Step2: Modify the model_path in the adaptive function.
Run the experiment on the NQ dataset using the following command.
python run_exp.py --method_name 'naive' \
--split 'test' \
--dataset_name 'nq' \
--gpu_id '0,1,2,3' \
--ms_gpu_id '0'
The method can be selected from the following:
naive zero-shot AAR-contriever llmlingua recomp selective-context sure replug skr flare iterretgen ircot trace
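To run several methods in a row, the command above can be assembled programmatically. The sketch below only builds and prints the commands using the flags shown above. The helper name is hypothetical, the default values are copied from the example command, and the method list is copied from the line above, so it may not cover every method this document describes.

```python
import shlex

# Method names copied from the list above.
METHODS = [
    "naive", "zero-shot", "AAR-contriever", "llmlingua", "recomp",
    "selective-context", "sure", "replug", "skr", "flare",
    "iterretgen", "ircot", "trace",
]


def build_run_command(method_name, dataset_name="nq", split="test",
                      gpu_id="0,1,2,3", ms_gpu_id="0"):
    """Build the argument list for one run_exp.py invocation."""
    return [
        "python", "run_exp.py",
        "--method_name", method_name,
        "--split", split,
        "--dataset_name", dataset_name,
        "--gpu_id", gpu_id,
        "--ms_gpu_id", ms_gpu_id,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for method in METHODS:
        print(shlex.join(build_run_command(method)))
```

To execute a command rather than print it, the argument list can be passed to subprocess.run from the directory that contains run_exp.py.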