This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Update README.md
slightly add more instructions to the readme file in RAG
jianguoz authored Aug 4, 2021
1 parent d7dff60 commit 2fafacc
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions parlai/agents/rag/README.md
Expand Up @@ -133,6 +133,8 @@ python generate_dense_embeddings.py -mf zoo:hallucination/multiset_dpr/hf_bert_b
--outfile /tmp/wiki_passage_embeddings/wiki_passages --num-shards 50 --shard-id 0 -bs 32
```

Note that `--num-shards` should be a [small value](https://github.com/facebookresearch/ParlAI/blob/master/parlai/agents/rag/retrievers.py#L644) if the passages file has only a few entries; otherwise, a `NaN` error may be raised when the number of retrieved passages (`--n-docs`) is large.
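
As a rough sanity check before sharding, the sketch below picks a shard count that keeps every shard at least as large as the number of passages you plan to retrieve. The helper `pick_num_shards` and its heuristic are hypothetical and not part of ParlAI; the sketch assumes passages are split roughly evenly across shards and that the problem comes from shards holding fewer passages than `--n-docs`.

```python
def pick_num_shards(num_passages: int, n_docs: int, max_shards: int = 50) -> int:
    """Hypothetical helper (not part of ParlAI).

    Returns a value for --num-shards such that, under an even split,
    each shard holds at least `n_docs` passages.
    """
    if num_passages <= 0 or n_docs <= 0:
        raise ValueError("num_passages and n_docs must be positive")
    # Largest shard count that still leaves >= n_docs passages per shard.
    by_size = num_passages // n_docs
    # Never go below one shard or above the requested maximum.
    return max(1, min(max_shards, by_size))
```

For a passages file with only 100 entries and `--n-docs 5`, this suggests at most 20 shards; for a file with fewer entries than `--n-docs`, it falls back to a single shard.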

### 3. Index the Dense Embeddings

The final step is to build the full FAISS index from these dense embeddings. You can use the [`index_dense_embeddings.py`](https://github.com/facebookresearch/ParlAI/blob/master/parlai/agents/rag/scripts/index_dense_embeddings.py) script to achieve this. You can choose one of the following options when indexing your embeddings for varying results, depending on the size of your dataset:
