GraphRAG Example

An experiment to compare "regular" (hereafter referred to as baseline) RAG and GraphRAG.

Experiment

I indexed my favourite plasma physics paper, by Alex Schekochihin, in two different ways. Before indexing, I extracted the PDF into a single text file using the code in data_extraction.py.
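For readers who want to try the extraction step, here is a minimal sketch of one way to turn a PDF into a single text file. It is not the actual contents of data_extraction.py: the use of the third-party pypdf package, the function names `normalise_page_text` and `pdf_to_text`, and the cleanup rules are all assumptions made for illustration.

```python
import re
from pathlib import Path


def normalise_page_text(text: str) -> str:
    """Tidy raw text extracted from one PDF page (assumed cleanup rules)."""
    # Re-join words split across a line break, e.g. "gyro-\nkinetics".
    # Caveat: this also merges genuinely hyphenated words that happen to wrap.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Collapse three or more newlines into one blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def pdf_to_text(pdf_path: str, txt_path: str) -> int:
    """Write all pages of pdf_path into txt_path; return the page count."""
    # pypdf is a third-party package (pip install pypdf); it is imported
    # lazily so the helper above can be used without it.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    pages = [normalise_page_text(page.extract_text() or "") for page in reader.pages]
    Path(txt_path).write_text("\n\n".join(pages), encoding="utf-8")
    return len(pages)
```

With pypdf installed, `pdf_to_text("0704.0044v4.pdf", "paper.txt")` would write the text file; the output name `paper.txt` is hypothetical.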

Baseline RAG -- chunking the text, calculating embeddings and storing them in Pinecone
GraphRAG -- indexing the text using GraphRAG's built-in indexing pipeline
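The baseline steps (chunk, embed, store) can be sketched as follows. This is a self-contained illustration under stated assumptions, not the code in baseline_rag.py: `toy_embed` is a hashed bag-of-words stand-in for a real embedding model, a plain Python list stands in for the Pinecone index, and the chunk size, overlap and record layout are hypothetical choices.

```python
import hashlib
import math


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of chunk_size words that overlap by `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def toy_embed(text: str, dim: int = 256) -> list[float]:
    """Stand-in embedding: hashed bag-of-words, L2-normalised.

    A real pipeline would call an embedding model here instead.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_index(chunks: list[str]) -> list[dict]:
    """Build in-memory records; a vector database would store these instead."""
    return [
        {"id": f"chunk-{i}", "values": toy_embed(chunk), "metadata": {"text": chunk}}
        for i, chunk in enumerate(chunks)
    ]
```

Overlapping windows are used so that a sentence cut at a chunk boundary still appears whole in at least one chunk; whether the original script does this is not stated.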

I then ran queries against both.
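On the baseline side, a query typically means embedding the question, ranking the stored chunks by similarity, and passing the best matches to a language model as context. The sketch below shows only the retrieval and prompt-building part, assuming records shaped as `{"id", "values", "metadata": {"text"}}` and a query vector produced by the same embedding function as the chunks. The function names and the prompt wording are hypothetical, and the call to the language model is left out.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vector: list[float], records: list[dict], top_k: int = 3) -> list[dict]:
    """Return the top_k records most similar to the query vector."""
    ranked = sorted(
        records,
        key=lambda r: cosine_similarity(query_vector, r["values"]),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, records: list[dict]) -> str:
    """Assemble the retrieved chunk texts and the question into one prompt."""
    context = "\n\n".join(r["metadata"]["text"] for r in records)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Because only the few chunks closest to the question are retrieved, this approach sees a small slice of the paper per query, which is worth keeping in mind for broad questions about the whole document.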

The query used was "What are the main themes of this article?". The responses are saved in baseline_rag_response.png and graphrag_response.png.
