Releases: zetaalphavector/RAGElo
v0.1.8
New features:
The `Query` object now supports two new methods for easier evaluation of your retrieval pipeline:
- `query.get_runs()` returns a dictionary of TREC-style runs for all the agents that retrieved documents for that query (the mapping is agent_id -> query_id -> document_id -> retrieval_score).
- `query.get_qrels()` returns a TREC-style qrels dictionary with the judgement scores assigned by an Evaluator (the mapping is query_id -> document_id -> relevance).
You can explore how these two methods work in the new example notebook, which uses the `ir-measures` package.
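For instance, here is a minimal sketch of scoring each agent's run with `ir-measures` (the aggregation loop and the nDCG@10 metric are illustrative, and assume the queries have already been judged by an Evaluator):

```python
import ir_measures
from ir_measures import nDCG

# `queries` is assumed to be a list of Query objects that already have
# retrieved documents and Evaluator judgements attached.
qrels = {}  # query_id -> document_id -> relevance
runs = {}   # agent_id -> query_id -> document_id -> retrieval_score
for query in queries:
    qrels.update(query.get_qrels())
    for agent, run in query.get_runs().items():
        runs.setdefault(agent, {}).update(run)

# Score each agent's run against the LLM-generated qrels.
for agent, run in runs.items():
    print(agent, ir_measures.calc_aggregate([nDCG @ 10], qrels, run))
```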
Another addition (by @RodrJ106) is a new `LLMProvider` for Ollama! Now you can also run RAGElo locally, without the need to call an external provider. Thanks!
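A minimal sketch of what this could look like (the `"ollama"` provider name is an assumption based on this release, and the prompt is only illustrative):

```python
from ragelo import get_retrieval_evaluator

# Assumes a local Ollama server is running; "ollama" as the llm_provider
# value is an assumption based on this release.
evaluator = get_retrieval_evaluator(
    "custom_prompt",
    llm_provider="ollama",
    prompt="Query: {q}\nDocument: {d}\nAnswer 1 if the document is relevant to the query, 0 otherwise.",
    query_placeholder="q",
    document_placeholder="d",
)
```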
A potential breaking change is that the `retrieved_docs` and `answers` attributes of the `Query` object are now dictionaries instead of lists (mapping the document id or the agent name, respectively, to the actual object). This was done to better support future changes where RAGElo relies less on CSV files everywhere and instead saves and serializes its internal state as a dictionary, until the user actually asks for an output as a CSV.
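In practice, code that iterated over these attributes as lists needs to switch to dictionary access, roughly like this (a sketch; only the attribute names come from this release):

```python
# Up to 0.1.7, retrieved_docs and answers were lists:
#   for doc in query.retrieved_docs: ...
# From 0.1.8 on, they are dictionaries keyed by document id / agent name:
for doc_id, doc in query.retrieved_docs.items():
    print(doc_id, doc)
for agent, answer in query.answers.items():
    print(agent, answer)
```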
What's Changed
- Add missing f-string to warning. by @RodrJ106 in #38
- Add ollama as new llm provider by @RodrJ106 in #39
- Remove extra domain sentence by @din0s in #40
- Add get_qrels and get_runs for Queries by @ArthurCamara in #41
New Contributors
Full Changelog: 0.1.7...0.1.8
0.1.7
v0.1.6
What's Changed
- Fix issue with RDNAM parsing of answer by @matprst in #32
- docs: update README.md by @eltociear in #33
- Elo Ranker returns dictionary with agents scores by @ArthurCamara in #34
New Contributors
- @matprst made their first contribution in #32
- @eltociear made their first contribution in #33
Full Changelog: 0.1.5...0.1.6
v0.1.5
Adds support for Python >= 3.8
What's Changed
- Support Python 3.8 by @ArthurCamara in #29
Full Changelog: 0.1.3...0.1.5
v0.1.4
Hotfix for Python 3.10
0.1.2
Main changes:
- OpenAI calls are much faster now and can be done in parallel.
- The pairwise answer evaluations are easier to use and more configurable.
- A new PairwiseExpertAnswerEvaluator evaluator was added.
- Added a notebook with examples of using RAGElo as a library.
What's Changed
- Added parallel calls to OpenAI with asyncio by @ArthurCamara in #21
- Change from aiohttp sessions to using OpenAI's Async clients. by @ArthurCamara in #22
- Improve batching by @ArthurCamara in #25
- Refactor pairwise answer eval by @frejonb in #26
- Notebook example by @ArthurCamara in #27
Full Changelog: 0.1.1...0.1.2
v0.1
RAGElo goes 0.1!
In this release, RAGElo as a library was completely revamped, with a much easier-to-use unified interface and simpler commands (`evaluate` and `batch_evaluate`). Now using an Evaluator is as simple as calling `evaluator.evaluate("query", "document")`.
Custom Evaluators and metadata support
Not a fan of the existing evaluators? Now both Retrieval and Answer evaluators support fully custom prompts using the `RetrievalEvaluator.CustomPromptEvaluator` and `AnswerEvaluator.CustomPromptEvaluator`, respectively.
As part of the custom evaluators, RAGElo now also supports custom metadata injection into your prompts! Want to include the current timestamp in your evaluator? Add a `{today_date}` placeholder to the prompt and pass it as metadata to the `evaluate` method:
```python
from ragelo import get_retrieval_evaluator

prompt = """You are a helpful assistant for evaluating the relevance of a retrieved document to a user query.
You should pay extra attention to how **recent** a document is. A document older than 5 years is considered outdated.
The answer should be evaluated according to its recency, truthfulness, and relevance to the user query.
User query: {q}
Retrieved document: {d}
The document has a date of {document_date}.
Today is {today_date}.
WRITE YOUR ANSWER ON A SINGLE LINE AS A JSON OBJECT WITH THE FOLLOWING KEYS:
- "relevance": 0 if the document is irrelevant, 1 if it is relevant.
- "recency": 0 if the document is outdated, 1 if it is recent.
- "truthfulness": 0 if the document is false, 1 if it is true.
- "reasoning": A short explanation of why you think the document is relevant or irrelevant.
"""

evaluator = get_retrieval_evaluator(
    "custom_prompt",  # name of the retrieval evaluator
    llm_provider="openai",  # which LLM provider to use
    prompt=prompt,  # your custom prompt
    query_placeholder="q",  # the placeholder for the query in the prompt
    document_placeholder="d",  # the placeholder for the document in the prompt
    answer_format="multi_field_json",  # the format of the answer: a JSON object with multiple fields
    scoring_keys=["relevance", "recency", "truthfulness", "reasoning"],  # which keys to extract from the answer
)

raw_answer, answer = evaluator.evaluate(
    query="What is the capital of Brazil?",  # the user query
    document="Rio de Janeiro is the capital of Brazil.",  # the retrieved document
    query_metadata={"today_date": "08-04-2024"},  # some metadata for the query
    doc_metadata={"document_date": "04-03-1950"},  # some metadata for the document
)
```
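Here, `raw_answer` holds the LLM's unparsed response, while `answer` should contain the parsed fields listed in `scoring_keys`.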
CLI Interface changes
On the CLI front, each evaluator has its own subprogram now. Instead of calling `ragelo` with a long list of parameters, you can call `ragelo retrieval-evaluator <evaluator>` or `ragelo answer-evaluator <evaluator>` with your preferred evaluator. (We are big fans of `ragelo retrieval-evaluator domain-expert` 😉.)
Other changes:
- Moved from using `dataclasses` to Pydantic's `BaseModel`. The code should support Pydantic >=0.9, but let us know if it doesn't work for you.
- Calling `batch_evaluate` will now return both the existing and the new annotations, instead of only writing the new annotations to a file.
- The interface of `batch_evaluate` is much simplified. Instead of a dictionary of dictionaries, it now requires a list of `Query` objects, and each query has its own list of documents and answers (see the sketch after this list).
- `PairwiseAnswerEvaluator` is much simplified now. `k` is the number of games to generate per query, instead of the grand total.
- Many specific methods were simplified and moved up the class hierarchy. More code sharing and easier to maintain!
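A rough sketch of the new `batch_evaluate` flow (the `Query` import path and constructor arguments are assumptions for illustration; check the library for the exact fields):

```python
from ragelo import get_retrieval_evaluator
from ragelo.types import Query  # import path is an assumption

# Constructor arguments are illustrative and may differ from the actual API.
queries = [
    Query(qid="q1", query="What is the capital of Brazil?"),
    Query(qid="q2", query="Who wrote Dom Casmurro?"),
]
# ... attach each query's retrieved documents (and agent answers) here ...

evaluator = get_retrieval_evaluator(
    "custom_prompt",
    llm_provider="openai",
    prompt="Query: {q}\nDocument: {d}\nAnswer 1 if relevant, 0 otherwise.",
    query_placeholder="q",
    document_placeholder="d",
)
# batch_evaluate returns both the existing and the newly created annotations.
annotations = evaluator.batch_evaluate(queries)
```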
Full Changelog: 0.0.5...0.1.0
v0.0.5
What's Changed
- Major overhaul to the code, by @ArthurCamara in #7:
  - More modular
  - Tests
  - Simpler and more coherent class interface
  - Simpler iterators
  - Updated OpenAI version
Full Changelog: 0.0.3...0.0.5
0.0.3
Added a new document evaluator (domain_expert) and a bunch of bugfixes.
What's Changed
- Adding Domain Expert Evaluator by @ArthurCamara in #5
Full Changelog: 0.0.2...0.0.3
0.0.2
First public release of RAGElo, an LLM-powered annotator for RAG agents using an Elo-style tournament.