LLM Evaluation

This project evaluates the performance of Large Language Models (LLMs) on different NLP tasks in combination with various prompts.

Environment

Create a virtualenv and install the requirements:

make virtualenv

Then pull the data:

dvc pull

Note that for DVC to work, you need access to Mantis AWS.
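The two setup steps above can be chained in a small shell sketch that fails early when a tool is missing. The function names `require_cmd` and `setup_env` are hypothetical and not part of this repository. The sketch assumes it is run from the repository root, that access to Mantis AWS is already configured, and that `dvc` is on the `PATH` once the virtualenv exists (you may need to activate the virtualenv first).

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the repository):
# return non-zero and print a message if a command is not on PATH.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "Missing required command: $1" >&2
    return 1
  fi
}

# Hypothetical wrapper around the two documented steps.
# Assumption: run from the repository root with AWS access already set up.
setup_env() {
  require_cmd make || return 1
  make virtualenv || return 1   # create the virtualenv and install requirements
  require_cmd dvc || return 1   # assumption: dvc is reachable on PATH at this point
  dvc pull                      # fetch the DVC-tracked data
}
```

After sourcing the file, calling `setup_env` runs both steps and stops at the first failure, which makes a missing tool easier to tell apart from a failed data pull.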

To be filled