This repository contains the source code to reproduce the preprocessing workflow for COMPOUND, CRISPR and ORF data from the JUMP dataset.
We suggest uv for environment management. The following commands create the environment from scratch, install the required packages along with this repository in editable mode, and activate the environment.
uv sync
uv pip install -e .
source .venv/bin/activate
Download profiles and metadata for compound (crispr or orf):
source download_data.sh compound
snakemake -c1 --configfile inputs/config/compound.json
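To switch to another dataset, the same two commands can be parameterised by the dataset name. The sketch below only prints the commands rather than running them. `show_commands` is a hypothetical helper, not part of this repository, and it assumes that `inputs/config/crispr.json` and `inputs/config/orf.json` follow the same naming as the compound config shown above.

```shell
# Hypothetical helper: print the download and workflow commands for one dataset.
# Assumption: inputs/config/<dataset>.json exists for crispr and orf,
# mirroring inputs/config/compound.json.
show_commands() {
  case "$1" in
    compound|crispr|orf)
      echo "source download_data.sh $1"
      echo "snakemake -c1 --configfile inputs/config/$1.json"
      ;;
    *)
      # Reject anything that is not one of the three dataset names.
      echo "unknown dataset: $1" >&2
      return 1
      ;;
  esac
}

show_commands crispr
```

Check that the printed config path exists before running the commands for real.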
To run the tests, first set PYTHONPATH to the repository root (run this from the top of the repository):
export PYTHONPATH="$(pwd)"
pytest
The test suite includes an integration test to verify the pipeline's functionality using a minimal dataset.
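The export shown for the tests replaces any value PYTHONPATH already had. If other entries need to be kept, a minimal sketch that prepends the repository root instead is shown below. It assumes the current directory is the repository root.

```shell
# Prepend the current directory (assumed to be the repository root) to PYTHONPATH.
# ${PYTHONPATH:+:$PYTHONPATH} expands to ":<old value>" only when PYTHONPATH is
# already set and non-empty, so no stray colon is added otherwise.
export PYTHONPATH="$(pwd)${PYTHONPATH:+:$PYTHONPATH}"
echo "$PYTHONPATH"
```

After this, pytest can be run as before.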