Akita Utils is a set of functions to aid analysis of Akita, a deep learning convolutional neural network (CNN) that predicts contact frequency maps from DNA sequence. This repository includes scripts and tools for efficiently analyzing the model's predictions.
The scripts used for cross-species AkitaV2 training, along with the model weights, are available from the Basenji repository.
Akita Utils has been used to perform in silico experiments that extract the sequence contributions of CTCF to genome folding. The code for these experiments and for the computational analysis of AkitaV2's predictions can be found in the akitaV2-analyses repository.
Preprint available here: link to be added very soon
To install Akita Utils, run the following commands:
git clone https://github.com/Fudenberg-Research-Group/akita_utils.git
cd akita_utils
make install
A conda environment with the required dependencies can be created with:
conda env create -f basenji_py3.9_tf2.15.yml
Alternatively, install the requirements below:
- numpy
- pandas
- scipy
- tensorflow
- h5py
- bioframe
- seaborn
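If you install the requirements manually, a quick check like the one below can confirm that each package is visible to your Python environment. This is a minimal sketch and not part of akita_utils: the helper name `missing_packages` is hypothetical, and it assumes each package's import name matches the name in the list above.

```python
from importlib.util import find_spec

# Import names taken from the requirements list above (assumed to match
# the names used in `import` statements).
REQUIRED = ["numpy", "pandas", "scipy", "tensorflow", "h5py", "bioframe", "seaborn"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only locates the packages without importing them, so it runs quickly and does not verify versions.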
For usage examples, please refer to the akitaV2-analyses repository.
We recommend starting with the following tutorials:
akitaV2-analyses/tutorials/disruption_tutorial.ipynb
akitaV2-analyses/tutorials/insertion_tutorial.ipynb
These tutorials will help you understand the basic functionalities and applications of akita_utils.
Feedback and questions are appreciated. Please contact us at: fudenber at usc fullstop edu & smaruj at usc fullstop edu.
./akita_utils
: Contains helper functions split by application, e.g., dna_utils, h5_utils, seq_genes.
./cli
: Contains a script for collecting h5 files output by jobs.
./tests
: Contains test functions for the akita_utils functions.
This project is licensed under the MIT License.