Merge pull request #10 from Fudenberg-Research-Group/updates
Updates
hrahmanin authored Jan 20, 2025
2 parents 7c1e829 + 9702a02 commit 43db1a7
Showing 5 changed files with 63 additions and 156 deletions.
101 changes: 35 additions & 66 deletions README.md
@@ -1,85 +1,54 @@
# Chromoscores

# Python Project Template
![Alt Text](./docs/representations.png)

A low dependency and really simple to start project template for Python Projects.
A Python package for quantitative analysis of simulated Hi-C maps, providing tools to capture, process, and evaluate chromatin interaction patterns such as Topologically Associating Domains (TADs), flames, and peaks.

See also
- [Flask-Project-Template](https://github.com/rochacbruno/flask-project-template/) for a full feature Flask project including database, API, admin interface, etc.
- [FastAPI-Project-Template](https://github.com/rochacbruno/fastapi-project-template/) The base to start an openapi project featuring: SQLModel, Typer, FastAPI, JWT Token Auth, Interactive Shell, Management Commands.

### HOW TO USE THIS TEMPLATE
### Requirement 📃
- numpy

> **DO NOT FORK** this is meant to be used from **[Use this template](https://github.com/rochacbruno/python-project-template/generate)** feature.

### Structure of the repository
The repository is organized as follows:
- maputils: functions for processing maps, such as computing observed over expected or piling up snippets with specific features.
- scorefunctions: functions for quantitative analysis of features.
- snipping: functions for capturing snippets containing specific features.
- analysis: notebooks and code as tutorials for analyzing simulated data.

1. Click on **[Use this template](https://github.com/rochacbruno/python-project-template/generate)**
3. Give a name to your project
(e.g. `my_awesome_project` recommendation is to use all lowercase and underscores separation for repo names.)
3. Wait until the first run of CI finishes
(Github Actions will process the template and commit to your new repo)
4. If you want [codecov](https://about.codecov.io/sign-up/) Reports and Automatic Release to [PyPI](https://pypi.org)
On the new repository `settings->secrets` add your `PYPI_API_TOKEN` and `CODECOV_TOKEN` (get the tokens on respective websites)
4. Read the file [CONTRIBUTING.md](CONTRIBUTING.md)
5. Then clone your new project and happy coding!

> **NOTE**: **WAIT** until first CI run on github actions before cloning your new project.
### What is included on this template?

- 🖼️ Templates for starting multiple application types:
* **Basic low dependency** Python program (default) [use this template](https://github.com/rochacbruno/python-project-template/generate)
* **Flask** with database, admin interface, restapi and authentication [use this template](https://github.com/rochacbruno/flask-project-template/generate).
**or Run `make init` after cloning to generate a new project based on a template.**
- 📦 A basic [setup.py](setup.py) file to provide installation, packaging and distribution for your project.
Template uses setuptools because it's the de-facto standard for Python packages, you can run `make switch-to-poetry` later if you want.
- 🤖 A [Makefile](Makefile) with the most useful commands to install, test, lint, format and release your project.
- 📃 Documentation structure using [mkdocs](http://www.mkdocs.org)
- 💬 Auto generation of change log using **gitchangelog** to keep a HISTORY.md file automatically based on your commit history on every release.
- 🐋 A simple [Containerfile](Containerfile) to build a container image for your project.
`Containerfile` is a more open standard for building container images than Dockerfile, you can use buildah or docker with this file.
- 🧪 Testing structure using [pytest](https://docs.pytest.org/en/latest/)
- ✅ Code linting using [flake8](https://flake8.pycqa.org/en/latest/)
- 📊 Code coverage reports using [codecov](https://about.codecov.io/sign-up/)
- 🛳️ Automatic release to [PyPI](https://pypi.org) using [twine](https://twine.readthedocs.io/en/latest/) and github actions.
- 🎯 Entry points to execute your program using `python -m <chromoscores>` or `$ chromoscores` with basic CLI argument parsing.
- 🔄 Continuous integration using [Github Actions](.github/workflows/) with jobs to lint, test and release your project on Linux, Mac and Windows environments.

> Curious about architectural decisions on this template? read [ABOUT_THIS_TEMPLATE.md](ABOUT_THIS_TEMPLATE.md)
> If you want to contribute to this template please open an [issue](https://github.com/rochacbruno/python-project-template/issues) or fork and send a PULL REQUEST.
[❤️ Sponsor this project](https://github.com/sponsors/rochacbruno/)

<!-- DELETE THE LINES ABOVE THIS AND WRITE YOUR PROJECT README BELOW -->

---
# chromoscores

[![codecov](https://codecov.io/gh/Fudenberg-Research-Group/chromoscores/branch/main/graph/badge.svg?token=chromoscores_token_here)](https://codecov.io/gh/Fudenberg-Research-Group/chromoscores)
[![CI](https://github.com/Fudenberg-Research-Group/chromoscores/actions/workflows/main.yml/badge.svg)](https://github.com/Fudenberg-Research-Group/chromoscores/actions/workflows/main.yml)

Awesome chromoscores created by Fudenberg-Research-Group

## Install it from PyPI

### Installation 📦
First,

```
git clone https://github.com/Fudenberg-Research-Group/chromoscores.git
```
then
```bash
pip install chromoscores
```

## Usage

```py
from chromoscores import BaseClass
from chromoscores import base_function

BaseClass().base_method()
base_function()
```

```bash
$ python -m chromoscores
#or
$ chromoscores
```
### Analysis 📊
Observable features can be quantified, including:

- Observed over expected
- TADs (Topologically Associating Domains)
- flames
- Dots (loops between barriers)


See tutorials in `./jupyter_notebooks`.
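
As a rough illustration of the first item in the list above, observed over expected divides each entry of a contact map by the mean of the diagonal it sits on, which removes the average decay of contact frequency with distance. The sketch below is a standalone NumPy version written for this note. The name `observed_over_expected_sketch` and the `pseudo_count` argument are hypothetical, the map is assumed to be square and symmetric, and this is not necessarily how the package's `get_observed_over_expected` is implemented.

```python
import numpy as np


def observed_over_expected_sketch(contact_map, pseudo_count=0.0):
    """Divide each diagonal of a square, symmetric map by its mean (hypothetical helper)."""
    contact_map = np.asarray(contact_map, dtype=float)
    n = contact_map.shape[0]
    result = np.zeros_like(contact_map)
    for k in range(n):
        # Expected value at separation k: mean of the k-th upper diagonal.
        expected = np.diagonal(contact_map, offset=k).mean() + pseudo_count
        if expected > 0:
            idx = np.arange(n - k)
            # Normalize the upper diagonal and its mirror image below the main diagonal.
            result[idx, idx + k] = contact_map[idx, idx + k] / expected
            result[idx + k, idx] = contact_map[idx + k, idx] / expected
    return result


# Toy map whose values depend only on the distance from the diagonal.
i, j = np.indices((6, 6))
toy_map = 1.0 / (1.0 + np.abs(i - j))
toy_oe = observed_over_expected_sketch(toy_map)
```

Because the toy map has no features beyond distance decay, every entry of `toy_oe` equals 1; in a real map, TADs, flames, and dots appear as deviations from 1.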



[![codecov](https://codecov.io/gh/Fudenberg-Research-Group/chromoscores/branch/main/graph/badge.svg?token=chromoscores_token_here)](https://codecov.io/gh/Fudenberg-Research-Group/chromoscores)
[![CI](https://github.com/Fudenberg-Research-Group/chromoscores/actions/workflows/main.yml/badge.svg)](https://github.com/Fudenberg-Research-Group/chromoscores/actions/workflows/main.yml)


## Development

Read the [CONTRIBUTING.md](CONTRIBUTING.md) file.
99 changes: 18 additions & 81 deletions chromoscores/maputils.py
@@ -5,13 +5,13 @@ def get_diagonal_pileup(contact_map, boundary_list, window_size = 10):
"""
parameters
----------
contact_map: contact map
boundary_list: list of the boundary elements positions on the diagonal
window_size: size of the window
contact_map: contact map (2D array)
boundary_list: list of the boundary elements' positions on the diagonal
window_size: size of the window (must be odd for center)
Returns
-------
a stackup of snippts around the boundary elements
a stackup of snippets around the boundary elements
"""

if window_size <= 0 or window_size > len(contact_map):
@@ -75,9 +75,7 @@ def get_offdiagonal_pileup_binlist(
----------
contact_map: contact map
boundary_list: list of the boundary elements positions on the diagonal
min_dist: minimum distance from the diagonal
max_dist: maximum distance from the diagonal
bin_num: number of bins
binlist : exact list of bin boundaries
window_size: size of the window for the pileup
Returns
@@ -113,15 +111,13 @@ def get_offdiagonal_pileup_binlist_orientation(
contact_map: contact map
boundary_list: list of the boundary elements positions on the diagonal
orientation: list of the boundary element orientations
min_dist: minimum distance from the diagonal
max_dist: maximum distance from the diagonal
bin_num: number of bins
binlist: exact list of bins boundaries
window_size: size of the window for the pileup
Returns
-------
a list of pileups as numpy arrays around the feature (e.g., peaks) as a function of distance from the diagonal
and orientation
a list of pileups as numpy arrays around the feature (e.g., peaks) as a function of distance from the diagonal,
orientation between barriers, and the number of snippets at each range.
"""
bin_border_int = binlist
bin_num = len(bin_border_int)
@@ -136,7 +132,10 @@ def get_offdiagonal_pileup_binlist_orientation(
mat_tandn = np.zeros((window_size, window_size))

dist = (bin_border_int[i] + bin_border_int[i + 1]) / 2

n_conv = 0
n_dive = 0
n_tand_p = 0
n_tand_n = 0
for i_element in boundary_list:
for j_element in boundary_list:
if bin_border_int[i] <= (j_element - i_element) < bin_border_int[i + 1]:
@@ -146,98 +145,36 @@ def get_offdiagonal_pileup_binlist_orientation(
]
if orientation[np.flatnonzero(boundary_list==np.max([i_element, j_element]))] == '+':
if orientation[np.flatnonzero(boundary_list==np.min([i_element, j_element]))] == '-':
n_conv += 1
mat_conv += contact_map[
i_element - window_size // 2 : i_element + window_size // 2,
j_element - window_size // 2 : j_element + window_size // 2,
]
else:
n_tand_p +=1
mat_tandp += contact_map[
i_element - window_size // 2 : i_element + window_size // 2,
j_element - window_size // 2 : j_element + window_size // 2,
]
else:
if orientation[np.flatnonzero(boundary_list==np.min([i_element, j_element]))] == '+':
n_dive +=1
mat_dive += contact_map[
i_element - window_size // 2 : i_element + window_size // 2,
j_element - window_size // 2 : j_element + window_size // 2,
]
else:
n_tand_n +=1
mat_tandn += contact_map[
i_element - window_size // 2 : i_element + window_size // 2,
j_element - window_size // 2 : j_element + window_size // 2,
]

pile_ups.extend([[['+-',dist,mat_conv],['-+',dist,mat_dive],['++',dist,mat_tandp],['--',dist,mat_tandn],['all',dist,mat]]])
n_tot = n_conv + n_dive + n_tand_p + n_tand_n
pile_ups.extend([[['+-',dist,mat_conv, n_conv],['-+',dist,mat_dive, n_dive],['++',dist,mat_tandp, n_tand_p],['--',dist,mat_tandn, n_tand_n],['all',dist,mat, n_tot]]])

return pile_ups
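
With this change each entry in `pile_ups` carries a snippet count next to the summed matrix, so a caller can turn the sums into per-snippet averages. The sketch below assumes only the return layout visible in the hunk above (one list per distance bin, each holding `[label, dist, matrix, count]` entries). The helper name `average_pileups` is hypothetical and is not part of the package.

```python
import numpy as np


def average_pileups(pile_ups):
    """Convert summed pileups into per-snippet means (hypothetical helper)."""
    averaged = []
    for bin_entries in pile_ups:
        for label, dist, mat, n in bin_entries:
            # An orientation class with no snippets has no defined mean, so mark it NaN.
            mean_mat = mat / n if n > 0 else np.full_like(mat, np.nan)
            averaged.append((label, dist, mean_mat, n))
    return averaged


# Toy input mimicking one distance bin with two orientation classes.
toy_pile_ups = [[
    ['+-', 5.0, np.full((2, 2), 6.0), 3],
    ['all', 5.0, np.zeros((2, 2)), 0],
]]
toy_averaged = average_pileups(toy_pile_ups)
```

Dividing by the count makes pileups from different distance ranges or orientation classes comparable even when they contain different numbers of snippets.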




def get_offdiagonal_pileup_orientation(contact_map, boundary_list, orientation, binlist, window_size=10):
"""
Parameters
----------
contact_map : np.array
Contact map.
boundary_list : list
List of the boundary elements positions on the diagonal.
orientation : list
List of the boundary element orientations.
binlist : list
List of bin edges.
window_size : int, optional
Size of the window for the pileup (default is 10).
Returns
-------
pile_ups : list
A list of pileups as numpy arrays around the feature (e.g., peaks) as a function of distance from the diagonal.
"""

bin_num = len(binlist)

# Initialize matrices for storing pileups
pile_ups = []

for i in range(bin_num - 1):
dist = (binlist[i] + binlist[i + 1]) / 2

mat = np.zeros((window_size, window_size))
mat_conv = np.zeros((window_size, window_size))
mat_dive = np.zeros((window_size, window_size))
mat_tandp = np.zeros((window_size, window_size))
mat_tandn = np.zeros((window_size, window_size))

for i_element in boundary_list:
for j_element in boundary_list:
if binlist[i] <= (j_element - i_element) < binlist[i + 1]:
window_i_start = i_element - window_size // 2
window_i_end = i_element + window_size // 2
window_j_start = j_element - window_size // 2
window_j_end = j_element + window_size // 2

mat += contact_map[window_i_start:window_i_end, window_j_start:window_j_end]

max_orientation = orientation[np.flatnonzero(boundary_list == max(i_element, j_element))]
min_orientation = orientation[np.flatnonzero(boundary_list == min(i_element, j_element))]

if max_orientation == '+':
if min_orientation == '-':
mat_conv += contact_map[window_i_start:window_i_end, window_j_start:window_j_end]
else:
mat_tandp += contact_map[window_i_start:window_i_end, window_j_start:window_j_end]
else:
if min_orientation == '+':
mat_dive += contact_map[window_i_start:window_i_end, window_j_start:window_j_end]
else:
mat_tandn += contact_map[window_i_start:window_i_end, window_j_start:window_j_end]

pile_ups.append([['+-', dist, mat_conv], ['-+', dist, mat_dive], ['++', dist, mat_tandp], ['--', dist, mat_tandn], ['all', dist, mat]])

return pile_ups


def get_observed_over_expected(contact_map):
"""
parameters
16 changes: 8 additions & 8 deletions chromoscores/scorefunctions.py
@@ -5,7 +5,7 @@


def peak_score_upperRight(
peak_snippet, peak_width=3, background_width=10, pseudo_count=0
peak_snippet, peak_width = 3, background_width = 10, pseudo_count = 0
):
"""
parameters
@@ -41,7 +41,7 @@ def peak_score_upperRight(


def peak_score_lowerRight(
peak_snippet, peak_width=3, background_width=10, pseudo_count=0
peak_snippet, peak_width = 3, background_width = 10, pseudo_count = 0
):
"""
parameters
@@ -77,7 +77,7 @@ def peak_score_lowerRight(


def peak_score_upperLeft(
peak_snippet, peak_width=3, background_width=10, pseudo_count=0
peak_snippet, peak_width = 3, background_width = 10, pseudo_count = 0
):
"""
parameters
@@ -113,7 +113,7 @@ def peak_score_upperLeft(


def peak_score_lowerLeft(
peak_snippet, peak_width=3, background_width=10, pseudo_count=0
peak_snippet, peak_width = 3, background_width = 10, pseudo_count = 0
):
"""
parameters
@@ -207,7 +207,7 @@ def _get_isolation_areas(contact_map, delta=1, diag_offset=3, max_distance=10, s
delta: distance from the border between in_tad and out_tad
diag_offset: distance of the snippet from the diagonal. This also determines the size of the snippet.
max_distance: maximum distance from the diagonal
state: 1 for triangle snippets, 0 for square snippets
snippet_shapes: shape of the snippets for taking the average.
returns
-------
@@ -271,7 +271,7 @@ def isolation_score(snippet, delta, diag_offset, max_dist, snippet_shapes , pseu
flames when extracting in_tad and out_tad areas.
diag_offset: distance from the diagonal. This also determines the size of the snippet.
max_distance: maximum distance from the diagonal
state: 1 for triangle snippets, 0 for square snippets
snippet_shapes: shape of the snippet for taking the average
pseudo_count: pseudo count to avoid division by zero
returns
@@ -292,7 +292,7 @@ def isolation_score(snippet, delta, diag_offset, max_dist, snippet_shapes , pseu


def flame_score_vertical(
flame_snippet, flame_thickness, background_thickness, pseudo_count=1
flame_snippet, flame_thickness, background_thickness, pseudo_count = 1
):
"""
parameters
@@ -322,7 +322,7 @@ def flame_score_vertical(


def flame_score_horizontal(
snippet, flame_thickness, background_thickness, pseudo_count=1
snippet, flame_thickness, background_thickness, pseudo_count = 1
):
"""
parameters
3 changes: 2 additions & 1 deletion chromoscores/snipping.py
@@ -58,11 +58,12 @@ def tad_snippet_sectors(
parameters
----------
contact_map: snippet of a contact map around a boundary element
boundary_list: list of the boundary elements' positions on the diagonal
index: index of the boundary element in the boundary_list. This should be in the range of boundary_list.
delta: distance from the border between in_tad and out_tad. is defined to exclude
flames when extracting in_tad and out_tad areas.
diag_offset: distance from the diagonal. This also determines the size of the snippet.
max_distance: maximum distance from the diagonal
state: 1 for triangle snippets, 0 for square snippets
returns
-------
Binary file added docs/representations.png