A graph-based deep learning method for predicting pseudogene functions by borrowing information from coding genes

yanzhanglab/Pseudo2GO

Pseudo2GO

Description

Pseudo2GO is a graph-based deep learning method for predicting pseudogene functions by borrowing information from coding genes. It uses both network information and node attributes to improve performance. Sequence similarity networks are used to construct graphs connecting pseudogenes and coding genes. Node attributes are propagated over these graphs, so that pseudogenes can borrow information from well-studied coding genes.
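
To illustrate how attribute propagation lets a pseudogene borrow information from its neighbors, here is a minimal sketch of one graph-convolution-style smoothing step with no learned weights. It is not the model in pseudo2go.py; the function name `propagate`, the toy graph, and the feature values are assumptions made for illustration only.

```python
import math

def propagate(adjacency, features):
    """One round of symmetric-normalized neighborhood averaging.

    adjacency: square list-of-lists with 1.0 where two genes are linked
    features:  one attribute vector per node
    Hypothetical helper for illustration, not part of Pseudo2GO.
    """
    n = len(adjacency)
    # Add self-loops so each node keeps part of its own signal.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    dim = len(features[0])
    out = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if a_hat[i][j]:
                # Weight each edge by 1 / sqrt(deg(i) * deg(j)).
                weight = a_hat[i][j] / math.sqrt(degree[i] * degree[j])
                for k in range(dim):
                    out[i][k] += weight * features[j][k]
    return out

# Toy graph: nodes 0 and 1 are coding genes, node 2 is a pseudogene
# linked to coding gene 0 by sequence similarity (made-up example).
adjacency = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
features = [
    [1.0, 0.0],  # coding gene 0 has attributes
    [0.0, 1.0],  # coding gene 1 has attributes
    [0.0, 0.0],  # pseudogene 2 starts with no attributes
]
propagated = propagate(adjacency, features)
```

After one step, the pseudogene's first feature is nonzero because it is linked to coding gene 0, while its second feature stays zero because it has no edge to coding gene 1.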

As node attributes (the initial feature representation), we use two types of expression profiles (from the TCGA and GTEx databases, respectively), interactions with microRNAs, and protein-protein interactions (PPI) and genetic interactions.
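
One simple way to combine several attribute sources into a single initial feature vector per gene is concatenation. The sketch below assumes that approach and fills missing sources with zeros; both choices, along with the function name `build_feature_matrix`, the gene identifiers, and all values, are hypothetical and may differ from what preprocess_final.py does.

```python
def build_feature_matrix(genes, sources):
    """Concatenate per-source attribute vectors into one row per gene.

    genes:   ordered list of gene identifiers
    sources: list of (dimension, {gene_id: vector}) pairs
    Genes missing from a source get a zero vector of that source's
    dimension (an assumption made for this sketch).
    """
    matrix = []
    for gene in genes:
        row = []
        for dim, table in sources:
            row.extend(table.get(gene, [0.0] * dim))
        matrix.append(row)
    return matrix

# Placeholder inputs: two expression sources and one interaction source.
genes = ["geneA", "pseudoB"]
tcga_expression = (2, {"geneA": [0.5, 1.5], "pseudoB": [0.1, 0.2]})
gtex_expression = (2, {"geneA": [2.0, 3.0]})
mirna_interactions = (1, {"pseudoB": [1.0]})

feature_matrix = build_feature_matrix(
    genes, [tcga_expression, gtex_expression, mirna_interactions]
)
```

Each row has the same length (here 5), so the result can be passed to a graph model as a node feature matrix regardless of which sources cover a given gene.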

In our evaluation, the method achieved state-of-the-art performance, significantly outperforming existing methods. The graph neural network model is implemented with the PyTorch Geometric package in Python 3.6.

Citing

If you find our work useful for your research, please consider citing it:

@ARTICLE{10.3389/fgene.2020.00807,
AUTHOR={Fan, Kunjie and Zhang, Yan},   
TITLE={Pseudo2GO: A Graph-Based Deep Learning Method for Pseudogene Function Prediction by Borrowing Information From Coding Genes},      
JOURNAL={Frontiers in Genetics},      
VOLUME={11},      
PAGES={807},     
YEAR={2020},      
URL={https://www.frontiersin.org/article/10.3389/fgene.2020.00807},       
DOI={10.3389/fgene.2020.00807},      
ISSN={1664-8021}
}

Usage

Requirements

  • Python 3.6
  • PyTorch
  • PyTorch Geometric
  • networkx
  • scipy
  • numpy
  • pickle (part of the Python standard library)
  • scikit-learn
  • pandas

Data

You can download the raw data and the processed data (ready for use in the model) from here: data. Download the datasets and put them in the existing data folder.

Steps

Step 1: Decompress the data files

unzip data.zip
unzip raw_data.zip
unzip final_input.zip
mv raw_data final_input data
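
After the commands above, the raw_data and final_input folders should sit inside data. A small optional check such as the following can confirm that before running the later steps. The helper name `missing_data_dirs` is hypothetical and not part of the repository, and the check covers only the two folder names that appear in the mv command.

```python
import os

# Folder names taken from the `mv raw_data final_input data` command.
EXPECTED_DIRS = ("raw_data", "final_input")

def missing_data_dirs(data_root="data"):
    """Return the expected sub-folders that are not present under data_root."""
    return [
        name for name in EXPECTED_DIRS
        if not os.path.isdir(os.path.join(data_root, name))
    ]

if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing folders under data/: " + ", ".join(missing))
    else:
        print("All expected data folders are in place.")
```

Run it from the repository root, since the default path `data` is relative to the current directory.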

Step 2: Preprocessing (optional)

cd preprocessing
python preprocess_final.py

Step 3: Run the model

cd model
python pseudo2go.py

Note that several parameters can be tuned. Please refer to the pseudo2go.py file for a detailed description of all parameters.
