FIA: An Iterative Analysis Framework for Enhanced Variant Classification in Large Mendelian Disease Cohorts

Overview

To address these limitations and improve the efficiency and accuracy of genetic diagnosis for Mendelian diseases, we developed the Framework of Iterative Analysis (FIA). This framework integrated the construction of disease mutation databases, automated retrieval of ACMG/AMP evidence, adjustments to specific rules, and the classification of variant pathogenicity.

Figure 1 illustrates the primary steps involved in FIA.

Figure 1. Schematic overview of FIA.

Installation

FIA scripts

Install FIA from GitHub:

git clone https://github.com/liusihan/FIA.git

Conda

In order to install the dependencies of FIA, we recommend using a dedicated conda environment. First, you will need the Conda Python distribution and package manager.

# Download conda installer
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh

# Set permissions to execute
chmod +x Miniconda3-latest-Linux-x86_64.sh 	

# Execute. Make sure to "yes" to add the conda to your PATH
sh ./Miniconda3-latest-Linux-x86_64.sh 		

# Add channels
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

After installing Miniconda, run the following commands to install FIA's dependencies:

conda env create -f FIA.yaml
conda activate FIA
pip install -r requirements.txt

Download annotation files

FIA use VEP and vcfanno to determine the effect of variants (SNVs, insertions, deletions) on genes, transcripts, and protein sequence. To get annotation for the variant, indexed_vep_cache (homo_sapiens_refseq 105_GRCh37 and 105_GRCh38) and fasta files are required. VEP cache and plugins files can be downloaded as follows:

# indexed vep cache
mkdir $HOME/.vep
cd $HOME/.vep
wget https://ftp.ensembl.org/pub/release-105/variation/vep/homo_sapiens_refseq_vep_105_GRCh38.tar.gz
wget https://ftp.ensembl.org/pub/release-105/variation/vep/homo_sapiens_refseq_vep_105_GRCh37.tar.gz
tar xzf homo_sapiens_refseq_vep_105_GRCh38.tar.gz
tar xzf homo_sapiens_refseq_vep_105_GRCh37.tar.gz
wget https://ftp.ensembl.org/pub/release-105/variation/vep/homo_sapiens_merged_vep_105_GRCh38.tar.gz
wget https://ftp.ensembl.org/pub/release-105/variation/vep/homo_sapiens_merged_vep_105_GRCh37.tar.gz
tar xzf homo_sapiens_merged_vep_105_GRCh38.tar.gz
tar xzf homo_sapiens_merged_vep_105_GRCh37.tar.gz
mkdir Plugins
cp -r FIA/Plugins ./

#fasta file
cd FIA/Scripts/AutoPVS1/data
wget https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz
gunzip hg19.fa.gz
mv GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz hg38.fa.gz
gunzip hg38.fa.gz
cat hg19.fa | sed 's/^>chr/>/g' >hg19.nochr.fa
cat hg38.fa | sed 's/^>chr/>/g' >hg38.nochr.fa
samtools faidx hg19.fa
samtools faidx hg38.fa
samtools faidx hg19.nochr.fa
samtools faidx hg38.nochr.fa

The database and Scripts directory contain the data and conf for a full example of the GJB2 gene. To run with a complete genome, users needs download the appropriate databases and reverse the file name in the anno.demo.conf file.

USAGE

cd FIA
sh FIA_annotation.sh example/1000G_GJB2.vcf.gz GRCh37 Scripts/AutoPVS1/data/hg19.nochr.fa $HOME/.vep test
sh FIA_classification.sh test/annotated_raw.AF_stats.vcf.gz GRCh37 test/FIA
sh test/FIA/bin/run_FIA.sh

Citation

If you use FIA, please cite our paper (thanks!):

Liu S, et.al,. FIA: An Iterative Analysis Framework for Enhanced Variant Classification in Large Mendelian Disease Cohorts. XXX.

Getting help

If you encounter a clear bug, please file an issue with a minimal reproducible example on GitHub. For questions and other discussion, please contact Sihan Liu ([email protected]).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

FIA: An Iterative Analysis Framework for Enhanced Variant Classification in Large Mendelian Disease Cohorts

Overview

Installation

FIA scripts

Conda

Download annotation files

USAGE

Citation

Getting help

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
Plugins		Plugins
Scripts		Scripts
bin		bin
database		database
example		example
inputfile		inputfile
test		test
.gitignore		.gitignore
FIA.yaml		FIA.yaml
FIA_annotation.sh		FIA_annotation.sh
FIA_classification.sh		FIA_classification.sh
Figure1.jpeg		Figure1.jpeg
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

License

liusihan/FIA

Folders and files

Latest commit

History

Repository files navigation

FIA: An Iterative Analysis Framework for Enhanced Variant Classification in Large Mendelian Disease Cohorts

Overview

Installation

FIA scripts

Conda

Download annotation files

USAGE

Citation

Getting help

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages