Update README.md
typo fixes
ypnngaa-py authored Jan 15, 2020
1 parent 917ec29 commit 5c48196
Showing 1 changed file with 31 additions and 32 deletions.
63 changes: 31 additions & 32 deletions README.md
@@ -16,15 +16,15 @@ git clone https://github.com/Xinglab/PEGASAS.git
cd PEGASAS
python setup.py install
```
Note that the installation process will only automatically check and install python package dependencies. R packages required for PEGASAS have to be installed manually if missing. See [next section](#dependencies) for required packages.
Note that the installation process will only automatically check and install python package dependencies. If the R packages required for PEGASAS are missing, they can only be installed manually. See [next section](#dependencies) for required packages.

### Dependencies
python version 2.7 (numpy, scipy, matplotlib)

R version 3.4.0 (LSD, data.table, ggplot2)

### Performing PEGASAS analysis
After installed PEGASAS and its dependencies, following the below two steps to perform the analysis and generate plots for correlation and Gene Ontology analysis. ([A toy example](https://github.com/Xinglab/PEGASAS/tree/master/example) is provided for a test run and correponding commands are provided in the [next section](#example-pegasas-run).)
After installing PEGASAS and its dependencies, the user can follow the two steps below to perform the analysis and to generate plots for correlation and Gene Ontology (GO) analysis. ([A toy example](https://github.com/Xinglab/PEGASAS/tree/master/example) is provided for a test run. Corresponding commands are provided in the [next section](#example-pegasas-run).)

There are two steps to perform PEGASAS analysis, as shown below (typing PEGASAS -h in the command line):
```
@@ -37,7 +37,7 @@ positional arguments:
pathway Calculates signaling pathway activity derived from
geneset enrichment metric based on RNA-Seq gene
expression
correlation Computes pathway correlated alternative splicing
correlation Computes pathway-correlated alternative splicing
events
optional arguments:
@@ -47,33 +47,32 @@ optional arguments:
For command line options of each sub-command, type: PEGASAS COMMAND -h
```
##### Step 1: Pathway activity calculation
PEGASAS can calculate signaling pathway activity based pre-defined gene signatures and gene expression. Details see below:
PEGASAS can calculate the signaling pathway activity based on predefined gene signatures and gene expression. For details of this step, see below:
```
PEGASAS pathway -h
usage: PEGASAS pathway [-h] [-o OUT_DIR] [-n NUM_INTERVAL] [--plotting]
geneExpbySample geneSignatureList groupInfo
required arguments:
geneExpbySample A TSV format matrix of gene expression values (FPKM,
TPM, etc.) where each column is one sample and each
row is one gene.
geneSignatureList One or multiple gene signature sets of pathway of
interest in the format of 'gmt' (see MSigDB webset).
groupInfo A TSV format file provides patient ID and
phenotype/disease stage in each row.
geneExpbySample TSV format matrix of gene expression values (FPKM,
TPM, etc.), where each column is one sample and each
row is one gene
geneSignatureList One or multiple gene signature sets from pathway of
interest, in the 'gmt' format (see MSigDB webset)
groupInfo TSV format file, providing patient ID and
phenotype/disease stage in each row
optional arguments:
-h, --help show this help message and exit
-o OUT_DIR, --out-dir OUT_DIR
Output folder name of the analysis.
Name of folder for analysis output
-n NUM_INTERVAL, --num-interval NUM_INTERVAL
Number of KS enrichment calculation processes one
time.
--plotting Making plots to inspect K-S enrichment scores.
Number of parallel processes for KS enrichment calculation
--plotting Makes plots to inspect K-S enrichment scores
```
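
The two tabular inputs described above can be mocked up for a quick format check. The sketch below is only an illustration: the header row, the `gene` column label, the sample IDs, the expression values, the group labels, and the output file names are all assumptions, not taken from the PEGASAS example data.

```python
import csv
import os
import tempfile


def write_tsv(path, rows):
    """Write rows (lists of values) to `path` as tab-separated text."""
    with open(path, "w") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)


def write_toy_inputs(out_dir):
    """Write a toy expression matrix and a toy group file; return both paths."""
    # Assumed layout: a header row of sample IDs, gene names in the first column.
    expression = [
        ["gene", "S1", "S2", "S3"],
        ["MYC", 12.5, 30.1, 45.0],
        ["NPM1", 80.2, 95.7, 120.3],
    ]
    # Assumed layout: one row per sample with its ID, then phenotype/disease stage.
    groups = [["S1", "normal"], ["S2", "primary"], ["S3", "metastatic"]]

    exp_path = os.path.join(out_dir, "geneExpbySample_toy.txt")  # hypothetical name
    group_path = os.path.join(out_dir, "groupInfo_toy.txt")  # hypothetical name
    write_tsv(exp_path, expression)
    write_tsv(group_path, groups)
    return exp_path, group_path


if __name__ == "__main__":
    for path in write_toy_inputs(tempfile.mkdtemp()):
        print(path)
```

Comparing these files against `example/geneExpbySample_example.txt` and `example/groupInfo_example.txt` is the reliable way to confirm the expected layout.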

##### Step 2: Pathway activity-correlated events
PEGASAS can perform correlation analysis to identify pathway-associated events by taking pathway acitivity measurements generated in Step 1 and alternative splicing(or editing, etc) events. Details see below:
PEGASAS can perform correlation analysis to identify pathway-associated events from the pathway activity measurements generated in Step 1 and alternative splicing (or editing, etc.) events. For details of this step, see below:
```
PEGASAS correlation -h
usage: PEGASAS correlation [-h] [-o OUT_DIR] [--GO] [--GO-only]
@@ -82,27 +81,27 @@ usage: PEGASAS correlation [-h] [-o OUT_DIR] [--GO] [--GO-only]
required arguments:
signatureScorebySample
A TSV format list of gene signature score where each
column is one sample and the corresponding score.
PSIbySample A TSV format matrix of PSI values where each column is
TSV format list of gene signature scores, where each
column is one sample and the corresponding score
PSIbySample TSV format matrix of PSI values where each column is
one sample and each row is one splicing event
groupNameOrder A file contains a comma-separated string of group name
groupNameOrder File containing a comma-separated string of group name
orders. The group name should match group info list in
the pathway score calculation step. This is useful for
the heatmap visualization.
the heatmap visualization
optional arguments:
-h, --help show this help message and exit
-o OUT_DIR, --out-dir OUT_DIR
Output folder name of the analysis.
--GO Perform GO analysis.
--GO-only Only perform GO analysis. Needs to provide background
gene list for p-value calculation.
Name of folder for analysis output
--GO Performs GO analysis
--GO-only Only performs GO analysis. Requires to provide background
gene list for p-value calculation (see -b GO_BACKGROUND_GENE_LIST)
-b GO_BACKGROUND_GENE_LIST, --GO-background-gene-list GO_BACKGROUND_GENE_LIST
Provides background gene list for GO analysis bias
correction. This background list should contain genes
involved in the splicing analysis. Required under GO-
only mode.
correction and should contain genes
participated in the splicing analysis. Required for GO-
only mode
```
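
Because the help text says the names in `groupNameOrder` should match the group info used in Step 1, a small pre-flight check can catch mismatches before running the correlation. This is a minimal sketch, not part of PEGASAS. It assumes the group file has no header and stores the group label in its second tab-separated column, and that the order file holds a single comma-separated line.

```python
import os
import tempfile


def read_groups(group_info_path):
    """Collect group labels from the second column of a groupInfo-style TSV."""
    groups = set()
    with open(group_info_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2 and fields[1]:
                groups.add(fields[1])
    return groups


def read_group_order(order_path):
    """Read the comma-separated group names, preserving their order."""
    with open(order_path) as handle:
        text = handle.read().strip()
    return [name.strip() for name in text.split(",") if name.strip()]


def check_group_order(group_info_path, order_path):
    """Return (groups missing from the order file, names unknown to the group file)."""
    groups = read_groups(group_info_path)
    order = set(read_group_order(order_path))
    return sorted(groups - order), sorted(order - groups)


if __name__ == "__main__":
    tmp = tempfile.mkdtemp()
    info = os.path.join(tmp, "groupInfo_toy.txt")  # hypothetical file name
    order = os.path.join(tmp, "groupNameOrder_toy.txt")  # hypothetical file name
    with open(info, "w") as handle:
        handle.write("S1\tnormal\nS2\tprimary\nS3\tmetastatic\n")
    with open(order, "w") as handle:
        handle.write("normal,primary\n")
    missing, unknown = check_group_order(info, order)
    print("missing from order file:", missing)
    print("unknown in order file:", unknown)
```

Both returned lists are empty when the two files agree.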

@@ -112,15 +111,15 @@ Go to PEGASAS folder:
```
cd PEGASAS
```
Use hallmarks50-2.gmt.txt as the signature file. It only contains two gene signatures:
Use hallmarks50-2.gmt.txt as the signature file. This file only contains two gene signatures:
```
PEGASAS pathway -o test example/geneExpbySample_example.txt PEGASAS/data/hallmarks50-2.gmt.txt example/groupInfo_example.txt
```
Use the HALLMARK_MYC_TARGETS_V2 signature activity generated in the last step to perform the correlation analysi:
Use the HALLMARK_MYC_TARGETS_V2 signature activity generated in the last step to perform the correlation analysis:
```
PEGASAS correlation -o test --GO test/HALLMARK_MYC_TARGETS_V2/HALLMARK_MYC_TARGETS_V2.scores.txt example/PSIbySample_example.txt example/groupNameOrder_example.txt
```
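
The two example commands can also be assembled programmatically, which helps when looping over several signatures. The sketch below only builds argument lists and uses only the flags shown in the help output above. The helper names are hypothetical, and the scores path pattern is inferred from the single example path in this README, so it is an assumption rather than a documented convention.

```python
def pathway_command(expression, signatures, group_info, out_dir=None, plotting=False):
    """Build the argument list for the Step 1 `PEGASAS pathway` call."""
    cmd = ["PEGASAS", "pathway"]
    if out_dir:
        cmd += ["-o", out_dir]
    if plotting:
        cmd.append("--plotting")
    return cmd + [expression, signatures, group_info]


def scores_path(out_dir, signature):
    """Guess the Step 1 score file location (pattern inferred from the example)."""
    return "{0}/{1}/{1}.scores.txt".format(out_dir, signature)


def correlation_command(scores, psi, group_order, out_dir=None, go=False):
    """Build the argument list for the Step 2 `PEGASAS correlation` call."""
    cmd = ["PEGASAS", "correlation"]
    if out_dir:
        cmd += ["-o", out_dir]
    if go:
        cmd.append("--GO")
    return cmd + [scores, psi, group_order]


if __name__ == "__main__":
    step1 = pathway_command(
        "example/geneExpbySample_example.txt",
        "PEGASAS/data/hallmarks50-2.gmt.txt",
        "example/groupInfo_example.txt",
        out_dir="test",
    )
    step2 = correlation_command(
        scores_path("test", "HALLMARK_MYC_TARGETS_V2"),
        "example/PSIbySample_example.txt",
        "example/groupNameOrder_example.txt",
        out_dir="test",
        go=True,
    )
    # Print only; pass each list to subprocess.check_call to actually run it.
    print(" ".join(step1))
    print(" ".join(step2))
```

Building lists rather than shell strings avoids quoting problems when paths contain spaces.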
Results can be found under 'test' folder:
Results can be found under the 'test' folder:
```
4.0K GO/
40K HALLMARK_MYC_TARGETS_V2_background_list.txt
@@ -133,7 +132,7 @@ Results can be found under 'test' folder:
579 HALLMARK_MYC_TARGETS_V2.sorted.txt
2.4M refinedBySample.PSIbySample_example.HALLMARK_MYC_TARGETS_V2.sorted.txt
```
__HALLMARK_MYC_TARGETS_V2_high_cor_matrix.txt__: Pathway-associated events with pearson r and permutation p-value.
__HALLMARK_MYC_TARGETS_V2_high_cor_matrix.txt__: Pathway-associated events with Pearson's r and permutation p-value.
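
For readers unfamiliar with the two statistics named above, here is a generic sketch of Pearson's r with a two-sided permutation p-value. It is not the PEGASAS implementation. The function names, the number of permutations, the fixed seed, and the add-one p-value correction are all assumptions made for illustration, and missing PSI values are not handled.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    if denom == 0:
        return float("nan")  # undefined when either input is constant
    return float((xc * yc).sum() / denom)


def permutation_pvalue(x, y, n_perm=1000, seed=0):
    """Return (observed r, two-sided permutation p-value).

    y is shuffled n_perm times; the p-value is the share of shuffles whose
    |r| is at least the observed |r|, with an add-one correction so that
    the estimate is never exactly zero.
    """
    rng = np.random.RandomState(seed)
    y = np.asarray(y, dtype=float)
    observed = pearson_r(x, y)
    hits = 0
    for _ in range(n_perm):
        if abs(pearson_r(x, rng.permutation(y))) >= abs(observed):
            hits += 1
    return observed, (hits + 1.0) / (n_perm + 1.0)


if __name__ == "__main__":
    scores = [0.10, 0.25, 0.32, 0.48, 0.55, 0.71, 0.80, 0.93]  # toy pathway scores
    psi = [0.12, 0.20, 0.35, 0.41, 0.60, 0.66, 0.85, 0.90]  # toy PSI values
    r, p = permutation_pvalue(scores, psi)
    print("r = {:.3f}, permutation p = {:.4f}".format(r, p))
```

With `n_perm` permutations, the smallest p-value this estimator can report is `1 / (n_perm + 1)`.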


### Contact