Few questions regarding rMATS #400

Open
tanya-lasagne opened this issue May 7, 2024 · 5 comments

Comments

@tanya-lasagne

Hello rMATS community,
I'm new to alternative splicing (AS) analysis and seeking guidance on a few topics. Any help would be greatly appreciated!

  1. When running a paired analysis with the --paired-stats flag, should I focus on candidate genes with low p-values? I'm currently comparing soybeans grown under optimal conditions vs. deviated conditions.

  2. Does rMATS calculate the percentage or likelihood of a splicing variant occurring within the specified conditions? Or perhaps provide any metrics for the frequency or abundance of these variants?

Thank you!

@EricKutschera
Contributor

Selecting events with a low p-value is reasonable. This post has some suggestions about cutoffs: #320

The IncLevel columns (IncLevel1, IncLevel2) are PSI (Percent Spliced In) values. IncLevel is the proportion of the inclusion isoform found for each event. The README shows which isoform is the inclusion isoform for each event type: https://github.com/Xinglab/rmats-turbo/tree/v4.3.0?tab=readme-ov-file#output

Columns like IJC_SAMPLE_1 and SJC_SAMPLE_1 give the supporting read counts for the inclusion and skipping isoforms, respectively.
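
To make the relationship between the read counts and the IncLevel values concrete, here is a minimal Python sketch. It assumes the length-normalized PSI formula, (I/lenI) / (I/lenI + S/lenS), and uses the IncFormLen and SkipFormLen columns that the README lists alongside the count columns; those columns are not mentioned in this thread. The row below is made-up data, and the helper names `parse_counts` and `inclusion_level` are hypothetical, not part of rMATS.

```python
def parse_counts(field):
    """Split a comma-separated per-replicate count field into integers."""
    return [int(x) for x in field.split(",")]


def inclusion_level(ijc, sjc, inc_form_len, skip_form_len):
    """Length-normalized PSI for one replicate.

    Returns None when no reads support either isoform.
    """
    inc = ijc / inc_form_len      # inclusion reads per unit of effective length
    skip = sjc / skip_form_len    # skipping reads per unit of effective length
    total = inc + skip
    if total == 0:
        return None
    return inc / total


# Made-up row for illustration only (two replicates in sample group 1).
row = {
    "IJC_SAMPLE_1": "30,45",
    "SJC_SAMPLE_1": "10,15",
    "IncFormLen": "200",
    "SkipFormLen": "100",
}

psi_values = [
    inclusion_level(i, s, int(row["IncFormLen"]), int(row["SkipFormLen"]))
    for i, s in zip(parse_counts(row["IJC_SAMPLE_1"]),
                    parse_counts(row["SJC_SAMPLE_1"]))
]
print(psi_values)  # both replicates come out close to 0.6
```

The raw count ratio here would be 30 / (30 + 10) = 0.75, so the length normalization matters when comparing against the reported IncLevel values.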

@tanya-lasagne
Author

Thank you Eric!
Last question: I also noticed that some FDR and p-values are reported as 0. Should I omit those events, or is rMATS detecting extreme significance in those cases?

@EricKutschera
Contributor

Zero values for FDR or p-value should be interpreted as very significant. The software has some numerical limits and very small values become zero. Here's a related post: https://groups.google.com/g/rmats-user-group/c/TW534af62fg/m/tZXBs0Y4BAAJ
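
In downstream scripts, this means zero values should be kept rather than filtered out, and they only need special handling when taking a logarithm (for example, for a volcano plot). A minimal Python sketch follows. The event records are made up, the 0.05 cutoff is only an illustrative assumption (not a recommendation from this thread), and the floor value is an arbitrary choice for plotting, not the actual numerical limit inside rMATS.

```python
import math
import sys


def neg_log10(p, floor=sys.float_info.min):
    """Return -log10(p), clamping zeros to a tiny positive floor.

    The floor is an arbitrary plotting convenience, not an rMATS constant.
    """
    return -math.log10(max(p, floor))


# Made-up events for illustration only.
events = [
    {"ID": "1", "PValue": 0.0, "FDR": 0.0},
    {"ID": "2", "PValue": 0.003, "FDR": 0.04},
    {"ID": "3", "PValue": 0.4, "FDR": 0.7},
]

FDR_CUTOFF = 0.05  # illustrative threshold, chosen for this example

# A zero FDR passes the cutoff like any other small value.
significant = [e for e in events if e["FDR"] <= FDR_CUTOFF]

# Rank by significance; the zero p-value sorts first instead of raising an error.
ranked = sorted(significant, key=lambda e: neg_log10(e["PValue"]), reverse=True)
print([e["ID"] for e in ranked])
```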

@tanya-lasagne
Author

tanya-lasagne commented May 17, 2024

Hi Eric,
Thanks again for your assistance! I have a few follow-up questions:

1) When analyzing skipped exon events using rMATS, how can I precisely identify which exon is skipped (e.g., exon 3) within each gene (e.g., gene A)? Are there specific output files or columns that indicate this information?

2) Does rMATS produce gene expression data, such as read counts or FPKM values, for the splicing variants detected? I aim to integrate this data into a gene network inference software to predict gene-to-gene interactions and understand how splicing variants affect gene regulation. If rMATS doesn't provide this information, are there other tools or pipelines you recommend for obtaining splicing variant expression data?

@EricKutschera
Contributor

In files like SE.MATS.JC.txt the columns exonStart_0base and exonEnd give the coordinates of the exon being skipped:
https://github.com/Xinglab/rmats-turbo/tree/v4.3.0?tab=readme-ov-file#output
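
As a rough illustration, the Python sketch below reads those two columns and turns them into a genomic interval. The sample row is made up, and the helper name `skipped_exon_region` is hypothetical. Because the start column is 0-based, the sketch adds 1 to get a 1-based start. Note that these are coordinates, not an exon number; assigning a label such as "exon 3" would presumably require matching the interval against the exons of a transcript in the annotation GTF.

```python
import csv
import io

# Made-up excerpt of an SE.MATS.JC.txt file (tab-separated), for illustration only.
SAMPLE = (
    "ID\tGeneID\tgeneSymbol\tchr\tstrand\texonStart_0base\texonEnd\n"
    "1\tGENE_A\tgeneA\tchr1\t+\t1000\t1150\n"
)


def skipped_exon_region(row):
    """Return (gene ID, 1-based region string, exon length) for one SE event."""
    start0 = int(row["exonStart_0base"])  # 0-based start of the skipped exon
    end = int(row["exonEnd"])             # end coordinate of the skipped exon
    region = f"{row['chr']}:{start0 + 1}-{end}"
    return row["GeneID"], region, end - start0


reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
regions = [skipped_exon_region(row) for row in reader]
print(regions)
```

For a real run, `io.StringIO(SAMPLE)` would be replaced with an open file handle on the rMATS output file.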

rMATS doesn't output gene expression. It outputs the counts of reads that support the inclusion and skipping isoform of each event in columns like IJC_SAMPLE_1. Potentially kallisto could output what you are looking for: https://pachterlab.github.io/kallisto/manual
