Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AS events of a single sample #418

Open
biochristmas opened this issue Jul 14, 2024 · 7 comments
Open

AS events of a single sample #418

biochristmas opened this issue Jul 14, 2024 · 7 comments

Comments

@biochristmas
Copy link

Hi, If I only need to obtain the AS count for a single sample (experimental group or control group), I execute this command: rmats.py --b1 b2.txt --gtf isoquant_gffcompare.gtf -t paired --bi ./STAR_binary_index --readLength 150 --nthread 30 --od ./test1 --libType fr-unstranded --statoff --tmp ./test1/tmp_output. The AS count in the resulting summary.txt file represents the number of alternative splicing events for that sample?
summary

@EricKutschera
Copy link
Contributor

The TotalEvents columns in the summary file are the total number of events in the MATS output files: #257 (comment)

Since you only have --b1 without --b2 the events are not filtered and could be from the --gtf without supporting reads: https://groups.google.com/g/rmats-user-group/c/SVlOJs4CwVk/m/IIzvDAXrBgAJ

You could get an AS event count by filtering on the read count (IJC_SAMPLE_1, SJC_SAMPLE_1) and PSI (IncLevel1) columns: #183 (comment)

@biochristmas
Copy link
Author

Hi, @EricKutschera I will filter based on Average PSI (IncLevel) within 0.05 and 0.95. I have three replicates, as an example (first row of the image), Average PSI (IncLevel) = (NA+1.0+0.0)/3. However, there are NA values present. How should I handle this。
SE MATS JC

@EricKutschera
Copy link
Contributor

You can try different methods for handling NA. One option is to filter out events that have any NA value in IncLevel1. It looks like only a few rows in the screenshot have NA IncLevel1. You could also keep events with only 1 NA value in IncLevel1 and average the remaining values. In the example you mentioned you could use (1 + 0)/2=0.5 as the average PSI

@biochristmas
Copy link
Author

Hi, @EricKutschera
I have another question. IncLevel1 = SJC_SAMPLE_1 / (SJC_SAMPLE_1 + IJC_SAMPLE_2). For example, in the second row of the image, the value for the third replicate should be 0.06: 2/(2+31)=0.06. However, IncLevel1 = 0.037.
SE MATS JC_2

@EricKutschera
Copy link
Contributor

IncLevel1 is (IJC_SAMPLE_1/IncFormLen) / ((IJC_SAMPLE_1/IncFormLen) + (SJC_SAMPLE_1/SkipFormLen)) : #349

Your screenshot doesn't show the IncFormLen and SkipFormLen columns for that event, but based on the coordinates and --readLength 150 they should be 252 and 149. Then IncLevel=(2/252)/((2/252)+(31/149)) = 0.0367

@biochristmas
Copy link
Author

@EricKutschera
Thank you for your prompt response. The issue has been resolved. Best regards and good luck!

@EricKutschera
Copy link
Contributor

Yes rMATS can detect events just from the GTF file. See this comment which has a link to an empty bam file: #79 (comment)

You could find the set of annotated events by running rmats with a GTF file and with a mostly empty input

Based on the screenshot I think rMATS will only detect 2 RI events. See this post for details about the intron detection code: #17

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants