Enabling GVCF file square off when starting from existing RNAseq BAM and GVCF files #3355
Comments
@roryk I had a look myself and I am making some progress. I have modified the following function as shown below, and this results in the input GVCF files being squared off.
Original
New
Complete modified function:
Later, though, the joint RNAseq analysis from GVCF crashes.
I'll also have a look at fixing this or working around it. Let me know if you have any tips. After checking that joint analysis from GVCF works, and that normal analysis from FASTQ still works, I am happy to submit a pull request with the changes. Thank you. Wim
Edits in 2 more places were needed to allow setting data["vrn_file_orig"] and running soft filtering.
bcbio-nextgen/bcbio/variation/joint.py Line 201 in dd79b71
bcbio-nextgen/bcbio/pipeline/rnaseq.py Line 152 in a2f8fa0
The RNAseq analysis from GVCF now works and finishes. I have not yet tried passing in a splice site file computed from the reference genome gene model, but I think this should work and be reasonably comparable to using the splice sites identified by STAR. I will send you a pull request later this week with the few lines that I needed to change.
Created a simplified pull request in #3361
Version info
bcbio version (bcbio_nextgen.py --version): 1.1.5
To Reproduce
Exact bcbio command you have used:
Your sample configuration file:
Repeated for many RNAseq samples.
Observed behavior
Per sample:
Over all samples:
Expected behavior
The same as the observed behavior, with the important additions of:
The squared off multi-sample VCF file is made when starting with existing BAM and GVCF files for whole genome sequencing data. #2336
It would be nice if bcbio could do the same for existing RNAseq GVCF files.
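To make the request concrete, here is a toy sketch of what squaring off means: every sample gets a genotype at every site called in any sample, with GVCF reference blocks used to tell homozygous reference apart from no-call. This is only an illustration under simplified assumptions. The `square_off` function and its dictionary inputs are hypothetical and are not bcbio code or its API.

```python
from typing import Dict, List, Tuple

Site = Tuple[str, int]              # (chromosome, position)
RefBlock = Tuple[str, int, int]     # (chromosome, start, end), inclusive


def square_off(
    variants: Dict[str, Dict[Site, str]],
    ref_blocks: Dict[str, List[RefBlock]],
) -> Dict[Site, Dict[str, str]]:
    """Build a site-by-sample genotype table (hypothetical helper).

    variants:   sample -> {site: genotype string} for variant calls.
    ref_blocks: sample -> intervals reported as confident hom-ref in the GVCF.
    """
    samples = sorted(variants)
    # Union of all variant sites seen in any sample.
    sites = sorted({site for calls in variants.values() for site in calls})

    table: Dict[Site, Dict[str, str]] = {}
    for chrom, pos in sites:
        row = {}
        for sample in samples:
            if (chrom, pos) in variants[sample]:
                # The sample has its own call at this site.
                row[sample] = variants[sample][(chrom, pos)]
            elif any(c == chrom and start <= pos <= end
                     for c, start, end in ref_blocks.get(sample, [])):
                # Covered by a reference block: homozygous reference.
                row[sample] = "0/0"
            else:
                # No evidence either way: no-call.
                row[sample] = "./."
        table[(chrom, pos)] = row
    return table


variants = {
    "s1": {("chr1", 100): "0/1"},
    "s2": {("chr1", 200): "1/1"},
}
ref_blocks = {"s1": [("chr1", 150, 250)], "s2": []}
result = square_off(variants, ref_blocks)
```

In this toy input, s1 has a reference block covering chr1:200, so it is filled in as homozygous reference there, while s2 has no information at chr1:100 and stays a no-call.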
Could you please have a look, or point me to where I would need to edit the bcbio RNAseq code to run the square off of the GVCF files? Please also indicate any likely complications that would come up when enabling this.
Thank you.
Wim