Generating proteogenomic database for Pseudomonas with VCF called from WGS (or exome seq) data #185
Comments
You want to run Spritz on Pseudomonas?
Spritz is currently built to call variants from eukaryotes with RNA-Seq data, so this would take a new workflow. What type of sequencing data do you have for the sample (e.g. exome, genome)? Here's the Ensembl genome for Pseudomonas: https://bacteria.ensembl.org/Pseudomonas_aeruginosa_pao1/Info/Index. There's no reference VCF like the one we're using for human in GATK.
We would also need to implement support for other codon tables for this feature (#164).
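To illustrate why codon tables matter here, below is a minimal Python sketch of translating a coding sequence with NCBI translation table 11 (Bacterial, Archaeal and Plant Plastid). Table 11 assigns the same amino acids as the standard code but accepts additional start codons, which are read as methionine at the first position. The function name `translate_cds` is hypothetical and this is not how Spritz implements translation; it is only an illustration under those assumptions.

```python
# Sketch only: translate_cds is a hypothetical helper, not part of Spritz.
# Codon order follows the NCBI genetic-code listing (bases in T, C, A, G order).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Start codons listed for NCBI translation table 11.
TABLE_11_STARTS = frozenset({"TTG", "CTG", "ATT", "ATC", "ATA", "ATG", "GTG"})


def translate_cds(cds, starts=TABLE_11_STARTS):
    """Translate a CDS, reading any accepted start codon at position 0 as Met."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if i == 0 and codon in starts:
            amino_acid = "M"  # alternative starts are still initiated with Met
        else:
            amino_acid = CODON_TABLE.get(codon, "X")  # X for ambiguous codons
        if amino_acid == "*":
            break  # stop codon ends the protein
        protein.append(amino_acid)
    return "".join(protein)
```

For example, a gene beginning with `GTG` yields a protein starting with `M` under table 11, whereas passing `starts={"ATG"}` (the behaviour of a translator that only recognises `ATG`) would start it with `V`.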
I have WGS data for this bacterium, which seems to have diverged from the main strain based on the assembly, so using the canonical proteome is clearly suboptimal. I see that a GFF is available at ftp://ftp.ensemblgenomes.org/pub/bacteria/current/gff3/bacteria_67_collection/pseudomonas_aeruginosa/, so one could probably use it to call the variants and create a strain-specific VCF?
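As a rough illustration of what a strain-specific VCF gives you, here is a minimal Python sketch that applies single-nucleotide variants from VCF records to a reference sequence, treating the genome as haploid. The function name `apply_snvs` and the contig name used in the example are hypothetical; indels, multi-allelic sites, and genotype fields are deliberately ignored, so this is an assumption-laden sketch and not a replacement for a proper variant-annotation tool.

```python
# Sketch only: apply_snvs is a hypothetical helper that handles SNVs and nothing else.
def apply_snvs(reference, vcf_lines, contig):
    """Return the reference sequence with the contig's SNVs substituted in."""
    seq = list(reference.upper())
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])  # POS is 1-based in VCF
        ref, alt = fields[3].upper(), fields[4].upper()
        if chrom != contig:
            continue
        if len(ref) != 1 or len(alt) != 1 or alt not in "ACGT":
            continue  # skip indels, multi-allelic sites, and missing ALT
        if seq[pos - 1] != ref:
            raise ValueError(f"REF mismatch at {chrom}:{pos}")
        seq[pos - 1] = alt
    return "".join(seq)


example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "Chromosome\t4\t.\tA\tG\t.\tPASS\t.",    # SNV: applied
    "Chromosome\t7\t.\tT\tTAA\t.\tPASS\t.",  # insertion: skipped by this sketch
]
variant_seq = apply_snvs("ATGAAATAA", example_vcf, "Chromosome")
```

The REF check is there to catch a VCF that was called against a different reference build than the sequence being edited.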
This is definitely a good direction to take Spritz. It's also good that the GFF file is available. I know @rmmiller22 was working on vervet monkey samples, which had that situation, i.e. no reference VCF available. I unfortunately don't have the bandwidth to add this feature to Spritz right now, but we'll keep you posted as we work towards this goal. By the way, what tool do you typically use to align WGS reads to bacterial genomes? Bowtie/BWA?
Oh, an option in the meantime is that you could generate a VCF file for your sample using other means and run it through the custom SnpEff fork that is part of Spritz with the options
I am wondering how I can add something like Pseudomonas aeruginosa?
The FASTA file for the reference proteome is available at https://www.uniprot.org/proteomes/UP000002438; any ideas on how to proceed would be appreciated :)