TE annotation pipeline for scanning SINE elements

Kapeel/sine

SINE background

SINEs are transcribed by RNA polymerase III and are derived from one of three classes of Pol III–transcribed molecules (tRNA, 7SL RNA, 5S rRNA). While animal SINEs from all three classes are known, plant SINEs are exclusively derived from tRNA. To find SINEs, I apply the implementation of Wenke et al. 2011 in SINE-finder, which searches for tRNA-derived SINEs containing RNA polymerase III A and B boxes upstream of a polyA tail. By default, the A and B box consensus sequences are RVTGG and GTTCRA (IUPAC codes), with a 25–50 bp spacer between the A and B boxes and a 20–500 bp spacer between the B box and the polyA tail.
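As an illustration of the default search pattern described above, here is a minimal Python sketch that turns the two box motifs and spacer ranges into a regular expression. It is not SINE-finder's own code: the function names, the `[ACGTN]` spacer alphabet, and the minimum polyA length of 6 are assumptions made for this example, and it only scans the forward strand of an uppercase sequence.

```python
import re

# IUPAC ambiguity codes used by the two default box motifs.
IUPAC = {"R": "[AG]", "V": "[ACG]"}


def iupac_to_regex(motif):
    """Expand IUPAC codes in a motif into regex character classes."""
    return "".join(IUPAC.get(base, base) for base in motif)


def build_sine_pattern(a_box="RVTGG", b_box="GTTCRA",
                       spacer_ab=(25, 50), spacer_b_tail=(20, 500),
                       min_poly_a=6):
    """Compile a regex: A box, spacer, B box, spacer, polyA tail.

    min_poly_a is an assumption for this sketch; the README does not
    state how long the polyA tail must be.
    """
    parts = [
        "(?P<a_box>" + iupac_to_regex(a_box) + ")",
        "[ACGTN]{%d,%d}" % spacer_ab,          # spacer between A and B boxes
        "(?P<b_box>" + iupac_to_regex(b_box) + ")",
        "[ACGTN]{%d,%d}?" % spacer_b_tail,     # lazy: stop at the first tail
        "(?P<poly_a>A{%d,})" % min_poly_a,
    ]
    return re.compile("".join(parts))


# Synthetic candidate: A box, 30 bp spacer, B box, 25 bp spacer, 8 bp polyA.
candidate = "GATGG" + "C" * 30 + "GTTCAA" + "C" * 25 + "A" * 8
match = build_sine_pattern().search(candidate)
print(match.group("a_box"), match.group("b_box"), match.span())
```

The spacer between the B box and the tail is matched lazily so that the match ends at the nearest qualifying polyA run rather than the farthest one within 500 bp.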

How to identify SINEs

The script run_sines.sh downloads sine_finder, runs it, parses the results, assigns candidates to families, and outputs a GFF. Candidate SINEs are clustered into families using VSEARCH and SiLiX and, if specified in the config file, added to existing families. Each family is also matched to Maize TE Consortium exemplars (via 80-80-80 identity) to facilitate comparison between genome versions.
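The 80-80-80 criterion mentioned above is commonly stated as at least 80% identity, over at least 80% of the sequence length, with an aligned region of at least 80 bp. The sketch below checks those three thresholds for a single alignment. How run_sines.sh actually computes the values is not described here, so the function name and its inputs (percent identity, alignment length, and query length taken from some pairwise alignment) are assumptions for illustration.

```python
def passes_80_80_80(identity_pct, aln_len, query_len,
                    min_identity=80.0, min_coverage=0.80, min_aln_len=80):
    """Return True if an alignment meets all three 80-80-80 thresholds.

    identity_pct: percent identity of the aligned region (0-100)
    aln_len:      length of the aligned region in bp
    query_len:    full length of the query sequence in bp
    """
    if query_len <= 0:
        raise ValueError("query_len must be positive")
    coverage = aln_len / query_len
    return (identity_pct >= min_identity
            and coverage >= min_coverage
            and aln_len >= min_aln_len)


# A 120 bp candidate aligned over 100 bp at 85% identity.
print(passes_80_80_80(85.0, 100, 120))
```

Keeping the thresholds as keyword arguments makes it easy to see which of the three conditions a borderline hit fails.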

Output

  • ${GENOMENAME}.RST.gff3 contains all SINEs identified.
  • ${GENOMENAME}.RST.tabout contains extended information, such as the TSD length and mismatch.
  • ${GENOMENAME}.RST.fa contains all SINEs identified, renamed with their TEID, in FASTA format.
  • post-${GENOMENAME}.existingRST.fa combines all previously identified copies with the newly identified ones. These need to be deposited somewhere so the next annotator can use them to add to existing families. Jeff and I have talked about MaizeGDB hosting these via a link, so somebody annotating can download the existing copies and upload the updated file when finished.
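To inspect the GFF3 output, a small parser along these lines may help. It relies only on the standard nine tab-separated GFF3 columns; the example line, including its source, feature type, and ID value, is invented for illustration and may not match what the pipeline writes.

```python
def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict (standard nine columns)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    seqid, source, ftype, start, end, score, strand, phase, attrs = fields
    # Column 9 holds semicolon-separated key=value pairs.
    attributes = dict(item.split("=", 1) for item in attrs.split(";") if item)
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),   # GFF3 coordinates are 1-based, inclusive
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


def read_gff3(lines):
    """Yield parsed features, skipping comments and blank lines."""
    for line in lines:
        if line.strip() and not line.startswith("#"):
            yield parse_gff3_line(line)


# Hypothetical record; the real file's column values may differ.
example = ["##gff-version 3\n",
           "chr1\tsine_finder\tSINE_element\t1000\t1250\t.\t+\t.\tID=RST00001\n"]
for feature in read_gff3(example):
    length = feature["end"] - feature["start"] + 1
    print(feature["attributes"]["ID"], feature["seqid"], length)
```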
