forked from chrishah/MITObim

MITObim - mitochondrial baiting and iterative mapping

VERSION

1.3

Copyright 2012 - Christoph Hahn

CONTACT

Christoph Hahn [email protected]

INTRODUCTION

This document contains instructions on how to use the MITObim pipeline described in the manuscript "Reconstructing mitochondrial genomes directly from genomic next generation sequencing reads - a baiting and iterative mapping approach" by Hahn et al., submitted to NAR methods online. The pipeline is currently intended for use with Illumina data, but can be readily modified for data from other platforms. The latest version of the wrapper script (including the proofreading function) was uploaded on 31.12.2012. The tutorial will be updated accordingly in early 2013.

PREREQUISITES

General introduction to MITObim

The MITObim procedure (mitochondrial baiting and iterative mapping) is a highly efficient approach to assembling novel mitochondrial genomes of non-model organisms directly from NGS reads derived from total genomic DNA. Labor-intensive long-range PCR steps prior to sequencing are no longer required. MITObim can reconstruct mitochondrial genomes without a reference genome of the targeted species by relying solely on (a) mitochondrial genome information of more distantly related taxa or (b) short mitochondrial barcoding sequences (seeds), such as the commonly used cytochrome oxidase subunit 1 (COI), as a starting reference.

The script iteratively repeats three steps: (i) deriving a reference sequence from the previous mapping assembly, (ii) in silico baiting of reads using the newly derived reference, and (iii) mapping the baited reads to the newly derived reference, which extends the reference sequence. For more details please refer to the manuscript. Detailed examples are demonstrated in the TUTORIALS section below.
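
To make the loop concrete, here is a minimal toy sketch of the bait-and-extend idea in Python. It is not the MITObim.pl implementation and does not call MIRA: baiting is approximated by exact k-mer sharing, and the mapping step is replaced by exact end-overlap extension. All function names and the parameters `k`, `min_overlap` and `max_iterations` are hypothetical and chosen for illustration only.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def bait(reads, reference, k):
    """Step (ii): keep every read that shares at least one k-mer with the reference."""
    ref_kmers = kmers(reference, k)
    return [read for read in reads if kmers(read, k) & ref_kmers]


def extend(reference, baited, min_overlap):
    """Step (iii), toy version: grow the reference with reads that overhang an end.

    A real mapping assembler handles mismatches, qualities and consensus calling;
    this stand-in only accepts exact overlaps of at least min_overlap bases.
    """
    changed = True
    while changed:
        changed = False
        for read in baited:
            if read in reference:
                continue  # read is already fully contained, nothing to add
            if reference in read:
                reference, changed = read, True  # read spans the whole reference
                continue
            # Try the longest possible overlap first, then shorter ones.
            for ov in range(len(read) - 1, min_overlap - 1, -1):
                if reference.endswith(read[:ov]):     # read overhangs the right end
                    reference, changed = reference + read[ov:], True
                    break
                if reference.startswith(read[-ov:]):  # read overhangs the left end
                    reference, changed = read[:-ov] + reference, True
                    break
    return reference


def iterate_baiting(seed, reads, k=12, min_overlap=12, max_iterations=20):
    """Repeat bait -> extend until the reference stops growing."""
    reference = seed
    for _ in range(max_iterations):
        baited = bait(reads, reference, k)
        extended = extend(reference, baited, min_overlap)
        if extended == reference:
            break  # converged: no baited read extends the reference
        reference = extended  # step (i): new reference for the next iteration
    return reference
```

Calling `iterate_baiting(seed, reads)` with a short seed and a list of overlapping read strings returns the longest sequence the toy procedure can reach from that seed. Because only reads sharing a k-mer with the current reference are baited, each iteration can grow the reference by at most roughly one read length per end, which is why the procedure has to be repeated.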

TUTORIALS

The following tutorials are designed for users with little Unix experience and no previous MIRA experience. Tutorials I & II demonstrate how to recover the complete mitochondrial genome of Thymallus thymallus using the mitochondrial genome of Salvelinus alpinus as a starting reference. Tutorial III achieves the same goal using only a ~700 bp barcoding sequence as the initial seed reference. Tutorial IV (to be finished early 2013) uses a proofreading procedure to specifically reconstruct two mitochondrial genomes from a mixed sample containing genomic reads from two species.

Preparations:

  • download the MITObim wrapper script MITObim.pl and make it executable (chmod a+x MITObim.pl)
  • download testdata1.tgz and testdata2.tgz and extract the contents (tar xvfz testdata?.tgz)

Test the wrapper script by running it without any parameters:

-bash-4.1$ ~/PATH/TO/MITObim.pl

which should display the usage:

usage: ./MITObim.pl <parameters>

parameters:

            -start <int>            iteration to start with, default=1
            -end <int>              iteration to end with, default=1
            -strain <string>        strainname as used in initial MIRA assembly
            -ref <string>           referencename as used in initial MIRA assembly
            -readpool <PATH>        path to readpool in fastq format
            -maf <PATH>             path to maf file from previous MIRA assembly


optional:

            --denovo                runs MIRA in denovo mode, default: mapping
            --pair                  finds pairs after baiting, default: no
            --quick <PATH>          starts process with initial baiting using provided fasta reference
            --noshow