A set of genome assembly exercises for the ONT education platform

demharters/assemblyTutorial

Genome assembly tutorial

This tutorial is aimed at researchers with little background in bioinformatics who would like
to start learning about genome assembly. Basic knowledge of the Linux command line is required
(for an introduction, see https://linuxcommand.org).

Overview

In whole genome sequencing (WGS), researchers usually want to recover the original genomic sequence of their sample. However, the genome is fragmented during preparation of the DNA sequencing library, so the order of the individual fragments is lost. The sequenced fragments (reads) therefore need to be stitched back together into their original configuration, a process called genome assembly.

This tutorial explains the two main approaches to genome assembly: 1) the alignment (or mapping) of reads to a reference sequence and 2) the reconstruction of the genomic sequence without a reference (de-novo assembly).
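To make the reference-free idea concrete, here is a minimal toy sketch in Python. It is not the method used by any real assembler and not part of the tutorial's workflows: the genome string, the function names (`fragment`, `overlap`, `greedy_assemble`) and the parameters are all made up for illustration, and the reads are assumed to be error-free. It cuts a short sequence into overlapping reads, shuffles them so their order is lost, and then greedily merges the pair with the longest suffix-prefix overlap until nothing more can be merged.

```python
import random

# Hypothetical toy genome (24 bases), chosen so that every 5-mer is unique.
GENOME = "ATGGCGTACCTGAAGTTCGACTAG"


def fragment(genome, read_len, step):
    """Cut overlapping reads from the genome, then shuffle them to lose their order."""
    reads = [genome[i:i + read_len]
             for i in range(0, len(genome) - read_len + 1, step)]
    random.Random(0).shuffle(reads)  # fixed seed so the demo is repeatable
    return reads


def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=5):
    """Repeatedly merge the two sequences that share the longest overlap."""
    seqs = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while len(seqs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no overlaps left: the remaining pieces are separate contigs
        merged = seqs[best_i] + seqs[best_j][best_n:]
        seqs = [s for k, s in enumerate(seqs) if k not in (best_i, best_j)]
        seqs.append(merged)
    return seqs


reads = fragment(GENOME, read_len=12, step=4)
contigs = greedy_assemble(reads, min_len=5)
print(contigs)
```

Real data is much harder than this toy case: reads contain errors, genomes contain repeats longer than the overlaps, and comparing every pair of reads does not scale. Dedicated assemblers deal with these problems in ways that this sketch ignores.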

Motivation

A researcher observes a known bacterial strain with an unusual phenotype and would like to sequence its genome to identify the responsible genetic change. Genome assembly by reference alignment would allow the identification of small alterations in the sequence, such as single nucleotide polymorphisms (SNPs), insertions or deletions. However, larger alterations that are not present in the reference sequence, such as duplication events, would be missed. A better approach for detecting these kinds of new structural variations is de-novo assembly. This method requires no prior knowledge of the original sequence; instead, it attempts to reconstruct the genome from the reads alone. Both alignment and assembly have their pros and cons, and they often go hand in hand during genome analysis.
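The reference-based side of this scenario can be sketched in a few lines of Python. This is a toy illustration only, under strong assumptions: the reference and sample strings are invented, the reads are error-free, and the helper names (`best_position`, `call_snps`) are hypothetical. Each read is placed at the reference offset with the fewest mismatches (no gaps are allowed, so insertions and deletions are out of scope here), the placed reads are piled up, and a position is reported when the majority base differs from the reference.

```python
from collections import Counter

# Hypothetical reference and a sample that differs by one SNP (T -> A at index 10).
REFERENCE = "ATGGCGTACCTGAAGTTCGACTAG"
SAMPLE = REFERENCE[:10] + "A" + REFERENCE[11:]


def best_position(read, reference):
    """Return the offset where the read has the fewest mismatches against the reference."""
    def mismatches(pos):
        window = reference[pos:pos + len(read)]
        return sum(r != g for r, g in zip(read, window))
    return min(range(len(reference) - len(read) + 1), key=mismatches)


def call_snps(reads, reference):
    """Pile up the placed reads and list positions whose majority base differs."""
    pileup = [Counter() for _ in reference]
    for read in reads:
        pos = best_position(read, reference)
        for offset, base in enumerate(read):
            pileup[pos + offset][base] += 1
    snps = []
    for i, counts in enumerate(pileup):
        if counts:  # skip positions that no read covers
            base, _ = counts.most_common(1)[0]
            if base != reference[i]:
                snps.append((i, reference[i], base))  # (index, reference base, sample base)
    return snps


# Overlapping 12-base reads taken from the sample, not from the reference.
sample_reads = [SAMPLE[i:i + 12] for i in range(0, len(SAMPLE) - 12 + 1, 4)]
print(call_snps(sample_reads, REFERENCE))
```

Note what this sketch cannot do: reads from a duplicated region would simply pile up on the single copy present in the reference, so the duplication itself would not appear in the list of differences. That is the limitation that motivates de-novo assembly.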

Objectives:

Provide:

  • a basic understanding of genome assembly
  • a workflow for assembly by alignment
  • a workflow for de-novo assembly

Tutorial structure

  1. Introduction to genome assembly
  2. Assembly by alignment - workflow example
  3. De-novo assembly - workflow example

Dataset

If you haven't got your own data you may use this dataset and this reference sequence.

Data formats

For a short description of the data formats, see here.

Further reading:
