Sequeduct



Sequencing analysis pipeline (aqueduct) for validating plasmids and DNA assembly constructs, using long reads.



Install Nextflow and Docker.

Pull the Nextflow pipeline:

nextflow pull edinburgh-genome-foundry/Sequeduct -r v0.2.3

Pull the Docker image that contains the required software (requires access to EGF's container repo):

docker pull

Alternatively, build the image locally from the cloned repo:

docker build . -f containers/Dockerfile --tag sequeduct_local


Create a directory for your project and copy (or link) the FASTQ directories from your Nanopore run (e.g. into fastq). Specify this together with a sample sheet in your commands:

# Preview
nextflow run edinburgh-genome-foundry/Sequeduct -r v0.2.3 -entry preview --fastq_dir='fastq' \
    --reference_dir='genbank' \
    --sample_sheet='sample_sheet.csv' \
    -profile docker
# Analysis
nextflow run edinburgh-genome-foundry/Sequeduct -r v0.2.3 -entry analysis --fastq_dir='fastq' \
    --reference_dir='genbank' \
    --sample_sheet='sample_sheet.csv' \
    --projectname='EGF project' \
    -profile docker
# Review
nextflow run edinburgh-genome-foundry/Sequeduct -r v0.2.3 -entry review --reference_dir='genbank' \
    --results_csv='results_sheet.csv' \
    --projectname='EGF project review' \
    --all_parts='parts_fasta/part_sequences.fasta' \
    --assembly_plan='assembly_plan.csv' \
    -profile docker
# De novo assembly
nextflow run edinburgh-genome-foundry/Sequeduct -r v0.2.3 -entry assembly --fastq_dir='fastq' \
    --results_csv='assembly_sheet.csv' \
    -profile docker 

Each of the commands above writes its output to a directory within results. Nextflow also creates and uses a directory named work, so make sure that your project directory does not already contain one. Specify the revision of the pipeline with -r (a git branch or tag), and choose a configuration profile with -profile. Profiles are defined in the Nextflow config files.
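As a small pre-flight check for the note about the work directory, the following sketch reports whether a project directory already contains one. The function name check_work_dir is hypothetical and not part of Sequeduct or Nextflow.

```shell
# Hypothetical helper (not part of Sequeduct): fail if the project directory
# already contains an entry named 'work', which Nextflow would reuse.
# Usage: check_work_dir [project_dir]   (defaults to the current directory)
check_work_dir() {
    project_dir="${1:-.}"
    if [ -e "$project_dir/work" ]; then
        echo "$project_dir/work already exists; move or remove it before running" >&2
        return 1
    fi
    return 0
}
```

Running check_work_dir from the project directory before the first nextflow run command returns a non-zero status if a work entry is already present.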

Use -with-docker sequeduct_local to use a locally built Docker image (instead of -profile docker).


For simplicity, the names in the sample sheet are used to find the reference GenBank files, so each sample name must match the name of a file with a ".gb" extension.
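The naming rule above can be checked before a run with a sketch like the one below. It rests on assumptions that are not taken from the Sequeduct documentation: the sample sheet is a plain comma-separated file without a header row or quoted fields, and the sample name is in the column passed as the third argument (default 2). The function name missing_references is hypothetical.

```shell
# Hypothetical helper (not part of Sequeduct): print every sample name that
# has no matching <name>.gb file in the reference directory.
# ASSUMPTIONS: header-less CSV, no quoted fields, sample name in column
# $3 (default 2). Adjust the column to match your own sample sheet.
# Usage: missing_references sample_sheet.csv genbank [name_column]
missing_references() {
    sample_sheet="$1"
    reference_dir="$2"
    name_column="${3:-2}"
    # cut extracts the name column; tr drops carriage returns left by
    # spreadsheet exports; sort -u removes duplicate names.
    cut -d ',' -f "$name_column" "$sample_sheet" | tr -d '\r' | sort -u |
    while IFS= read -r name; do
        [ -z "$name" ] && continue
        [ -f "$reference_dir/$name.gb" ] || echo "$name"
    done
}
```

An empty output means that every sample name found in the chosen column has a corresponding ".gb" file.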

Note that canu v2.2 requires a minimum of 100 reads and returns an error otherwise. A fix has been posted, but it has not been released yet.
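To see in advance whether a sample meets the 100-read minimum, the reads can be counted before starting the de novo assembly. The sketch below assumes standard four-line FASTQ records and that files ending in .gz are gzip-compressed. The function names count_reads and enough_reads_for_canu are hypothetical.

```shell
# Hypothetical helpers (not part of Sequeduct or canu).
# ASSUMPTION: every FASTQ record spans exactly four lines.

# Print the total number of reads in the given FASTQ files (plain or .gz).
count_reads() {
    total_lines=0
    for f in "$@"; do
        case "$f" in
            *.gz) n=$(gzip -cd "$f" | wc -l) ;;
            *)    n=$(wc -l < "$f") ;;
        esac
        total_lines=$((total_lines + n))
    done
    echo $((total_lines / 4))
}

# Succeed only if the files together hold at least 100 reads,
# the minimum stated above for canu v2.2.
enough_reads_for_canu() {
    min_reads=100
    [ "$(count_reads "$@")" -ge "$min_reads" ]
}
```

For example, assuming one subdirectory per barcode, enough_reads_for_canu fastq/barcode01/*.fastq.gz returns a non-zero status when that barcode has fewer than 100 reads.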

For convenience, a script is included to collect plot files from the result directories (bin/

License = GPLv3+

Copyright 2021 Edinburgh Genome Foundry, University of Edinburgh

Sequeduct was written at the Edinburgh Genome Foundry by Peter Vegh, and is released under the GPLv3 license.