Highlight further documentation for users
Also explain origin of name.
veghp committed Dec 6, 2023
1 parent 5f79e6f commit c000854
Showing 1 changed file with 3 additions and 4 deletions.
7 changes: 3 additions & 4 deletions README.md
@@ -6,7 +6,9 @@

![version](https://img.shields.io/badge/current_version-0.3.1-blue)

-Sequencing analysis pipeline (aqueduct) for validating plasmids and DNA assembly constructs, using long reads.
+Sequeduct (**seque**ncing aque**duct**) is a long read sequencing data analysis pipeline for validating plasmids and DNA assembly constructs.
+
+An example analysis and demonstration data are available at the [Sequeduct demo](https://github.com/Edinburgh-Genome-Foundry/Sequeduct_demo) site.

## Usage

@@ -58,7 +60,6 @@ docker pull ghcr.io/edinburgh-genome-foundry/sequeduct:v0.3.1

Use `-profile docker` to use this image in the below commands, instead of `-with-docker sequeduct_local`.

-
### Run

Create a directory for your project and copy (or link) the FASTQ directories from your Nanopore run (e.g. `fastq_pass`). Specify this together with a sample sheet in your commands:
@@ -92,7 +93,6 @@ The above commands each output a directory within a created `results` directory.

A more detailed example and demonstration data are available at the [Sequeduct demo](https://github.com/Edinburgh-Genome-Foundry/Sequeduct_demo) site.

-
### Details

For simplicity, the names in the sample sheet are used for finding the reference Genbank files, therefore sample names must match filenames with a ".gb" extension.
@@ -105,7 +105,6 @@ For convenience, a script is included to collect plot files from the result dire

The pipeline was designed to work with data from one or more barcodes (FASTQ subdirectories). It has been tested on a desktop machine running Ubuntu 20.04.6 LTS (Memory: 15.5 GiB; CPU: Intel® Core™ i5-6500 CPU @ 3.20GHz × 4), and confirmed to work with up to 96 barcodes. The largest tested dataset was 1.5 GB Nanopore FASTQ data, resulting in 1.1 GB filtered data (100k filtered reads) with up to 55 MB individual filtered FASTQ files (i.e. per sample). If the dataset is much larger, then it may return an error at the variant call or another step. A recommended solution is to increase the quality cutoff (with parameter `--quality_cutoff`), and optionally the minimum length cutoff (`--min_length`), to work with fewer but better reads.

-
## License = GPLv3+

Copyright 2021 Edinburgh Genome Foundry, University of Edinburgh
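
The Details text in this diff states that sample names from the sample sheet must match reference filenames with a ".gb" extension. A minimal Python sketch of a pre-run check for that rule is given here. The function name `missing_references`, passing the sample names as a plain list, and keeping all references in one flat directory are illustrative assumptions, not part of Sequeduct.

```python
from pathlib import Path
from typing import Iterable, List


def missing_references(sample_names: Iterable[str], reference_dir: Path) -> List[str]:
    """Return the sample names that have no '<name>.gb' file in reference_dir.

    Hypothetical helper: how the names are read from the sample sheet is
    left to the caller, because the sheet layout is not shown in this diff.
    """
    return [
        name
        for name in sample_names
        if not (reference_dir / f"{name}.gb").is_file()
    ]


if __name__ == "__main__":
    import tempfile

    # Demonstration with a temporary directory holding one reference file.
    with tempfile.TemporaryDirectory() as tmp:
        ref_dir = Path(tmp)
        (ref_dir / "plasmid_A.gb").touch()
        print(missing_references(["plasmid_A", "plasmid_B"], ref_dir))
```

An empty returned list would suggest that every sample name has a matching Genbank file; any names returned point to a reference that is missing or named differently from the sample.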
