Novel basecaller that jointly processes raw and event data from ONT nanopore sequencing, based on an encoder-decoder architecture with attention

adamnapieralski/ravvent-basecaller

ravvent-basecaller

Ravvent is a basecaller that processes raw signal and event data jointly as a sequence-to-sequence task. It uses an encoder-decoder architecture with an attention mechanism, with LSTMs as the recurrent layers.
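To illustrate what the attention step in an encoder-decoder model does, here is a minimal pure-Python sketch. It is not taken from basecaller.py: the function names (`softmax`, `attend`) are hypothetical, and dot-product scoring is an assumption, since this README does not say which attention variant Ravvent uses. At each decoding step the decoder state is scored against every encoder output, the scores are normalized into weights, and the weighted sum of encoder outputs becomes the context vector.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_outputs):
    """Dot-product attention (assumed variant, for illustration only).

    decoder_state:   list of floats, length d
    encoder_outputs: list of T lists of floats, each of length d
    Returns (context, weights): context has length d, weights has length T.
    """
    # Score each encoder time step against the current decoder state.
    scores = [sum(q * h for q, h in zip(decoder_state, out))
              for out in encoder_outputs]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of encoder outputs.
    dim = len(encoder_outputs[0])
    context = [sum(w * out[i] for w, out in zip(weights, encoder_outputs))
               for i in range(dim)]
    return context, weights


if __name__ == "__main__":
    encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    decoder_state = [2.0, 0.0]
    context, weights = attend(decoder_state, encoder_outputs)
    print("weights:", weights)
    print("context:", context)
```

In a Keras model the same computation would typically run on batched tensors inside the decoder, with the context vector concatenated to the decoder LSTM input or output before predicting the next base.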

Structure

  • data_loader.py - contains the DataGenerator class, which loads simulated and real data and preprocesses it into batches that the model can consume
  • basecaller.py - contains the Encoder and Decoder classes and the general Basecaller class, which combines all layers into a single Keras Model that learns and performs the basecalling task
  • utils.py - contains various utility functions
  • ravvent.py - sample script for running the model training pipeline
  • ravvent_mapping_evaluator.py, ravvent_performance_evaluator.py - scripts for evaluation (read accuracy, speed)
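As a rough illustration of what a read accuracy metric measures, the sketch below scores a basecalled sequence against a reference using edit distance. This is an assumption made for illustration, not the method used by the evaluator scripts: the name ravvent_mapping_evaluator.py suggests they rely on mapping reads to a reference, and the functions `edit_distance` and `read_accuracy` are hypothetical. The normalization (one minus the distance divided by the longer sequence length) is one common convention among several.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def read_accuracy(called, reference):
    """Accuracy in [0, 1]: 1 - edit distance / length of the longer sequence."""
    longest = max(len(called), len(reference))
    if longest == 0:
        return 1.0  # two empty sequences are trivially identical
    return 1.0 - edit_distance(called, reference) / longest


if __name__ == "__main__":
    # One base missing from the call relative to the reference.
    print(read_accuracy("AGT", "ACGT"))
```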

Data availability

Simulated

Simulated datasets were generated with the DeepSimulator tool. The script used for this purpose is generate_simulator_reduced.py, which contains the parameters used to run DeepSimulator and event_detection. FASTA files are stored in the data/simulator/reduced directory.

Real

The real data comes from the supporting data for the Chiron basecaller, which is available here.

Environment

Prerequisites
