GEMmaker is a Nextflow workflow for large-scale gene expression sample processing, expression-level quantification, and Gene Expression Matrix (GEM) construction. Results from GEMmaker are useful for differential gene expression (DGE) and gene co-expression network (GCN) analyses. The GEMmaker workflow currently supports Illumina RNA-seq datasets.
Please see the GEMmaker documentation for in-depth instructions for running GEMmaker.
GEMmaker (i.e., systemsgenetics/gemmaker) is a pipeline for quantification of Illumina RNA-seq data. Users can choose from HISAT2, STAR, Kallisto, or Salmon. It can process locally stored FASTQ files or automatically retrieve them from NCBI's SRA. The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It comes with Docker containers, which simplify installation and make results highly reproducible.
GEMmaker is an nf-core compatible workflow, but it is not an official nf-core workflow. This is because nf-core already offers nf-core/rnaseq, an excellent workflow for RNA-seq analysis that provides similar functionality to GEMmaker. GEMmaker differs in that it can scale to thousands of samples without exceeding local storage resources by running samples in batches and removing intermediate files. It can do the same for smaller sample sets on machines with fewer computational resources. This ability to scale is a unique adaptation that is currently not provided by Nextflow. When Nextflow does provide support for batching and scaling, nf-core/rnaseq will be updated and GEMmaker will probably be retired in favor of the nf-core workflow. Until then, if you are limited by storage, GEMmaker can help!
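The batching idea described above can be illustrated with a minimal Python sketch. This is not GEMmaker's implementation (GEMmaker is a Nextflow workflow); it only shows why processing samples in fixed-size batches and deleting intermediate files after each batch keeps peak storage bounded by the batch size rather than by the total number of samples. The function names (`batches`, `process_sample`, `run_in_batches`), the placeholder sample names, and the fake "expression value" are all hypothetical and exist only for illustration.

```python
import tempfile
from pathlib import Path


def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_sample(sample_id, work_dir):
    """Hypothetical stand-in for retrieving and quantifying one sample.

    Writes a placeholder intermediate file and returns its path together
    with a made-up result value (not a real expression level).
    """
    intermediate = work_dir / f"{sample_id}.fastq"
    intermediate.write_text("placeholder reads\n")
    result = float(len(sample_id))  # fake value, for illustration only
    return intermediate, result


def run_in_batches(sample_ids, batch_size, work_dir):
    """Process samples batch by batch, deleting intermediates after each batch.

    Returns the collected results and the largest number of intermediate
    files that existed in `work_dir` at any one time.
    """
    results = {}
    peak_files = 0
    for batch in batches(sample_ids, batch_size):
        created = []
        for sample_id in batch:
            path, value = process_sample(sample_id, work_dir)
            created.append(path)
            results[sample_id] = value
        # Record how many intermediate files are on disk right now.
        peak_files = max(peak_files, len(list(work_dir.iterdir())))
        # Free storage before the next batch starts.
        for path in created:
            path.unlink()
    return results, peak_files


if __name__ == "__main__":
    samples = [f"sample_{i}" for i in range(10)]  # placeholder names
    with tempfile.TemporaryDirectory() as tmp:
        results, peak = run_in_batches(samples, batch_size=3, work_dir=Path(tmp))
        print(f"processed {len(results)} samples, peak intermediate files: {peak}")
```

In this sketch, ten samples processed in batches of three never leave more than three intermediate files on disk at once, while the small per-sample results are retained for building the final matrix.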
Please see the list of developers who have contributed to this repository.
Development of GEMmaker was funded by the U.S. National Science Foundation Award #1659300.
If you use GEMmaker in your research, please use this citation:
Hadish, J. A., Biggs, T. D., Shealy, B. T., Bender, M. R., McKnight, C. B., Wytko, C., Smith, M. C., Feltus, F. A., Honaas, L., & Ficklin, S. P. (2022). GEMmaker: process massive RNA-seq datasets on heterogeneous computational infrastructure. BMC Bioinformatics, 23(1), 1–11.
If you would like to contribute to this pipeline, please see the contributing guidelines.
Please follow the instructions in the 'Online Documentation'.