LInk mANagemeNt systEm
This application manages the bioinformatics workflow for the "Programma di profilazione genomica dei tumori FPG500" (FPG500 tumor genomic profiling program).
It is designed for the analysis of the Illumina TruSight Oncology 500 panel sequenced on the Illumina NovaSeq 6000 sequencer.
Variant calling is performed with the Illumina Local App, and the results are uploaded to the PierianDX Clinical Genomics Workspace (CGW) for variant interpretation. These steps require an agreement with Illumina for use of the Local App and a licensed PierianDX account.
Lianne can also be used for coverage analysis alone, by running only the coverage analysis jobs.
All sh scripts are designed for the PBS scheduler installed on the CentOS operating system. If PBS is not available on your system, the scripts can still be used after deleting the PBS parameters section.
Lianne runs in a Conda environment. To install Conda, follow the official Conda installation instructions.
The conda packages to install are listed in conda_packages_list.txt
It also uses Singularity v3.7.4 for the Illumina Local App.
The coverage analysis uses VarHound, available on GitHub at:
https://github.com/fernandoPalluzzi/VarHound
Set the CONF variable at the beginning of the lianne.py file to the absolute path of the confPath.ini file.
CONF = "/absolutePath/confPath.ini"
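To illustrate how a configuration file referenced by CONF could be consumed, the sketch below loads an INI file with Python's standard configparser module. The function name `load_conf` and the section and key names in the usage note are hypothetical; they are not taken from Lianne's actual confPath.ini.

```python
import configparser
from pathlib import Path


def load_conf(conf_path):
    """Read an INI file and return its content as a dict of dicts.

    Hypothetical helper: it only shows one way to enforce the
    "absolute path" requirement and to parse the file.
    """
    conf_file = Path(conf_path)
    if not conf_file.is_absolute():
        raise ValueError(f"CONF must be an absolute path, got: {conf_path}")
    parser = configparser.ConfigParser()
    # read_file raises if the file is missing, instead of silently skipping it
    with conf_file.open() as handle:
        parser.read_file(handle)
    return {section: dict(parser[section]) for section in parser.sections()}
```

With a file containing a hypothetical `[paths]` section and a `tmp_folder` key, `load_conf(CONF)["paths"]["tmp_folder"]` would return the configured folder.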
usage: lianne.py [-h] -i RUNINPUT [-l1 SELECT] [-l2 NCPUS] [-l3 MEM]
                 [-e EMAIL] [-m SENDMODE] [-N NAME] [-q QUEUE]

Link Management System - Lianne

optional arguments:
  -h, --help            show this help message and exit
  -i RUNINPUT, --runInput RUNINPUT
                        NovaSeq output sequencing path
  -l_select SELECT, --select SELECT
                        Select the number of chunks to send on PBS cluser -
                        Default=1
  -l_ncpus NCPUS, --ncpus NCPUS
                        Select the number of ncpus to require - Default=24
  -l_mem MEM, --mem MEM
                        Select the amount of memory to require - Default=128
  -e EMAIL, --email EMAIL
                        Insert the email -
                        [email protected]
  -m SENDMODE, --sendMode SENDMODE
                        Insert the sending email mode - Default=ae
  -N NAME, --name NAME  Insert the job name - Default=lianne
  -q QUEUE, --queue QUEUE
                        Insert the queue to send job - Default=workq
  -d, --debug           Run the script in debug mode No jobs will be send No
                        file will be written - Default=False
  -f, --fastqc          Perform FastQC analysis on fastq files - Default=False
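The interface above can be approximated with Python's argparse module. The following is a minimal sketch, not Lianne's actual source. The short flags follow the option list (`-l_select`, `-l_ncpus`, `-l_mem`), while the usage line shows `-l1`, `-l2`, `-l3`, so the spelling accepted by the real script is an assumption. The email default is not recoverable from the help text and is set to None here.

```python
import argparse


def build_parser():
    """Rebuild the documented command-line interface (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        prog="lianne.py", description="Link Management System - Lianne"
    )
    parser.add_argument("-i", "--runInput", required=True,
                        help="NovaSeq output sequencing path")
    parser.add_argument("-l_select", "--select", type=int, default=1,
                        help="Number of chunks to request on the PBS cluster")
    parser.add_argument("-l_ncpus", "--ncpus", type=int, default=24,
                        help="Number of CPUs to request")
    parser.add_argument("-l_mem", "--mem", type=int, default=128,
                        help="Amount of memory to request")
    # Assumption: the real default address is unknown, so None is used
    parser.add_argument("-e", "--email", default=None,
                        help="Email address for PBS notifications")
    parser.add_argument("-m", "--sendMode", default="ae",
                        help="PBS mail mode")
    parser.add_argument("-N", "--name", default="lianne", help="Job name")
    parser.add_argument("-q", "--queue", default="workq", help="PBS queue")
    parser.add_argument("-d", "--debug", action="store_true",
                        help="Debug mode: no jobs sent, no files written")
    parser.add_argument("-f", "--fastqc", action="store_true",
                        help="Perform FastQC analysis on fastq files")
    return parser
```

Only `-i` is required, so `build_parser().parse_args(["-i", "/path/to/run"])` yields the documented defaults for every other option.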
When Lianne is started with the default parameters, the script needs only the path to the sequencer output folder.
The demultiplexing results are stored in a temporary folder named analysis_runName, where runName is the name of the sequencing output folder.
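The naming rule for the temporary folder can be sketched as follows. The function name `analysis_folder` and the `tmp_root` parameter are hypothetical; in Lianne the temporary location presumably comes from the configuration file.

```python
from pathlib import PurePosixPath


def analysis_folder(run_input, tmp_root):
    """Return tmp_root/analysis_<runName> for a sequencer output path."""
    # .name ignores a trailing slash, so "/x/runName/" still gives "runName"
    run_name = PurePosixPath(run_input).name
    return PurePosixPath(tmp_root) / f"analysis_{run_name}"
```

For example, a run folder ending in `220314_A01423_0026_AHGGJNDRXY` with `tmp_root="/yourTmp"` gives `/yourTmp/analysis_220314_A01423_0026_AHGGJNDRXY`.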
Lianne performs:
- A check that the samplesheet exists. If it does not exist, Lianne returns a warning and exits. If the samplesheet has a different name, Lianne copies it into the temporary folder as SampleSheet.csv.
- Submission of the demultiplexing job using the Illumina TruSight Oncology 500 Local App, with the following command line:
#! /bin/bash
#PBS -o /temporaryFolder/analysis_runName/stdout_demultiplex
#PBS -e /temporaryFolder/analysis_runName/stderr_demultiplex
#PBS -l select=1:ncpus=24:mem=128g
#PBS -M [email protected]
#PBS -m ae
#PBS -N lianne_demultiplex
#PBS -q workq
module load singularity/3.7.4
module load openmpi/4.1.1
cd /apps/trusight/2.2.0
./TruSight_Oncology_500_RUO.sh \
--analysisFolder /temporaryFolder/analysis_runName/runName \
--resourcesFolder /apps/trusight/2.2.0/resources \
--runFolder /data/novaseq/Diagnostic/NovaSeq/SequencerOutput/runName \
--engine singularity \
--sampleSheet /temporaryFolder/analysis_runName/SampleSheet.csv \
--isNovaSeq \
--demultiplexOnly
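The samplesheet check that precedes the demultiplexing job could look like the sketch below. How Lianne recognises a samplesheet with a different name is not documented here, so this example assumes that any CSV file in the run folder qualifies. The function name `stage_samplesheet` is hypothetical.

```python
import shutil
import warnings
from pathlib import Path


def stage_samplesheet(run_folder, tmp_folder):
    """Copy the run's samplesheet to tmp_folder/SampleSheet.csv.

    Returns the staged path, or None (after a warning) if no samplesheet
    is found. Assumption: any *.csv file in run_folder is a samplesheet.
    """
    run_folder, tmp_folder = Path(run_folder), Path(tmp_folder)
    candidates = sorted(run_folder.glob("*.csv"))
    if not candidates:
        warnings.warn(f"No samplesheet found in {run_folder}")
        return None
    # Prefer the canonical file name when several CSV files are present
    source = next(
        (c for c in candidates if c.name == "SampleSheet.csv"), candidates[0]
    )
    tmp_folder.mkdir(parents=True, exist_ok=True)
    target = tmp_folder / "SampleSheet.csv"
    shutil.copyfile(source, target)
    return target
```

A caller would stop the workflow when the function returns None, which mirrors the "warning and exit" behaviour described above.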
The Illumina TruSight Oncology 500 Local App v2.2.0 is used for the local variant calling analysis.
Lianne submits the Local App to the queue using an sh script containing the following command line and PBS parameters:
#! /bin/bash
#PBS -o /data/novaseq_results/runName/stdout_LocalApp
#PBS -e /data/novaseq_results/runName/stderr_LocalApp
#PBS -l select=2:ncpus=24:mem=128g
#PBS -M [email protected]
#PBS -m ae
#PBS -N lianne_LocalApp
#PBS -q workq
module load singularity/3.7.4
module load openmpi/4.1.1
cd /apps/trusight/2.2.0
./TruSight_Oncology_500_RUO.sh \
--analysisFolder /data/novaseq_results/runName \
--resourcesFolder /apps/trusight/2.2.0/resources \
--runFolder /data/novaseq/Diagnostic/NovaSeq/SequencerOutput/runName \
--engine singularity \
--sampleSheet /temporaryFolder/analysis_runName/SampleSheet.csv \
--isNovaSeq
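The PBS directives at the top of these scripts map directly onto the command-line options of lianne.py. A minimal sketch of how such a header could be rendered is shown here. The function name `pbs_header` is hypothetical, and the `g` memory suffix is inferred from the scripts in this document rather than from Lianne's source.

```python
def pbs_header(log_dir, step, *, select=1, ncpus=24, mem=128,
               email=None, send_mode="ae", name="lianne", queue="workq"):
    """Render the shebang and #PBS lines for one workflow step."""
    lines = [
        "#! /bin/bash",
        f"#PBS -o {log_dir}/stdout_{step}",
        f"#PBS -e {log_dir}/stderr_{step}",
        f"#PBS -l select={select}:ncpus={ncpus}:mem={mem}g",
    ]
    # Mail directives are only emitted when an address is given
    if email:
        lines += [f"#PBS -M {email}", f"#PBS -m {send_mode}"]
    lines += [f"#PBS -N {name}_{step}", f"#PBS -q {queue}"]
    return "\n".join(lines) + "\n"
```

Calling `pbs_header("/data/novaseq_results/runName", "LocalApp", select=2)` reproduces the resource and job-name lines of the Local App script above, without the mail directives.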
The upload to CGW requires a PierianDX account.
#! /bin/bash
#PBS -o /yourTmp/analysis_220314_A01423_0026_AHGGJNDRXY/stdout_cgwUpload
#PBS -e /yourTmp/analysis_220314_A01423_0026_AHGGJNDRXY/stderr_cgwUpload
#PBS -l select=1:ncpus=5:mem=80g
#PBS -M [email protected]
#PBS -m ae
#PBS -N lianne_cgwUpload
#PBS -q workq
module load corretto/8.292.10.1
cd /PathTo/CGWRunUploader/
java -jar -Dloader.main=com.pdx.commandLine.ApplicationCommandLine RunUploader-1.13.jar --commandLine --runFolder=/yourTmp/analysis_220314_A01423_0026_AHGGJNDRXY/220314_A01423_0026_AHGGJNDRXY --sequencer=Illumina --sequencerFileType=fastq
The FastQC step expects the fastq files in:
/output_analysis/Logs_Intermediates/FastqGeneration/Sample_ID/sampleID.fastq.gz
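Given that layout, the fastq files of a run can be collected per sample with a glob. This is an illustrative sketch only; `fastq_by_sample` is a hypothetical helper and is not part of Lmodules/fastqc.py.

```python
from collections import defaultdict
from pathlib import Path


def fastq_by_sample(output_analysis):
    """Group *.fastq.gz files by their Sample_ID folder."""
    root = Path(output_analysis) / "Logs_Intermediates" / "FastqGeneration"
    grouped = defaultdict(list)
    # One sub-folder per sample, fastq files directly inside it
    for fastq in sorted(root.glob("*/*.fastq.gz")):
        grouped[fastq.parent.name].append(fastq)
    return dict(grouped)
```

An empty result indicates that the expected folder structure is missing, which is worth checking before the FastQC job is submitted.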
#! /bin/bash
#PBS -o /yourTmp/analysis_220314_A01423_0026_AHGGJNDRXY/stdout_FastQC
#PBS -e /yourTmp/analysis_220314_A01423_0026_AHGGJNDRXY/stderr_FastQC
#PBS -l select=1:ncpus=10:mem=20g
#PBS -M [email protected]
#PBS -m ae
#PBS -N lianne_FastQC
#PBS -q workq
module load anaconda/3
conda init bash
source ~/.bashrc
conda activate /PathTo/condaEnv/lianne
/PathTo/lianne/Lmodules/fastqc.py -f /yourResults/220314_A01423_0026_AHGGJNDRXY/Logs_Intermediates/FastqGeneration/*/*.fastq.gz
The coverage analysis expects the BAM files in the following locations:
snv: /output_analysis/Logs_Intermediates/StitchedRealigned/Sample_ID/Sample_ID.bam
cnv: /output_analysis/Logs_Intermediates/DnaRealignment/Sample_ID/Sample_ID.bam
rna: /output_analysis/Logs_Intermediates/RnaMarkDuplicate/Sample_ID/Sample_ID.bam
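The mapping between analysis type and BAM location can be written as a small lookup. The names `BAM_SUBFOLDERS` and `bam_path` are hypothetical; only the folder names come from the paths listed above.

```python
from pathlib import PurePosixPath

# Sub-folder of Logs_Intermediates that holds the BAM for each analysis type
BAM_SUBFOLDERS = {
    "snv": "StitchedRealigned",
    "cnv": "DnaRealignment",
    "rna": "RnaMarkDuplicate",
}


def bam_path(output_analysis, panel, sample_id):
    """Build the expected BAM path for one sample and analysis type."""
    if panel not in BAM_SUBFOLDERS:
        raise ValueError(f"Unknown analysis type: {panel}")
    return (
        PurePosixPath(output_analysis)
        / "Logs_Intermediates"
        / BAM_SUBFOLDERS[panel]
        / sample_id
        / f"{sample_id}.bam"
    )
```

The three keys match the values passed to the `-p` option of cvLaunch.py (snv, rna, cnv).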
The coverage analysis is performed using the mosdepth package:
https://github.com/brentp/mosdepth
#! /bin/bash
#PBS -o /yourResults/220314_A01423_0026_AHGGJNDRXY/stdout_cvLaunch
#PBS -e /yourResults/220314_A01423_0026_AHGGJNDRXY/stderr_cvLaunch
#PBS -l select=1:ncpus=1:mem=1g
#PBS -M [email protected]
#PBS -m ae
#PBS -N lianne_cvLaunch
#PBS -q workq
cd /PathTo/lianne/
python3 Lmodules/cvLaunch.py -o /yourResults/220314_A01423_0026_AHGGJNDRXY -e [email protected] -p snv
python3 Lmodules/cvLaunch.py -o /yourResults/220314_A01423_0026_AHGGJNDRXY -e [email protected] -p rna
python3 Lmodules/cvLaunch.py -o /yourResults/220314_A01423_0026_AHGGJNDRXY -e [email protected] -p cnv
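How cvLaunch.py invokes mosdepth is not shown in this document. As a rough illustration, a per-sample mosdepth call restricted to target regions could be assembled as below. The function name `mosdepth_command`, the BED file, and the thread count are assumptions; `--by`, `--threads` and `--no-per-base` are standard mosdepth options.

```python
def mosdepth_command(bam, prefix, bed, threads=4):
    """Build (but do not run) a mosdepth command line for one BAM file."""
    return [
        "mosdepth",
        "--by", str(bed),           # report coverage per region of the BED file
        "--threads", str(threads),  # BAM decompression threads
        "--no-per-base",            # skip the large per-base output
        str(prefix),                # prefix for the output files
        str(bam),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` inside the Conda environment in which mosdepth is installed.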
This analysis is performed using R
module load anaconda/3
conda init bash
source ~/.bashrc
conda activate /PathTo/condaEnv/lianne
cd /yourResults/220314_A01423_0026_AHGGJNDRXY
cd /PathTo/lianne/
python3 VarHound/vhLaunch.py /yourResults/220314_A01423_0026_AHGGJNDRXY/snv_coverage
python3 VarHound/vhLaunch.py /yourResults/220314_A01423_0026_AHGGJNDRXY/rna_coverage
python3 VarHound/vhLaunch.py /yourResults/220314_A01423_0026_AHGGJNDRXY/cnv_coverage