Dear colleagues,
I am re-running a WGBS pipeline to see how well it can be replicated with the partial code I have.
I am not stuck, but I want the code to be more efficient and not wait for me to manually start the next sample (pair) after one sample has been aligned. So I use the following script:
echo "Bismark aligning"
input_files_1=""
input_files_2=""
# Collect all trimmed R1 files into one comma-separated string
for file in fastq/trim/*_R1_001_val_1.fq.gz; do
    input_files_1+="${file},"
done
# Collect all trimmed R2 files the same way
for file in fastq/trim/*_R2_001_val_2.fq.gz; do
    input_files_2+="${file},"
done
input_files_1=${input_files_1%,} # Remove the trailing comma
input_files_2=${input_files_2%,}
bismark --genome ~/bioinformatics/ref_genomes/mouse_38/genome \
    -1 "${input_files_1}" -2 "${input_files_2}" \
    -o BAM/prededuplicate/ --temp_dir BAM/ \
    --parallel 3 -q --score_min L,0,-0.2 --maxins 500
The input_files_1 variable then holds the following file names (comma-separated, as requested in bismark --help):
fastq/trim/Ctrl-1_R1_001_val_1.fq.gz,fastq/trim/Ctrl-2_R1_001_val_1.fq.gz,fastq/trim/F1-1_R1_001_val_1.fq.gz,fastq/trim/F1-2_R1_001_val_1.fq.gz
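For reference, the same comma-separated lists could also be built with bash arrays. This is only a minimal sketch: `join_by_comma` is a hypothetical helper name (not part of Bismark), and the count check is an extra safeguard I am assuming would be useful, so that a missing R2 file is noticed before the alignment starts.

```shell
#!/usr/bin/env bash
# join_by_comma is a hypothetical helper: it joins its arguments with commas.
join_by_comma() {
    local IFS=,
    echo "$*"
}

# With nullglob, a glob that matches nothing expands to an empty list
# instead of staying as the literal pattern.
shopt -s nullglob
r1_files=(fastq/trim/*_R1_001_val_1.fq.gz)
r2_files=(fastq/trim/*_R2_001_val_2.fq.gz)

# Warn if the two lists cannot be paired one-to-one.
if [ "${#r1_files[@]}" -ne "${#r2_files[@]}" ]; then
    echo "Warning: number of R1 and R2 files differs" >&2
fi

input_files_1=$(join_by_comma "${r1_files[@]}")
input_files_2=$(join_by_comma "${r2_files[@]}")
echo "R1: ${input_files_1}"
echo "R2: ${input_files_2}"
```

Because both globs expand in sorted order, the R1 and R2 lists should line up as long as every sample has both files.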
According to the first lines printed after starting the alignment, everything seems to be fine, as all FASTQ files were detected:
Input files to be analysed (in current folder '/home/chuddy/bioinformatics/lamarck-project'):
fastq/trim/Ctrl-1_R1_001_val_1.fq.gz
fastq/trim/Ctrl-1_R2_001_val_2.fq.gz
fastq/trim/Ctrl-2_R1_001_val_1.fq.gz
fastq/trim/Ctrl-2_R2_001_val_2.fq.gz
fastq/trim/F1-1_R1_001_val_1.fq.gz
fastq/trim/F1-1_R2_001_val_2.fq.gz
fastq/trim/F1-2_R1_001_val_1.fq.gz
fastq/trim/F1-2_R2_001_val_2.fq.gz
Library is assumed to be strand-specific (directional), alignments to strands complementary to the original top or bottom strands will be ignored (i.e. not performed!)
After 887 minutes of running time, I received a BAM file, which looked okay, also judging by the detection of C in CpG context, etc.
What did I do wrong? Normally, the alignment of the second sample should start immediately after the first has finished.
Since 887 minutes is a long time, I also wonder how I can speed things up.
I find it difficult to estimate what my mobile workstation is capable of. I used --parallel 3 to be on the safe side, although I have 24 CPUs and approx. 62 GB of memory. I am working with the mouse genome (mm10, from Ensembl).
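One alternative I could imagine is calling Bismark once per sample pair in a loop, so that the next pair starts automatically when the previous one finishes. The following is an untested sketch under assumptions: `mate_of` is a hypothetical helper that relies on the file naming shown above, the options are copied unchanged from my command, and the command is only printed (dry run) rather than executed.

```shell
#!/usr/bin/env bash
# mate_of is a hypothetical helper: it derives the R2 file name from an R1
# file name, assuming the *_R1_001_val_1 / *_R2_001_val_2 naming used here.
mate_of() {
    local r1=$1
    echo "${r1/_R1_001_val_1/_R2_001_val_2}"
}

# With nullglob, the loop body is skipped if no R1 files are found.
shopt -s nullglob
for r1 in fastq/trim/*_R1_001_val_1.fq.gz; do
    r2=$(mate_of "$r1")
    # Dry run: this only prints the command. Remove "echo" to start Bismark.
    echo bismark --genome ~/bioinformatics/ref_genomes/mouse_38/genome \
        -1 "$r1" -2 "$r2" \
        -o BAM/prededuplicate/ --temp_dir BAM/ \
        --parallel 3 -q --score_min L,0,-0.2 --maxins 500
done
```

Whether this is faster than passing all files in one call is something I have not measured.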
Bismark version: v0.24.1
Bowtie 2 version: 2.5.1
If anything else is needed to help me, please tell me and I will happily provide it.
Best
Tom