R error during counting step before demultiplexing. #393

dfmoralesb · 2024-04-16T14:07:12Z

Error in R during counting step

I'm trying to run zUMIs to demultiplex my data. Everything runs fine until it tries to split the files after running subread in the "counting" step.

Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'strsplit': object 'GE' not found

The "filtering" step ran without any issue, so below are the YAML and the output from when I tried to run it again from the "mapping" step.

Thanks for any advice or suggestions!

YAML:

###########################################
#Welcome to zUMIs
#below, please fill the mandatory inputs
#We expect full paths for all files.
###########################################

#define a project name that will be used to name output files
project: Sample_23L001690

#Sequencing File Inputs:
sequence_files:
  file1:
    name: /home/morales/julia/Project_633_Eckstein_Werth_Silke/Sample_23L001690/23L001690_S1_L001_R1_001.fastq.gz
    base_definition:
      - BC(1-12)
      - UMI(13-28)
  file2:
    name: /home/morales/julia/Project_633_Eckstein_Werth_Silke/Sample_23L001690/23L001690_S1_L001_R2_001.fastq.gz
    base_definition:
      - cDNA(1-94)

#reference genome setup
reference:
  STAR_index: /home/morales/julia/Project_633_Eckstein_Werth_Silke/Lobpul1/star_index_exons_only
  GTF_file: /home/morales/julia/Project_633_Eckstein_Werth_Silke/Lobpul1/Lobpul1_GeneCatalog_20170213.only_exons.gft
  exon_extension: no
  extension_length: 0
  scaffold_length_min: 0
  additional_files: ~
  additional_STAR_params: ~

#output directory
out_dir: /home/morales/julia/Project_633_Eckstein_Werth_Silke/Sample_23L001690_zumi

###########################################
#below, you may optionally change default parameters
###########################################

#number of processors to use
num_threads: 100
mem_limit: 500

#barcode & UMI filtering options
#number of bases under the base quality cutoff that should be filtered out.
#Phred score base-cutoff for quality control.
filter_cutoffs:
  BC_filter:
    num_bases: 1
    phred: 20
  UMI_filter:
    num_bases: 1
    phred: 20

#Options for Barcode handling
barcodes:
  barcode_num: 10
  barcode_file: /home/morales/julia/Project_633_Eckstein_Werth_Silke/bc_batch1_zumi_10.txt
  barcode_sharing: null
  automatic: no
  BarcodeBinning: 1
  nReadsperCell: 100
  demultiplex: yes

#Options related to counting of reads towards expression profiles
counting_opts:
  introns: yes
  intronProb: no
  downsampling: 0
  strand: 0
  Ham_Dist: 0
  velocyto: no
  primaryHit: yes
  multi_overlap: no
  fraction_overlap: 0
  twoPass: yes
  #write_ham: yes

#produce stats files and plots?
make_stats: yes

#Start zUMIs from stage. Possible TEXT(Filtering, Mapping, Counting, Summarising). Default: Filtering.
which_Stage: Mapping

#define dependencies program paths
samtools_exec: samtools #samtools executable
Rscript_exec: Rscript #Rscript executable
STAR_exec: STAR #STAR executable
pigz_exec: pigz #pigz executable

#below, fqfilter will add a read_layout flag defining SE or PE
zUMIs_directory: /home/morales/Apps/zUMIs
read_layout: SE
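
For readers unfamiliar with the `base_definition` entries in the YAML above, here is a minimal Python sketch of how they can be read. It assumes the ranges are 1-based and inclusive, so `BC(1-12)` covers the first 12 bases of read 1 and `UMI(13-28)` the next 16. The helper names `parse_range` and `extract` and the example read are hypothetical and are not part of zUMIs.

```python
import re


def parse_range(definition):
    """Parse a definition such as 'BC(1-12)' into ('BC', 1, 12)."""
    match = re.fullmatch(r"(\w+)\((\d+)-(\d+)\)", definition)
    if match is None:
        raise ValueError(f"unrecognised base definition: {definition!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def extract(read, definition):
    """Return (name, bases) for one definition.

    Assumption: positions are 1-based and inclusive.
    """
    name, start, end = parse_range(definition)
    return name, read[start - 1:end]


# Hypothetical 28-base read 1: 12 barcode bases followed by 16 UMI bases.
example_read = "A" * 12 + "C" * 16
for definition in ("BC(1-12)", "UMI(13-28)"):
    print(extract(example_read, definition))
```

Under this reading, read 1 must be at least 28 bases long and read 2 at least 94 bases long for the definitions above to be satisfiable.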

Standard output:

You provided these parameters:
YAML file: zUMIs.yaml
zUMIs directory: /home/morales/Apps/zUMIs
STAR executable STAR
samtools executable samtools
pigz executable pigz
Rscript executable Rscript
RAM limit: 500
zUMIs version 2.9.7e

Sat 13 Apr 2024 05:26:31 PM CEST
WARNING: The STAR version used for mapping is 2.7.11b and the STAR index was created using the version 2.7.4a. This may lead to an error while mapping. If you encounter any errors at the mapping stage, please make sure to create the STAR index using STAR 2.7.11b.
Mapping...
[1] "2024-04-13 17:26:32 CEST"
Warning message:
NAs introduced by coercion
STAR --readFilesCommand samtools view -@ 2 --outSAMmultNmax 1 --outFilterMultimapNmax 50 --outSAMunmapped Within --outSAMtype BAM Unsorted --quantMode TranscriptomeSAM --genomeDir /home/morales/julia/Project_633_Eckstein_Werth_Silke/Lobpul1/star_index_exons_only --sjdbGTFfile /home/morales/julia/Project_633_Eckstein_Werth_Silke/Lobpul1/Lobpul1_GeneCatalog_20170213.only_exons.gft --runThreadN 98 --sjdbOverhang 93 --readFilesType SAM SE --twopassMode Basic --readFilesIn /home/morales/julia/Project_633_Eckstein_Werth_Silke/Sample_23L001690_zumi/Sample_23L001690.filtered.tagged.unmapped.bam --outFileNamePrefix /home/morales/julia/Project_633_Eckstein_Werth_Silke/Sample_23L001690_zumi/Sample_23L001690.filtered.tagged.
STAR version: 2.7.11b compiled: 2024-02-23T15:55:51+01:00 :/home/morales/Apps/STAR-2.7.11b/source
Apr 13 17:26:32 ..... started STAR run
Apr 13 17:26:32 ..... loading genome
Apr 13 17:26:33 ..... processing annotations GTF
Apr 13 17:26:33 ..... started 1st pass mapping
Apr 13 17:51:07 ..... finished 1st pass mapping
Apr 13 17:51:07 ..... inserting junctions into the genome indices
Apr 13 17:51:23 ..... started mapping
Apr 13 18:44:32 ..... finished mapping
Apr 13 18:44:33 ..... finished successfully
Sat 13 Apr 2024 06:44:35 PM CEST
Counting...
[1] "2024-04-13 18:44:44 CEST"
[1] "2e+09 Reads per chunk"
[1] "Loading reference annotation from:"
[1] "/home/morales/julia/Project_633_Eckstein_Werth_Silke/Sample_23L001690_zumi/Sample_23L001690.final_annot.gtf"
[1] "Annotation loaded!"
Warning message:
In dplyr::left_join(intron.saf, unique(exon.saf[, c("GeneID", "Strand")]), :
Detected an unexpected many-to-many relationship between x and y.
ℹ Row 1 of x matches multiple rows in y.
ℹ Row 1 of y matches multiple rows in x.
ℹ If a many-to-many relationship is expected, set relationship = "many-to-many" to silence this warning.
[1] "Assigning reads to features (ex)"

    ==========     _____ _    _ ____  _____  ______          _____
    =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
      =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
        ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
          ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
    ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
   Rsubread 1.32.4

//========================== featureCounts setting ===========================\
|| ||
|| Input files : 1 BAM file ||
|| S Sample_23L001690.filtered.tagged.Aligned.o ... ||
|| ||
|| Annotation : R data.frame ||
|| Assignment details : <input_file>.featureCounts.bam ||
|| (Note that files are saved to the output directory) ||
|| ||
|| Dir for temp files : . ||
|| Threads : 64 ||
|| Level : meta-feature level ||
|| Paired-end : yes ||
|| Multimapping reads : counted ||
|| Multiple alignments : primary alignment only ||
|| Multi-overlapping reads : not counted ||
|| Min overlapping bases : 1 ||
|| ||
|| Chimeric reads : not counted ||
|| Both ends mapped : not required ||
|| ||
\===================== https://subread.sourceforge.net/ ======================//

//================================= Running ==================================\
|| ||
|| Load annotation file .Rsubread_UserProvidedAnnotation_pid2382952 ... ||
|| Features : 45875 ||
|| Meta-features : 1 ||
|| Chromosomes/contigs : 1802 ||
|| ||
|| Process BAM file Sample_23L001690.filtered.tagged.Aligned.out.bam... ||
|| Single-end reads are included. ||
|| Assign alignments to features... ||
|| Total alignments : 420523607 ||
|| Successfully assigned alignments : 109655322 (26.1%) ||
|| Running time : 1.14 minutes ||
|| ||
|| ||
\===================== https://subread.sourceforge.net/ ======================//

[1] "Assigning reads to features (in)"

    ==========     _____ _    _ ____  _____  ______          _____
    =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
      =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
        ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
          ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
    ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
   Rsubread 1.32.4

//========================== featureCounts setting ===========================\
|| ||
|| Input files : 1 BAM file ||
|| S Sample_23L001690.filtered.tagged.Aligned.o ... ||
|| ||
|| Annotation : R data.frame ||
|| Assignment details : <input_file>.featureCounts.bam ||
|| (Note that files are saved to the output directory) ||
|| ||
|| Dir for temp files : . ||
|| Threads : 64 ||
|| Level : meta-feature level ||
|| Paired-end : yes ||
|| Multimapping reads : counted ||
|| Multiple alignments : primary alignment only ||
|| Multi-overlapping reads : not counted ||
|| Min overlapping bases : 1 ||
|| ||
|| Chimeric reads : not counted ||
|| Both ends mapped : not required ||
|| ||
\===================== https://subread.sourceforge.net/ ======================//

//================================= Running ==================================\
|| ||
|| Load annotation file .Rsubread_UserProvidedAnnotation_pid2382952 ... ||
|| Features : 86534 ||
|| Meta-features : 1 ||
|| Chromosomes/contigs : 1740 ||
|| ||
|| Process BAM file Sample_23L001690.filtered.tagged.Aligned.out.bam.ex.f ... ||
|| Single-end reads are included. ||
|| Assign alignments to features... ||
|| Total alignments : 420523607 ||
|| Successfully assigned alignments : 119436529 (28.4%) ||
|| Running time : 1.20 minutes ||
|| ||
|| ||
\===================== https://subread.sourceforge.net/ ======================//

[1] "2024-04-13 18:50:16 CEST"
[1] "Coordinate sorting final bam file..."
[bam_sort_core] merging from 0 files and 100 in-memory blocks...
[1] "2024-04-13 18:55:20 CEST"
[1] "Here are the detected subsampling options:"
[1] "Automatic downsampling"
[1] "Working on barcode chunk 1 out of 1"
[1] "Processing 10 barcodes in this chunk..."
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'strsplit': object 'GE' not found
Calls: convert2countM ... .makewide -> unlist -> strsplit -> .handleSimpleError -> h
Execution halted
Sat 13 Apr 2024 06:56:39 PM CEST
Loading required package: yaml
Loading required package: Matrix
[1] "loomR found"
Error in gzfile(file, "rb") : cannot open the connection
Calls: rds_to_loom -> readRDS -> gzfile
In addition: Warning message:
In gzfile(file, "rb") :
cannot open compressed file '/home/morales/julia/Project_633_Eckstein_Werth_Silke/Sample_23L001690_zumi/zUMIs_output/expression/Sample_23L001690.dgecounts.rds', probable reason 'No such file or directory'
Execution halted
Sat 13 Apr 2024 06:56:41 PM CEST
Descriptive statistics...
[1] "I am loading useful packages for plotting..."
[1] "2024-04-13 18:56:41 CEST"
Error in gzfile(file, "rb") : cannot open the connection
Calls: readRDS -> gzfile
In addition: Warning message:
In gzfile(file, "rb") :
cannot open compressed file '/home/morales/julia/Project_633_Eckstein_Werth_Silke/Sample_23L001690_zumi/zUMIs_output/expression/Sample_23L001690.dgecounts.rds', probable reason 'No such file or directory'
Execution halted
Sat 13 Apr 2024 06:56:45 PM CEST

Dependencies:

zUMIs version 2.9.7e
Ubuntu 20.04.6 LTS
samtools 1.10
R version 4.3.3
pigz 2.4
STAR version=2.7.11b
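
One detail in the log above may be worth checking. Both featureCounts runs report `Meta-features : 1` alongside tens of thousands of features, which suggests that every feature was grouped under a single gene identifier. This is only a guess from the log, not a confirmed cause of the `object 'GE' not found` error. The Python sketch below counts features per `gene_id` in a GTF file, assuming the standard nine-column, tab-separated layout with `gene_id "..."` in the attribute column. The function name `count_gene_ids` and the example lines are hypothetical.

```python
import re
from collections import Counter

# Matches the gene_id attribute in column 9 of a GTF line, e.g. gene_id "g1";
GENE_ID_RE = re.compile(r'gene_id "([^"]*)"')


def count_gene_ids(lines):
    """Count features per gene_id.

    Lines with fewer than nine columns, or without a gene_id attribute,
    are tallied under the key None.
    """
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            counts[None] += 1
            continue
        match = GENE_ID_RE.search(fields[8])
        counts[match.group(1) if match else None] += 1
    return counts


# Hypothetical GTF lines for illustration only.
example_lines = [
    'scaffold_1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'scaffold_1\tsrc\texon\t200\t300\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'scaffold_2\tsrc\texon\t1\t50\t.\t-\t.\tname "x";\n',
]
counts = count_gene_ids(example_lines)
print(len(counts), "distinct gene_id values;", counts[None], "features without gene_id")
```

The function accepts any iterable of lines, so an open file handle for the `GTF_file` from the YAML, or for the `final_annot.gtf` named in the log, can be passed directly. A result with a single distinct value, or with most features under `None`, would be consistent with the `Meta-features : 1` lines.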
