Error with working directories #92

Closed
poddarharsh15 opened this issue Jun 25, 2024 · 10 comments

Comments

@poddarharsh15

Hi @kcleal, could you please suggest how to modify the command line to resolve this working directory error?

Command line:

dysgu run \\
        -p ${task.cpus} \\
        $fasta \\
        . \\
        $input_bam \\
        | bgzip ${args2} --threads ${task.cpus} --stdout > ${prefix}.vcf.gz
    tabix ${args3} ${prefix}.vcf.gz

2024-06-25 09:07:20,781 [INFO ] [dysgu-run] Version: 1.6.2
Traceback (most recent call last):
  File "/usr/local/bin/dysgu", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/dysgu/main.py", line 238, in run_pipeline
    make_wd(kwargs)
  File "/usr/local/lib/python3.10/site-packages/dysgu/main.py", line 125, in make_wd
    raise ValueError("Working directory already exists. Add -x / --overwrite=True to proceed, "
ValueError: Working directory already exists. Add -x / --overwrite=True to proceed, or supply --ibam to re-use temp files in working directory

Thanks.

@kcleal
Owner

kcleal commented Jun 25, 2024

Hi @poddarharsh15,
It looks like you are trying to use the current directory as the temp directory. You will need to add -x to your command to overwrite any temp files.
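
For context, the error arises because `.` is passed as the working directory: the current directory always exists, so dysgu stops unless `-x` is given. A minimal shell sketch of that decision follows. The helper name `overwrite_flag_for` and the directory name `dysgu_tmp` are hypothetical, the file names are placeholders, and the final command is only printed, not run. Passing a dedicated directory that does not exist yet is one way to avoid needing `-x` at all.

```shell
# Hypothetical helper (not part of dysgu): print "-x" when the chosen
# working directory already exists, and nothing otherwise.
overwrite_flag_for() {
    if [ -d "$1" ]; then
        printf '%s' "-x"
    fi
}

# "dysgu_tmp" is a placeholder name for a dedicated working directory.
workdir="dysgu_tmp"
flag="$(overwrite_flag_for "$workdir")"

# Print the command that would be run; file names are placeholders.
echo "dysgu run -p 2 ${flag:+$flag }genome.fasta ${workdir} input.bam"
```

Adding `-x` automatically is a convenience for disposable task directories; where the working directory may hold files worth keeping, it is probably safer to fail and inspect it first.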

@poddarharsh15
Author

Thank you for the quick reply. Can I modify the command line like this?

dysgu run \\
        -p ${task.cpus} \\
        -x \\
        $fasta \\
        . \\
        $input_bam \\
        | bgzip ${args2} --threads ${task.cpus} --stdout > ${prefix}.vcf.gz
    tabix ${args3} ${prefix}.vcf.gz

Is this okay?

@kcleal
Owner

kcleal commented Jun 25, 2024

Yes!

@poddarharsh15
Author

I ran the command line above, but I am still unable to emit any outputs. Could you please have a look at the log files below? Thanks.

Input bam

2024-06-25 11:36:07,112 [INFO ] [dysgu-run] Version: 1.6.2
2024-06-25 11:36:07,112 [INFO ] run -p 2 -x genome.fasta . test.paired_end.recalibrated.sorted.bam
2024-06-25 11:36:07,112 [INFO ] Destination: .
2024-06-25 11:36:07,382 [INFO ] dysgu fetch test.paired_end.recalibrated.sorted.bam written to ./test.paired_end.recalibrated.sorted.dysgu_reads.bam, n=252, time=0:00:00 h:m:s
2024-06-25 11:36:07,382 [INFO ] Input file is: ./test.paired_end.recalibrated.sorted.dysgu_reads.bam
[W::hts_idx_load3] The index file is older than the data file: test.paired_end.recalibrated.sorted.bam.bai
2024-06-25 11:36:07,385 [INFO ] Sample name: normal
2024-06-25 11:36:07,385 [INFO ] Writing vcf to stdout
2024-06-25 11:36:07,385 [INFO ] Running pipeline
2024-06-25 11:36:07,552 [INFO ] Calculating insert size. Removed 0 outliers with insert size >= 1359.0
2024-06-25 11:36:07,563 [INFO ] Inferred read length 100.0, insert median 351, insert stdev 145
2024-06-25 11:36:07,564 [INFO ] Max clustering dist 1076
2024-06-25 11:36:07,564 [INFO ] Building graph with clustering 1076 bp
2024-06-25 11:36:07,566 [INFO ] Total input reads 252
2024-06-25 11:36:07,566 [INFO ] Graph constructed
2024-06-25 11:36:07,566 [INFO ] Minimum support 3
2024-06-25 11:36:07,602 [CRITICA] No events found
2024-06-25 11:36:07,602 [INFO ] dysgu run test.paired_end.recalibrated.sorted.bam complete, time=0:00:00 h:m:s

Input cram

2024-06-25 11:36:18,595 [INFO ] [dysgu-run] Version: 1.6.2
2024-06-25 11:36:18,595 [INFO ] run -p 2 -x genome.fasta . test.paired_end.recalibrated.sorted.cram
2024-06-25 11:36:18,595 [INFO ] Destination: .
2024-06-25 11:36:19,056 [INFO ] dysgu fetch test.paired_end.recalibrated.sorted.cram written to ./test.paired_end.recalibrated.sorted.dysgu_reads.bam, n=252, time=0:00:00 h:m:s
2024-06-25 11:36:19,057 [INFO ] Input file is: ./test.paired_end.recalibrated.sorted.dysgu_reads.bam
2024-06-25 11:36:19,058 [INFO ] Sample name: normal
2024-06-25 11:36:19,058 [INFO ] Writing vcf to stdout
2024-06-25 11:36:19,058 [INFO ] Running pipeline
2024-06-25 11:36:19,509 [INFO ] Calculating insert size. Removed 0 outliers with insert size >= 1359.0
2024-06-25 11:36:19,519 [INFO ] Inferred read length 100.0, insert median 351, insert stdev 145
2024-06-25 11:36:19,520 [INFO ] Max clustering dist 1076
2024-06-25 11:36:19,520 [INFO ] Building graph with clustering 1076 bp
2024-06-25 11:36:19,522 [INFO ] Total input reads 252
2024-06-25 11:36:19,522 [INFO ] Graph constructed
2024-06-25 11:36:19,522 [INFO ] Minimum support 3
2024-06-25 11:36:19,556 [CRITICA] No events found
2024-06-25 11:36:19,556 [INFO ] dysgu run test.paired_end.recalibrated.sorted.cram complete, time=0:00:00 h:m:s

@kcleal
Owner

kcleal commented Jun 25, 2024

The log suggests only 252 reads were in your test.paired_end.recalibrated.sorted.cram file. Is this correct?

@poddarharsh15
Author

252

Yes, that is correct. These are test samples, which are extremely small (<300kb).

@kcleal
Owner

kcleal commented Jun 25, 2024

Perhaps there are no SVs present? Alternatively, you can try adjusting the min-support parameter; it is set to 3 by default for paired-end reads.

@poddarharsh15
Author

poddarharsh15 commented Jun 25, 2024

Probably, yes. I ran dysgu on other test samples that are a bit larger, and it detected SVs successfully.
--min-support TEXT Minimum number of reads per SV [default: 3]
Could I change it to 0, perhaps?

dysgu run \
    -p ${cpus} \
    -x \
    --min-support 0 \
    ${fasta} \
    . \
    ${input_bam} \
    | bgzip ${args2} --threads ${cpus} --stdout > ${prefix}.vcf.gz

tabix ${args3} ${prefix}.vcf.gz

May I ask whether I need to add any specific parameters to run .cram files?

@kcleal
Owner

kcleal commented Jun 25, 2024

There are no additional parameters needed to use a cram file. Min-support 0 or 1 will have the same effect, i.e. at least one piece of evidence for an SV call.

@poddarharsh15
Author

I have tried both 0 and 1, and there is still no change in the results. I suppose there are no SVs present in the test_data :(
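
One way to confirm that a run produced an empty call set, rather than failing, is to count the non-header lines in the compressed VCF. The sketch below assumes the output is a bgzip-compressed VCF, which `gzip -dc` can read. The helper name `count_vcf_records` is hypothetical and not part of dysgu.

```shell
# Hypothetical helper: count variant records (non-header lines) in a
# gzip/bgzip-compressed VCF. Header lines start with "#".
count_vcf_records() {
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the
    # function from failing under "set -e" in that case.
    gzip -dc "$1" | grep -vc '^#' || true
}

# Example usage (the file name is a placeholder):
#   count_vcf_records sample.vcf.gz
```

A result of 0 means the file contains no variant records, which would be consistent with the "No events found" message in the logs above.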

@kcleal kcleal closed this as completed Jun 25, 2024