Adding additional .mmi extensions to all possible fasta file format l… #330

Merged
alexomics merged 8 commits into main from issue326/reference_naming on Feb 5, 2024

mattloose
Contributor

…abels to catch acceptable file types.

This request addresses #326 and enables validation of the following file type endings.

['.fasta.gz', '.fna.gz', '.fsa.gz', '.fa.gz', '.fastq.gz', '.fq.gz', '.fasta.mmi', '.fna.mmi', '.fsa.mmi', '.fa.mmi', '.fastq.mmi', '.fq.mmi', '.fasta', '.fna', '.fsa', '.fa', '.fastq', '.fq', '.mmi']

I think a better way to handle this in the future would be to explicitly determine whether the file is readable by mappy. However, this will catch the most common mmi extension types.
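
A minimal sketch of the "ask mappy directly" idea mentioned above. mappy's `Aligner` evaluates as false when it fails to load or build an index, so a validator could try to construct one and check the result. The helper name `is_readable_by_mappy` and its `aligner_factory` parameter are hypothetical and not part of readfish. Note also that pointing `Aligner` at a large FASTA builds an index, which may be too slow for a quick validation step.

```python
from typing import Any, Callable, Optional


def is_readable_by_mappy(
    path: str,
    aligner_factory: Optional[Callable[[str], Any]] = None,
) -> bool:
    """Return True if an aligner can be constructed from ``path``.

    ``aligner_factory`` is a hypothetical injection point for testing;
    when omitted, ``mappy.Aligner`` is imported and used.
    """
    if aligner_factory is None:
        import mappy  # imported lazily so the helper can be exercised without mappy

        aligner_factory = mappy.Aligner
    try:
        aligner = aligner_factory(path)
    except Exception:
        # Treat any failure to construct the aligner as "not readable".
        return False
    # mappy.Aligner is falsy when the index could not be loaded or built.
    return bool(aligner)
```

Passing a stand-in factory keeps the helper testable on machines where mappy or a real index is not available.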

mattloose requested a review from Adoni5 on February 1, 2024 at 20:42
@alexomics
Contributor

I don't think the current fix addresses the problem in the issue, which is an extra . in the filename (GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.mmi). Because of that extra dot, pathlib suggests that the suffix is .15_grch38_no_alt_analysis_set.fna.mmi
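
To see where that suffix comes from, here is a short standalone illustration using only the standard library. It assumes the validation joins `Path.suffixes` and lowercases the result; the exact readfish code is not shown in this thread, so treat that as a guess.

```python
from pathlib import Path

name = "GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.mmi"
path = Path(name)

# .suffix is only the final dotted component of the name.
last_suffix = path.suffix
# .suffixes treats every dot after the first component as starting a suffix.
all_suffixes = path.suffixes
# Joining and lowercasing reproduces the string quoted in the comment above.
joined = "".join(all_suffixes).lower()

print(last_suffix)   # .mmi
print(all_suffixes)  # ['.15_GRCh38_no_alt_analysis_set', '.fna', '.mmi']
print(joined)        # .15_grch38_no_alt_analysis_set.fna.mmi
```

The version number inside the accession (`.15_`) is what turns an otherwise ordinary `.fna.mmi` name into a suffix string that matches nothing in an allow-list.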

We could use something like:

file_extensions = ['.fasta', '.fna', '.fsa', '.fa', '.fastq', 
                   '.fq', '.fasta.gz', '.fna.gz', '.fsa.gz', 
                   '.fa.gz', '.fastq.gz', '.fq.gz', '.mmi']
if not any(index.lower().endswith(suffix) for suffix in file_extensions):
    raise ...

Probably should add a test case TOML for this too.
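
A self-contained version of the suffix check suggested above, written as a sketch rather than the code that was merged. The function name `has_valid_reference_extension` and the file name `reference.txt` are hypothetical. `str.endswith` accepts a tuple, so the `any(...)` loop can be collapsed into one call.

```python
# Allowed endings, taken from the list in the comment above.
FILE_EXTENSIONS = (
    ".fasta", ".fna", ".fsa", ".fa", ".fastq", ".fq",
    ".fasta.gz", ".fna.gz", ".fsa.gz", ".fa.gz", ".fastq.gz", ".fq.gz",
    ".mmi",
)


def has_valid_reference_extension(filename: str) -> bool:
    """Return True if the lowercased filename ends with an allowed extension."""
    return filename.lower().endswith(FILE_EXTENSIONS)


# The filename from the issue passes, because only the end of the name is inspected.
ok = has_valid_reference_extension(
    "GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.mmi"
)
# A hypothetical unsupported file is rejected.
bad = has_valid_reference_extension("reference.txt")
```

Because the match is on the end of the string, a plain `.mmi` entry already covers `.fasta.mmi`, `.fna.mmi` and the other compound `.mmi` endings.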

@mattloose
Contributor Author

I mean - it does fix it...

But I'll look at this alternative.

@mattloose
Contributor Author

OK
It doesn't fix it.

Whoops.

Contributor

@alexomics alexomics left a comment

I think I would also add another test case that has a valid suffix but strange filename e.g. test.v1.fna.mmi

src/readfish/plugins/_mappy.py (outdated)
@mattloose
Contributor Author

This addresses the issue raised by @alexomics (thanks - good spot).

I've attempted to add in a test toml but I don't think this is yet correct - advice from anyone appreciated.

@alexomics
Contributor

So the TOML tests are a bit of a Rube Goldberg machine, you need to:

  1. create a TOML file (readfish/tests/static/toml_validation_test/fail/005_bad_reference_file.TOML)
  2. create a matching TXT file (readfish/tests/static/toml_validation_test/fail/005_bad_reference_file.txt) that contains the expected error message(s) the TOML will generate

It's described a little bit more on the README for each test folder.
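
The pairing convention described above (one TOML file plus one TXT file with the same stem) can be sketched as follows. This is not readfish's actual test harness, which is not shown in this thread; `pair_fixtures` is a hypothetical helper that only illustrates how a missing partner file could be detected.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def pair_fixtures(directory: Path) -> Tuple[Dict[str, Tuple[Path, Path]], List[str]]:
    """Pair each TOML file with the TXT file sharing its stem.

    Returns the matched (toml, txt) pairs keyed by stem, plus a sorted
    list of stems for which only one of the two files exists.
    """
    files = list(directory.iterdir())
    # Compare extensions case-insensitively so .TOML and .toml both count.
    tomls = {p.stem: p for p in files if p.suffix.lower() == ".toml"}
    txts = {p.stem: p for p in files if p.suffix.lower() == ".txt"}

    pairs = {
        stem: (tomls[stem], txts[stem])
        for stem in sorted(tomls.keys() & txts.keys())
    }
    # Symmetric difference: stems present in exactly one of the two sets.
    unpaired = sorted(tomls.keys() ^ txts.keys())
    return pairs, unpaired
```

A check like this, run before the tests themselves, would flag a TXT file committed without its TOML partner.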

@mattloose
Contributor Author

Re: the tests. The readfish validate command no longer errors on this file; it just gives a warning. Can that be caught in a text file? I note that in the other examples for validating mappy the text file is just empty.

@mattloose
Contributor Author

OK - this adds a fail and pass test for these specific issues - the pass reference index file is named as per the issue that started this.

@alexomics
Contributor

Almost there! You seem to be missing a TOML file tests/static/mappy_validation_test/fail/004_bad_reference_file_extension.toml

@mattloose
Contributor Author

D'oh

Should be there now. I forgot I needed to force add the toml due to the gitignore!

alexomics merged commit f5e40c9 into main on Feb 5, 2024
9 checks passed
alexomics deleted the issue326/reference_naming branch on February 5, 2024 at 16:47
2 participants