Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix default number_of_noise_barcodes for hashsolo #2190

Merged
merged 5 commits into from
Mar 30, 2022
Merged

Fix default number_of_noise_barcodes for hashsolo #2190

merged 5 commits into from
Mar 30, 2022

Conversation

LustigePerson
Copy link
Contributor

@LustigePerson LustigePerson commented Mar 24, 2022

It seems like the last PR #1483 1504a36 by @njbernstein broke the way the number_of_noise_barcodes worked.

Like in the original solo repository the number_of_**non**_noise_barcodes should be 2 as a default OR the number_of_barcodes - number_of_noise_barcodes.

However, changing the default number_of_noise_barcodes from None to 2 and changing:

number_of_non_noise_barcodes = (
        num_of_barcodes - number_of_noise_barcodes
        if number_of_noise_barcodes is not None
        else 2
    )

to

number_of_non_noise_barcodes = num_of_barcodes - number_of_noise_barcodes

led to the situation, that the used number_of_non_noise_barcodes was number_of_barcodes - 2 as a default instead of 2.

This leads to problems with demultiplexing, especially with multiple barcodes. Hence, I changed this back.

I also modified the check for a valid input to cope for the None default.

if number_of_noise_barcodes >= len(cell_hashing_columns):

changed to

if (number_of_noise_barcodes is not None) and (
        number_of_noise_barcodes >= len(cell_hashing_columns)
    ):

The rest of changes was introduced by black formatting.

commit #8eeb3fc229d5daee8961c32756becae4b1504a36 broke correct calculation of `number_of_noise_barcodes`
@codecov
Copy link

codecov bot commented Mar 24, 2022

Codecov Report

Merging #2190 (e7ae2d5) into master (4322e82) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #2190   +/-   ##
=======================================
  Coverage   71.95%   71.95%           
=======================================
  Files          98       98           
  Lines       11537    11537           
=======================================
  Hits         8301     8301           
  Misses       3236     3236           
Impacted Files Coverage Δ
scanpy/external/pp/_hashsolo.py 89.18% <100.00%> (ø)

@njbernstein
Copy link
Contributor

@LustigePerson yup this was my bad. Thanks for fixing this. much appreciated

@Zethson
Copy link
Member

Zethson commented Mar 26, 2022

Thank you!

Would it be possible to catch and validate this behavior with a test here https://github.com/theislab/scanpy/blob/master/scanpy/tests/external/test_hashsolo.py ?

@ivirshup
Copy link
Member

@LustigePerson, would you be able to add a quick test here? Then this could get into the next release

With the given test data the signal was to clear for an acutal test case.
@LustigePerson
Copy link
Contributor Author

LustigePerson commented Mar 29, 2022

I just increased the noise in the testdata. With this, the assertion fails with the wrong number of noise barcodes.
The original test data was to "clean".

@ivirshup
Copy link
Member

LGTM, thanks for the fix!

@ivirshup ivirshup merged commit b8c2cf5 into scverse:master Mar 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area – External Dealing with external tools Bug 🐛
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants