Update match function #166

Open
wants to merge 2 commits into
base: master

@mr-eyes mr-eyes commented Mar 4, 2018

Short explanation

In the kmerindex::match() function, the if conditions on the following lines use the correct logic:
Lines: L1069, L1128, L1178, L1195
There, != kmap.end() is equivalent to "if k-mer found".
When != kmap.end() is true, the search variable holds the result of a successful lookup in the hash table. On Line#1100 the found2 flag is set to true to indicate that search2 has a k-mer match, but the condition guarding it on Line#1099 is == kmap.end(), which raises the flag when the k-mer is not found.
So the condition on Line#1099 needs to be fixed to check that search2 has a k-mer match, by changing == to !=.

Summary: Check if kmer is found instead of checking if the kmer is not found
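
To make the difference between the two conditions concrete, here is a minimal sketch. It is not kallisto's code: KmerMap, Hits, foundBuggy, foundFixed and recordHit are hypothetical stand-ins, and a plain std::unordered_map is assumed in place of the real k-mer hash table. It only illustrates that an iterator returned by find() may be dereferenced when it is != end(), and must not be when it is == end().

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not kallisto's real types.
using KmerMap = std::unordered_map<std::string, int>;  // k-mer -> contig id
using Hits = std::vector<std::pair<int, int>>;         // (contig id, read position)

// Shape of the condition on Line#1099 before the fix:
// the "found" flag is raised when the k-mer is NOT in the map.
bool foundBuggy(const KmerMap& kmap, const std::string& kmer) {
  auto search2 = kmap.find(kmer);
  return search2 == kmap.end();
}

// Shape of the condition after the fix:
// the "found" flag is raised only when the lookup succeeded.
bool foundFixed(const KmerMap& kmap, const std::string& kmer) {
  auto search2 = kmap.find(kmer);
  return search2 != kmap.end();
}

// Records a hit only when the k-mer is present, so search2->second
// is read only from a valid iterator.
bool recordHit(const KmerMap& kmap, const std::string& kmer, int readPos, Hits& v) {
  auto search2 = kmap.find(kmer);
  if (search2 != kmap.end()) {
    v.push_back({search2->second, readPos});
    return true;
  }
  return false;
}
```

With a map containing only "ACGT", foundBuggy reports a match for the absent k-mer "TTTT", while foundFixed and recordHit report a match only for "ACGT".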


Detailed explanation

Data used in the test

1- Reference

| File name | Description | Visualize TDBG |
| --- | --- | --- |
| Reference 1 | Four Unique Transcripts | GFA |
| Reference 2 | Reference 1 + Deletion from Tr1 | GFA |
| Reference 3 | Reference 2 + Fusion between Tr2 & Tr3 | GFA |

2- Reads

| File name | Description | Reference Index |
| --- | --- | --- |
| 1.fa | R1 ⊂ Tr1, R2 ⊂ Tr2, R3 ⊂ Tr3, R4 ⊂ Tr4 | reference1.idx |
| 2.fa | R1 ⊂ Tr1, R2 ⊂ Tr2, R3 ⊂ Tr3, R4 ⊂ Tr4, R5 ⊂ Tr5 | reference2.idx |
| 3.fa | R1 ⊂ (Tr1 & Tr5), R2 ⊂ Tr2, R3 ⊂ Tr3, R4 ⊂ Tr4, R5 ⊂ Tr5 | reference2.idx |
| 4.fa | R1 ⊂ Tr1, R2 ⊂ Tr2, R3 ⊂ Tr3, R4 ⊂ Tr4, R5 ⊂ Tr5, R6 ⊂ Tr6 | reference3.idx |
| 5.fa | R1 ⊂ Tr1, R2 ⊂ Tr2, R3 ⊂ Tr3, R4 ⊂ Tr4, R5 ⊂ Tr5, R6 ⊂ (Tr6 & Tr2) | reference3.idx |
| 6.fa | R1 ⊂ Tr1, R2 ⊂ Tr2, R3 ⊂ Tr3, R4 ⊂ Tr4, R5 ⊂ Tr5, R6 ⊂ (Tr2 & Tr3) | reference2.idx |

Each read is 100 bp long.

The unexpected behavior happens when pseudo-aligning 6.fa against reference1, which contains the original 4 unique transcripts.

3- Results

Vector V after pseudo-alignment

Original master

(Debugging_Print) SORTED KmerEntry vector(V)
KmerEntry: pos:91 | contig: 2 | contig_Length: 2588 | read_pos: 0
KmerEntry: pos:92 | contig: 2 | contig_Length: 2588 | read_pos: 1
KmerEntry: pos:93 | contig: 2 | contig_Length: 2588 | read_pos: 2
KmerEntry: pos:94 | contig: 2 | contig_Length: 2588 | read_pos: 3
KmerEntry: pos:95 | contig: 2 | contig_Length: 2588 | read_pos: 4
KmerEntry: pos:106 | contig: 3 | contig_Length: 965 | read_pos: 2496

The last read_pos is 2496, although the read is only 100 bp long!
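
A sanity check of this kind can be sketched as follows. This is only an illustration under stated assumptions: DebugHit is a hypothetical record that mirrors the fields in the debug print (it is not kallisto's own struct), and the k-mer length k is passed in as a parameter because the PR does not state it. The check relies on the fact that a k-mer starting at read_pos fits in the read only if read_pos + k <= read length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record mirroring the fields in the debug print above;
// not kallisto's actual data structure.
struct DebugHit {
  int pos;
  int contig;
  int contigLength;
  int readPos;
};

// A k-mer starting at readPos lies inside the read only if
// 0 <= readPos and readPos + k <= readLength.
bool readPosInRange(const DebugHit& h, int readLength, int k) {
  return h.readPos >= 0 && h.readPos + k <= readLength;
}

// Returns the index of the first hit whose read position cannot belong
// to a read of the given length, or -1 if every hit is plausible.
int firstInvalidHit(const std::vector<DebugHit>& v, int readLength, int k) {
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (!readPosInRange(v[i], readLength, k)) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

For a 100 bp read, an entry with read_pos 2496 fails this check for any positive k, whereas an entry with read_pos 69 passes when k is assumed to be 31 (69 + 31 = 100).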

After the edit !=kmap.end()

KmerEntry: pos:91 | contig: 2 | contig_Length: 2588 | read_pos: 0
KmerEntry: pos:91 | contig: 2 | contig_Length: 2588 | read_pos: 0

After the edit !=kmap.end() & v.push_back({search2->second, kit2->second});

KmerEntry: pos:91 | contig: 2 | contig_Length: 2588 | read_pos: 0
KmerEntry: pos:91 | contig: 2 | contig_Length: 2588 | read_pos: 0
KmerEntry: pos:139 | contig: 3 | contig_Length: 965 | read_pos: 69

That makes sense: in this case there is no need to enter the "this is weird, let's try the middle k-mer" section or to backOff.


Side Note

This bug was discovered unintentionally while adding an option to use the Union of Compatibility classes instead of the default Intersect. The goal of that option is to create new compatibility classes when there is no intersection between two transcripts in the reference, as happens when pseudo-aligning 6.fa on reference1 or reference2.

@mr-eyes mr-eyes changed the base branch from master to devel March 4, 2018 23:09
@mr-eyes mr-eyes changed the base branch from devel to master March 4, 2018 23:12
@mschilli87 mschilli87 mentioned this pull request Mar 5, 2018
@mr-eyes mr-eyes closed this May 13, 2018
@mschilli87

@mr-eyes: Why was this closed now?

@mr-eyes
Author

mr-eyes commented May 14, 2018

@mschilli87 Sorry, I meant to close the other one (the duplicate); I will reopen this.
I hope to find out soon whether this is really a bug or whether kallisto does not support processing fusion transcripts.

@mr-eyes mr-eyes reopened this May 14, 2018