Performance improvements for copy-move detection #16
I was looking through the code to understand what the individual parameters meant and saw a few things that looked like they could be tweaked for faster performance. I'm sure there are even better ways of doing the clustering, but here's a quick attempt. On my sample image, the entire `if self.clusters is None` loop started at 84s and takes 13s after this change.
The first change I made was calculating all of the distances between matching keypoints once, up front. The previous loop structure meant that something like `len(matches)^2` distances were calculated for the original matches.
I also pre-filtered the list of matches down to only the ones where the distance between the points was above `min_dist`, instead of handling that in both the `i` and `j` loops.
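As a rough illustration of the first two changes (not the actual diff), the sketch below computes each match's keypoint distance once and drops short matches before any pairwise loop runs. The `prefilter_matches` helper and the `((x0, y0), (x1, y1))` match layout are assumptions made for this example; only `min_dist` comes from the code under discussion.

```python
import math


def prefilter_matches(matches, min_dist):
    """Return (p, q, distance) for every match longer than min_dist.

    Hypothetical helper: `matches` is assumed to be a list of
    ((x0, y0), (x1, y1)) keypoint pairs.
    """
    kept = []
    for p, q in matches:
        # One distance per match, computed before any nested loop.
        d = math.hypot(q[0] - p[0], q[1] - p[1])
        if d > min_dist:
            kept.append((p, q, d))
    return kept


matches = [((0, 0), (3, 4)), ((1, 1), (1, 2)), ((0, 0), (30, 40))]
filtered = prefilter_matches(matches, min_dist=2)
```

With the short matches removed first, the later `i` and `j` loops iterate over a smaller list and no longer need their own `min_dist` checks.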
The final change at master...AlexRiina:master#diff-e06009cb5a3aa444ce736838cd58dd9cR237 prevents cases where `a0` and `a1` are really close and `a0` and `b1` are really close but `a1` and `b1` are not really close. I think the goal of that loop is to pair `a0` with either `a1` or `b1` and `b0` with the other, and the highlighted case is when `b0` is not paired.
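Here is a minimal sketch of the pairing rule as I understand it, assuming each match is a pair of `(x, y)` points. The `close` and `matches_agree` helpers and the `radius` parameter are hypothetical names for this example and are not taken from the linked diff.

```python
import math


def close(p, q, radius):
    """True when two points lie within `radius` of each other."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= radius


def matches_agree(m0, m1, radius):
    """Require both endpoints of m0 to be paired with distinct endpoints of m1."""
    a0, b0 = m0
    a1, b1 = m1
    straight = close(a0, a1, radius) and close(b0, b1, radius)
    crossed = close(a0, b1, radius) and close(b0, a1, radius)
    return straight or crossed


m0 = ((0, 0), (100, 0))
same_direction = matches_agree(m0, ((1, 0), (101, 0)), radius=2)
swapped_ends = matches_agree(m0, ((101, 0), (1, 0)), radius=2)
# a0 is near both a1 and b1, but b0 is near neither: the case to reject.
b0_unpaired = matches_agree(m0, ((1, 0), (0, 1)), radius=2)
```

Checking both endpoints in each orientation is what rules out the case where only `a0` finds a partner.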