-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Significantly improve performance of SVD and SVDpp (cleaner) #401
Conversation
- adds dependency on C++ compilation
- significant performance gains - we don't ever use negative indexing, and our code is always within numpy array bounds, so this should be safe
c60896f
to
12e57dc
Compare
@ProfHercules Thank you so much for your dedication and for taking the time to submit a PR. I'm sorry I couldn't get to it sooner. Since you submitted, I've made a bunch of modifications on the Before changes on
After changes on
This PR:
I've also made a bunch of changes to this PR. Some of them are cosmetic, some of them are maybe more relevant:
On the benchmark above we observe similar speedups as you did: 6X for SVD and ~10X for SVDpp. |
from codecs import open | ||
from os import path | ||
|
||
from setuptools import Extension, find_packages, setup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to this file are unrelated
Merging now, thanks a lot again for your PR @ProfHercules , this is a very nice improvement! |
All changes are documented more clearly in the commits. Before starting to make changes, I ran all tests using Python 3.9, they all passed. After all changes, all the tests still pass.
Initially, on my M1 MacBook Air,
examples/bench_mf.py
produced the following:After all changes, the run time looks like this:
Which is a 6x improvement in speed for
SVD
and a 13.38x improvement forSVDpp
.I also ran the same benchmark using the MovieLens 1M dataset, and got the following:
The primary differences between this PR and #400 is
Note: I documented the process here, feel free to give it a read!