Short Question About Implementation #359
Comments
Hey @adam2392, I worked mainly on the R side of things, although I do remember having a similar issue with this chunk (lots of whiteboarding). I never did figure out whether this block samples in accordance with the SPORF paper, and given your example, I'd say it does not. Going from memory, the indices should be sampled without replacement. I did tinker around in the C++ code, but the base functions came from James.
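For reference, here is a minimal sketch of what sampling feature indices without replacement could look like, using a partial Fisher-Yates shuffle. The function name `sampleWithoutReplacement` and its parameters are hypothetical and are not taken from the SPORF code base; this only illustrates the idea and is not a proposed patch.

```cpp
#include <cassert>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper (not part of SPORF): pick k distinct feature indices
// from {0, ..., numFeatures - 1} with a partial Fisher-Yates shuffle.
std::vector<int> sampleWithoutReplacement(int numFeatures, int k, std::mt19937& rng) {
    assert(k >= 0 && k <= numFeatures);

    // Start with the identity permutation 0, 1, ..., numFeatures - 1.
    std::vector<int> pool(numFeatures);
    std::iota(pool.begin(), pool.end(), 0);

    // Shuffle only the first k slots; each slot takes a uniformly random
    // element from the part of the pool that has not been fixed yet.
    for (int i = 0; i < k; ++i) {
        std::uniform_int_distribution<int> pick(i, numFeatures - 1);
        std::swap(pool[i], pool[pick(rng)]);
    }

    pool.resize(k);  // the first k entries are distinct by construction
    return pool;
}
```

Because every index lives in exactly one slot of `pool`, no feature can show up twice in the result, so each selected feature would get a single +1 or -1 weight.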
Hi @MrAE, @jbrowne6 and @falkben
Just pinging the people who seem to have touched these specific lines of code.
I know you guys don't maintain this code anymore and have moved on, but I had a quick question about what a specific line is doing. I was wondering if you could provide a quick answer (if you happened to write this part) to make sure I'm interpreting it correctly. FYI: I have ported the code to Cython, and once this issue is resolved, I think we can safely move on :)
In SPORF/packedForest/src/forestTypes/binnedTree/processingNodeBin.h (lines 99 to 113 in a7a3c7e), the line
rndFeature = randNum->gen(fpSingleton::getSingleton().returnNumFeatures());
can generate a random feature index, but is it possible to have a duplicate? For example, say you have data with 4 columns, then maybe SPORF will sample a projection of:
Note that this is then no longer a sparse linear combination with only +/-1 weights; it now has weights of +2 and -1 in the linear combination. Or is this function guaranteed not to produce duplicates when sampling the projection matrix?
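To make the concern concrete, here is a small sketch of how duplicate draws could change the effective weights. It assumes that each draw contributes a +1 or -1 weight and that weights for a repeated index simply add up in the linear combination; whether the SPORF code behaves this way is exactly the open question. The helper names `drawWithReplacement` and `netWeights` are hypothetical and do not come from the SPORF code base.

```cpp
#include <cassert>
#include <map>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper (not part of SPORF): draw k (feature index, weight)
// pairs, with indices sampled uniformly WITH replacement and weights +/-1.
std::vector<std::pair<int, int>> drawWithReplacement(int numFeatures, int k,
                                                     std::mt19937& rng) {
    assert(numFeatures > 0 && k >= 0);
    std::uniform_int_distribution<int> feature(0, numFeatures - 1);
    std::bernoulli_distribution positive(0.5);

    std::vector<std::pair<int, int>> draws;
    draws.reserve(k);
    for (int i = 0; i < k; ++i) {
        int index = feature(rng);
        int weight = positive(rng) ? 1 : -1;
        draws.emplace_back(index, weight);
    }
    return draws;
}

// Hypothetical helper: collapse the draws into one net weight per feature,
// assuming repeated indices accumulate in the linear combination.
std::map<int, int> netWeights(const std::vector<std::pair<int, int>>& draws) {
    std::map<int, int> net;
    for (const auto& draw : draws) {
        net[draw.first] += draw.second;
    }
    return net;
}
```

Under that assumption, the draws (0, +1), (0, +1), (2, -1) collapse to a weight of +2 on feature 0 and -1 on feature 2, while draws such as (0, +1), (0, -1) would cancel to a weight of 0.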