Added Huber loss #819
Conversation
Thanks for the PR, and sorry for the delay! There are some things to be changed, but it looks good in general.
lasagne/objectives.py (Outdated):
    targets : Theano 2D tensor or 1D tensor
        Either a vector of int giving the correct class index per data point
        or a 2D tensor of one-hot encoding of the correct class in the same
        layout as predictions (non-binary targets in [0, 1] do not work!)
Wasn't this meant for regression? You copied the description from a multi-class loss.
Ah! Yes, I will revert.
lasagne/objectives.py (Outdated):
    .. math:: L_\delta (diff) = \frac{diff^2}{2} & \text{if } |diff| \le \
        \delta
    .. math:: L_\delta (diff) & = & \delta (|diff| - \frac{\delta}{2} ),\
        &\text{else}
Haven't checked what this renders like, but shouldn't this use a single .. math:: statement? https://www.sphinx-doc.org/en/stable/ext/math.html#directive-math
It seems rendering cases with a one-sided bracket is not easily possible in Sphinx, so using two lines is a good workaround.
I used two here because there are two different expressions that this will return, based on the value of delta.
Usually you'd use a single expression with a conditional bracket, but that's not easily possible in Sphinx (https://tex.stackexchange.com/questions/122407/writing-conditional-equations-with-braces-in-sphinx). Using two lines seems like a good alternative, but you can write two lines in a single .. math:: statement, as shown at https://www.sphinx-doc.org/en/stable/ext/math.html#directive-math. I guess that would be cleaner.
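For what it's worth, the combined form the reviewer describes might look roughly like this (a sketch of the Sphinx syntax only; the exact wording in the PR may differ):

```rst
.. math::

   L_\delta(d) &= \frac{d^2}{2}                        & \text{if } |d| \le \delta \\
   L_\delta(d) &= \delta \, (|d| - \frac{\delta}{2})   & \text{otherwise}
```

Sharing one .. math:: block lets the alignment characters (&) line up the two branches, which separate directives cannot do.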
lasagne/objectives.py
Outdated
layout as predictions (non-binary targets in [0, 1] do not work!) | ||
delta : scalar, default 1 | ||
This delta value is defaulted to 1, for SmoothL1Loss | ||
described in Fast-RCNN paper[1]. |
Should be [1]_, with an underscore at the end and a space before, if I recall the syntax correctly.
lasagne/objectives.py
Outdated
Notes | ||
----- | ||
This is an alternative to the Least Squared loss for | ||
regression problems. |
least squares, or better squared error, because that's what the function is called in Lasagne.
lasagne/objectives.py
Outdated
Returns | ||
------- | ||
Theano 1D tensor | ||
An expression for the item-wise huber loss. |
Again, this was copied from a multi-class objective docstring. Copy it from the squared error instead (it should be a tensor of arbitrary dimensionality, and the element-wise loss).
lasagne/objectives.py
Outdated
diff = targets - predictions | ||
ift = 0.5 * squared_error(targets, predictions) | ||
iff = delta * (abs(diff) - delta / 2.) | ||
return theano.tensor.switch(abs(diff) <= delta, ift, iff).sum() |
Don't sum in the end, it should return the element-wise loss.
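As a sanity check of what "element-wise" means here, a NumPy sketch (not the PR's Theano code; the function name is mine):

```python
import numpy as np

def huber_loss_np(targets, predictions, delta=1.0):
    """Element-wise Huber loss: quadratic near zero, linear in the tails.

    Returns an array of the same shape as the inputs -- no trailing
    .sum(), so the caller decides how to aggregate (mean, sum, ...).
    """
    abs_diff = np.abs(targets - predictions)
    quadratic = 0.5 * abs_diff ** 2
    linear = delta * (abs_diff - delta / 2.0)
    # Pick the quadratic branch inside the delta band, linear outside.
    return np.where(abs_diff <= delta, quadratic, linear)
```

Theano's `tensor.switch` plays the role of `np.where` in the PR's version.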
lasagne/objectives.py
Outdated
https://arxiv.org/pdf/1504.08083.pdf | ||
""" | ||
predictions, targets = align_targets(predictions, targets) | ||
diff = targets - predictions |
Use abs_diff = abs(targets - predictions) instead. This maximizes reuse of Theano expressions, so Theano doesn't have to merge them later.
lasagne/tests/test_objectives.py
Outdated
@@ -153,6 +153,35 @@ def test_binary_hinge_loss(colvect): | |||
|
|||
|
|||
@pytest.mark.parametrize('colvect', (False, True)) | |||
def test_huber_loss(colvect): | |||
from lasagne.objectives import huber_loss | |||
delta = [0.5, 1.0] |
You can add delta to the test parameters and add another parametrize. Looks a little cleaner.
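The stacked-parametrize pattern being suggested could look like this (a hypothetical sketch; huber_ref is an assumed NumPy reference, not a Lasagne helper, and the body is simplified since the real test compares against the Theano expression):

```python
import numpy as np
import pytest

def huber_ref(a, b, delta):
    # Assumed NumPy reference for the Huber loss (not part of Lasagne).
    abs_diff = np.abs(a - b)
    return np.where(abs_diff <= delta,
                    0.5 * abs_diff ** 2,
                    delta * (abs_diff - delta / 2.0))

# Stacking a second parametrize runs the test once per (colvect, delta)
# combination, so the body no longer needs the duplicated l1/l2 calls.
@pytest.mark.parametrize('colvect', (False, True))
@pytest.mark.parametrize('delta', (0.5, 1.0))
def test_huber_loss(colvect, delta):
    a = np.array([0.0, 1.0, 4.0])
    b = np.array([0.2, -1.0, 0.0])
    expected = huber_ref(a, b, delta)
    assert expected.shape == a.shape
```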
    ----------
    .. [1] Ross Girshick et al (2015):
        Fast RCNN
        https://arxiv.org/pdf/1504.08083.pdf
Can you also cite Huber, maybe as the first reference? https://en.wikipedia.org/wiki/Huber_loss#cite_note-1
Oh, and I forgot, the new function should be included in
I have added a new commit that addresses all the comments to the best of my knowledge. I haven't verified myself what Sphinx renders; there seems to be some problem with my browser (Safari). If the Sphinx output isn't fine, please let me know and I will try to build it on another system and push one more commit.
Thank you for the update! Some more comments (or repetitions of earlier comments). Almost there!
    @@ -86,6 +86,7 @@
        "aggregate",
        "binary_hinge_loss",
        "multiclass_hinge_loss",
        "huber_loss",
Please also add it to the module docstring further above.
And to the .rst file as I mentioned in #819 (comment). Thank you!
lasagne/objectives.py (Outdated):
    .. math::
        L_\delta (diff) = \frac{diff^2}{2} & \text{if } |diff| \le \ delta \\
        L_\delta (diff) & = & \delta (|diff| - \frac{\delta}{2} ),\ \\
The alignment characters (&) still seem off -- does this render correctly in Sphinx?
lasagne/objectives.py
Outdated
or a 2D tensor of one-hot encoding of the correct class in the same | ||
layout as predictions (non-binary targets in [0, 1] do not work!) | ||
Ground truth to which the prediction is to be compared | ||
with. |
Please copy the predictions and targets from squared_error. This is still not correct.
I am sorry, it's a little unclear to me. You mean to say I should have predictions and targets on a single line separated by a comma, similar to squared_error?
Something like this?
Parameters
----------
predictions, targets : Theano tensors
Yes, or you can also call it a and b, since the loss is completely symmetric. Your current docstring says they should be 1D or 2D tensors, which is too strict! They can be anything -- it's a drop-in replacement for the squared error.
lasagne/objectives.py (Outdated):
        regression problems.

    References
    ----------
    .. [1] Ross Girshick et al (2015):
        Fast RCNN
        https://arxiv.org/pdf/1504.08083.pdf

    .. [2] Huber, Peter et al (1964)
This line should have a colon : at the end. Sorry for the apparent nitpick, but note that the line break is only visible when viewing the docstring in Python, not in the Sphinx rendering. Also please add a period . after the title. Have a look at how this renders to be sure. (Last bullet point: https://github.com/Lasagne/Lasagne/blob/master/.github/PULL_REQUEST_TEMPLATE.md)
Ah, didn't see the comment about your browser problem. Do you have another browser you can try? (Using another computer seems overkill!) Otherwise I can also check out your PR and test the documentation here, but of course it's easier if you can debug and fix things directly!
lasagne/objectives.py (Outdated):
    """
    predictions, targets = align_targets(predictions, targets)
    diff = targets - predictions
    ab_diff = abs(targets - predictions)
I'd strongly prefer abs_diff.
It was a typo. I don't have anything against abs_diff, but yeah I understand the convention :)
lasagne/tests/test_objectives.py (Outdated):
    l1 = huber_loss(a, b, delta[0])
    l2 = huber_loss(a, b, delta[1])
    l1 = huber_loss(a, b, delta)
    l2 = huber_loss(a, b, delta)
You're computing the same thing twice now. The idea of making delta a test parameter was to remove the duplication :)
Ah! Yeah, sorry. I rushed through that PR. I will check every comment again before the next commit.
Perfect, and thanks a lot for including the screenshot! There's just a bug in your test.
lasagne/tests/test_objectives.py (Outdated):
    abs_diff = abs(x - y)
    ift = 0.5 * abs_diff ** 2
    iff = delta * (abs_diff - delta / 2.)
    z = np.where(abdiff <= delta, ift, iff)
I'd merge this right away, only all your tests failed with NameError: global name 'abdiff' is not defined.
Sorry about that, I will fix it.
Looks good now, can you please squash everything into the first commit?
Addressed comments. Fixed Sphinx errors. Changed variable name.
For #814