
nll_loss with weights: reduction 'mean' gives wrong result #31295

Closed
JoveIC opened this issue Dec 15, 2019 · 4 comments
Assignees
Labels
high priority triage review triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments


JoveIC commented Dec 15, 2019

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

import torch
import torch.nn.functional as F

torch.manual_seed(42)
torch.cuda.manual_seed(42)

  1. on cpu
>>> i = torch.randn(3, 5, requires_grad=True, device='cpu')
tensor([[ 0.3367,  0.1288,  0.2345,  0.2303, -1.1229],
        [-0.1863,  2.2082, -0.6380,  0.4617,  0.2674],
        [ 0.5349,  0.8094,  1.1103, -1.6898, -0.9890]], requires_grad=True)

>>> w = torch.randn(5, device='cpu')
tensor([ 0.9580,  1.3221,  0.8172, -0.7658, -0.7506])

>>> target = torch.tensor([1, 0, 4], device='cpu')
tensor([1, 0, 4])

>>> reduction = ['none', 'sum', 'mean']
>>> for r in reduction:
...        m = F.log_softmax(i, dim=1)
...        loss = F.nll_loss(m, target, w, reduction=r)

none :  tensor([ 2.0560,  2.6612, -2.2593], grad_fn=<NllLossBackward>)
sum :  tensor(2.4579, grad_fn=<NllLossBackward>)
mean :  tensor(1.6070, grad_fn=<NllLossBackward>)
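For reference, each element of the reduction='none' tensor above can be reproduced by hand as l_n = -w[target[n]] * m[n, target[n]]. A minimal sketch (assuming the same seed and inputs as the CPU example, without requires_grad):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(42)
i = torch.randn(3, 5)           # same seed as the report above
w = torch.randn(5)
target = torch.tensor([1, 0, 4])
m = F.log_softmax(i, dim=1)

# l_n = -w[target[n]] * m[n, target[n]]
manual = -w[target] * m[torch.arange(3), target]
loss_none = F.nll_loss(m, target, w, reduction='none')
assert torch.allclose(manual, loss_none)
```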
  2. on gpu
>>> i = torch.randn(3, 5, requires_grad=True, device='cuda:0')
tensor([[ 0.1940,  2.1614, -0.1721,  0.8491, -1.9244],
        [ 0.6530, -0.6494, -0.8175,  0.5280, -1.2753],
        [-1.6621, -0.3033, -0.0926,  0.1992, -1.1204]], device='cuda:0',
       requires_grad=True)

>>> w = torch.randn(5, device='cuda:0')
tensor([ 0.1391, -0.1082, -0.7174,  0.7566,  0.3715], device='cuda:0')

>>> target = torch.tensor([1, 0, 4], device='cuda:0')
tensor([1, 0, 4], device='cuda:0')

>>> reduction = ['none', 'sum', 'mean']
>>> for r in reduction:
...        m = F.log_softmax(i, dim=1)
...        loss = F.nll_loss(m, target, w, reduction=r)
...        print(r, ': ', loss)

none :  tensor([-0.0455,  0.1291,  0.8693], device='cuda:0', grad_fn=<NllLossBackward>)
sum :  tensor(0.9530, device='cuda:0', grad_fn=<NllLossBackward>)
mean :  tensor(2.3681, device='cuda:0', grad_fn=<NllLossBackward>)

Expected behavior

>>> loss = F.nll_loss(m, target, w, reduction='none')
>>> loss.mean()

mean : tensor(0.8193, grad_fn=<MeanBackward0>)

Environment

PyTorch version: 1.3.1
Is debug build: No
CUDA used to build PyTorch: 10.1.243

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: version 3.0.2

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration:
GPU 0: TITAN X (Pascal)
GPU 1: TITAN X (Pascal)

Nvidia driver version: 430.50
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5

Versions of relevant libraries:
[pip3] numpy==1.13.3
[conda] blas 1.0 mkl
[conda] mkl 2019.4 243
[conda] mkl-service 2.3.0 py36he904b0f_0
[conda] mkl_fft 1.0.14 py36ha843d7b_0
[conda] mkl_random 1.1.0 py36hd6b4f25_0
[conda] pytorch 1.3.1 py3.6_cuda10.1.243_cudnn7.6.3_0 pytorch
[conda] torchvision 0.4.2 py36_cu101 pytorch

cc @ezyang @gchanan @zou3519

@JoveIC JoveIC changed the title Cross entropy loss with weights: reduction 'mean' gives wrong result nll_loss with weights: reduction 'mean' gives wrong result Dec 15, 2019
@zou3519 zou3519 added high priority module: operators triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module labels Dec 16, 2019
@zou3519
Contributor

zou3519 commented Dec 16, 2019

Seems bad if there is a correctness issue. I haven't checked the code, so we should first verify that the behavior is indeed correct. nll_loss is an important loss function.

@anjali411 anjali411 self-assigned this Dec 17, 2019
@gchanan
Contributor

gchanan commented Dec 17, 2019

This isn't actually a bug -- @anjali411 is going to post why.

@anjali411
Contributor

This is not a bug. When reduction='mean', the loss is calculated by the formula given in the documentation: https://pytorch.org/docs/stable/nn.html#nllloss
According to the formula:
For CPU example:
when reduction = 'mean', loss = (Σ_n l_n) / (w_1 + w_0 + w_4) for n = 1 to 3, where the l_n are the elements of the loss tensor for reduction = 'none' and w_1, w_0, w_4 are the weights selected by the targets [1, 0, 4]
Thus loss = (2.0560 + 2.6612 + (-2.2593)) / (0.9580 + 1.3221 + (-0.7506)) = 1.6070

For GPU example:

when reduction = 'mean', loss = (Σ_n l_n) / (w_1 + w_0 + w_4) for n = 1 to 3
= (-0.0455 + 0.1291 + 0.8693) / (0.1391 + (-0.1082) + 0.3715) = 2.3681

Clarification for the documentation: y_n is the nth element of the target tensor.
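The formula above can be checked directly: with reduction='mean', nll_loss divides the weighted sum by the sum of the weights selected by the targets, not by the batch size N. A minimal sketch (values are arbitrary):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(42)
i = torch.randn(3, 5)
w = torch.randn(5)
target = torch.tensor([1, 0, 4])
m = F.log_softmax(i, dim=1)

loss_none = F.nll_loss(m, target, w, reduction='none')
loss_mean = F.nll_loss(m, target, w, reduction='mean')

# 'mean' divides by the sum of the selected weights w[target],
# not by the batch size N = 3:
assert torch.allclose(loss_mean, loss_none.sum() / w[target].sum())
```

By contrast, loss_none.mean() divides by N, which is why the two numbers in the report differ.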

@JoveIC
Author

JoveIC commented Dec 18, 2019

@anjali411 Thank you for the explanation

facebook-github-bot pushed a commit that referenced this issue Dec 19, 2019
…#31488)

Summary:
Reference: #31385

In the current documentation for NLLLoss, it's unclear what `y` refers to in the math section of the loss description. An issue (#31295) was filed earlier expressing confusion about whether the loss returned for reduction='mean' is correct, perhaps because of the lack of clarity in the formula's symbol descriptions in the current documentation.
Pull Request resolved: #31488

Differential Revision: D19181391

Pulled By: anjali411

fbshipit-source-id: 8b75f97aef93c92c26ecbce55b3faf2cd01d3e74
wuhuikx pushed a commit to wuhuikx/pytorch that referenced this issue Jan 30, 2020