question about the rnnt loss arguments #18

Open
songtaoshi opened this issue May 13, 2021 · 4 comments

Comments

@songtaoshi

songtaoshi commented May 13, 2021

        log_probs (torch.FloatTensor): Input tensor with shape (N, T, U, V)
            where N is the minibatch size, T is the maximum number of
            input frames, U is the maximum number of output labels and V is
            the vocabulary of labels (including the blank).
        labels (torch.IntTensor): Tensor with shape (N, U-1) representing the
            reference labels for all samples in the minibatch.

Hi, I am confused about the labels: why should the shape be (N, U-1)?
Is <eos> not supposed to be included in the labels?
@1ytic

@songtaoshi
Author

Also, looking at the training code, the LM input ys is the same as the target ys.
Shouldn't the input be a start token followed by the text, and the output be the text followed by an end token?

@1ytic
Owner

1ytic commented May 17, 2021

If I remember correctly, U includes the "empty" output, much like the first element of the scoring matrix when you align two sequences, as in the Smith–Waterman algorithm: https://en.wikipedia.org/wiki/Smith–Waterman_algorithm
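
To make the "empty" output concrete, here is a minimal pure-Python sketch of the standard RNN-T forward recursion over a T x U lattice. It is an illustration only, not this repository's implementation: the function name `rnnt_log_likelihood`, the nested-list layout of `log_probs`, and `blank=0` are assumptions. Row u = 0 of the lattice is the state where no label has been emitted yet, which is why U = len(labels) + 1.

```python
import math


def logsumexp2(a, b):
    """Numerically stable log(exp(a) + exp(b)) for two floats."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def rnnt_log_likelihood(log_probs, labels, blank=0):
    """Forward pass for one sample (hypothetical helper, not the repo's API).

    log_probs: nested list of shape (T, U, V) holding log-probabilities.
    labels:    list of U - 1 label ids, with no blank and no <eos>.
    """
    T = len(log_probs)
    U = len(log_probs[0])
    # u = 0 is the "empty" output, so the lattice has one more row than labels.
    assert U == len(labels) + 1

    alpha = [[-math.inf] * U for _ in range(T)]
    alpha[0][0] = 0.0  # log(1): start with nothing emitted at the first frame
    for t in range(T):
        for u in range(U):
            if t > 0:
                # Emit blank at (t-1, u): advance one frame, same label count.
                alpha[t][u] = logsumexp2(
                    alpha[t][u], alpha[t - 1][u] + log_probs[t - 1][u][blank]
                )
            if u > 0:
                # Emit labels[u-1] at (t, u-1): same frame, one more label.
                alpha[t][u] = logsumexp2(
                    alpha[t][u],
                    alpha[t][u - 1] + log_probs[t][u - 1][labels[u - 1]],
                )
    # A final blank from the last cell terminates the alignment.
    return alpha[T - 1][U - 1] + log_probs[T - 1][U - 1][blank]


# Toy check with uniform probabilities: T = 2 frames, one label, V = 2.
T, labels, V = 2, [1], 2
U = len(labels) + 1
uniform = [[[math.log(1.0 / V)] * V for _ in range(U)] for _ in range(T)]
ll = rnnt_log_likelihood(uniform, labels)
print(ll, math.log(0.25))
```

In the toy case there are two alignments, each with three emissions of probability 1/2, so the total probability is 2 * (1/8) = 1/4. Termination is handled by the final blank, so the sketch needs no <eos> entry in `labels`.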

@zhaoyang9425

        log_probs (torch.FloatTensor): Input tensor with shape (N, T, U, V)
            where N is the minibatch size, T is the maximum number of
            input frames, U is the maximum number of output labels and V is
            the vocabulary of labels (including the blank).
        labels (torch.IntTensor): Tensor with shape (N, U-1) representing the
            reference labels for all samples in the minibatch.

Hi, I am confused about the labels: why should the shape be (N, U-1)?
Is <eos> not supposed to be included in the labels?
@1ytic

I have the same question. Did you figure it out? Why is the shape of labels (N, U-1)?

@NiHaoUCAS

I guess U = 1 + len(labels): one extra special token plus the labels. That token shouldn't be in the labels, but in the encoder logits.
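
The relation U = 1 + len(labels) can be sketched as a small batching helper. This is a hedged illustration under the usual RNN-T convention that the prediction network input is the label sequence with one extra token prepended. The helper name `build_rnnt_batch`, the choice of `blank=0` as that prepended token, and `pad=0` are assumptions, and the training code discussed in this issue may be organised differently.

```python
def build_rnnt_batch(label_seqs, blank=0, pad=0):
    """Pad labels to (N, U-1) and build predictor inputs of shape (N, U).

    Hypothetical helper for illustration; expects a non-empty batch.
    Returns (labels, predictor_inputs, label_lengths) as plain lists.
    """
    max_len = max(len(seq) for seq in label_seqs)  # this is U - 1
    labels, predictor_inputs, label_lengths = [], [], []
    for seq in label_seqs:
        padding = [pad] * (max_len - len(seq))
        # Loss targets: the raw labels only, no start token and no <eos>.
        labels.append(list(seq) + padding)
        # Predictor input: one extra leading token, so its length is U.
        predictor_inputs.append([blank] + list(seq) + padding)
        # True lengths let a loss ignore the padded positions.
        label_lengths.append(len(seq))
    return labels, predictor_inputs, label_lengths


labels, predictor_inputs, label_lengths = build_rnnt_batch([[3, 1, 4], [2, 5]])
U = len(predictor_inputs[0])
print(U, len(labels[0]), label_lengths)
```

Under this convention the input and the target carry the same label ids; the only difference is the single leading token on the predictor input, which is what makes the log_probs axis one longer than the labels.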
