
Why are the outputs of RethinkNet all close to zero? #1

Open
Tenyn opened this issue Mar 26, 2020 · 8 comments

Comments

@Tenyn

Tenyn commented Mar 26, 2020

I trained the net on the bibtex dataset.
The loss function is binary crossentropy.

Thank you.

@yangarbiter
Owner

Probably not all of them are close to zero?
Since the bibtex dataset has a lot of labels, it is possible that many of the labels are simply unrelated to the features, and thus have outputs close to zero.

@Tenyn
Author

Tenyn commented Mar 27, 2020

Thanks for your reply.

The outputs for all labels are below 0.1, so I wonder whether I built the model incorrectly.

The model is shown as follows:
```python
from keras.layers import Dense, Input, LSTM, RepeatVector
from keras.models import Model
from keras.optimizers import Nadam
from keras.regularizers import l2

def RethinkNet(input_shape, n_labels):
    # input_shape = (n_rethink_iterations, n_features)
    inputs = Input(shape=input_shape[1:])
    # Feed the same feature vector into every rethink iteration
    x = RepeatVector(input_shape[0])(inputs)

    x = Dense(128, kernel_regularizer=l2(0.0001), activation='relu')(x)

    x = LSTM(128, return_sequences=True,
             recurrent_regularizer=l2(0.0001),
             kernel_regularizer=l2(0.0001),
             recurrent_dropout=0.25,
             activation='sigmoid')(x)

    outputs = Dense(n_labels, kernel_regularizer=l2(0.0001),
                    activation='sigmoid')(x)

    model = Model(inputs=[inputs], outputs=[outputs])
    model.compile(loss='binary_crossentropy',
                  optimizer=Nadam(lr=0.001),
                  metrics=['accuracy'])
    return model
```
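For reference, a hypothetical instantiation would look like the following (assuming the Mulan version of bibtex, which has 1836 features and 159 labels; the 3 rethink iterations are an arbitrary choice):

```python
# Hypothetical call: bibtex (Mulan) has 1836 features and 159 labels;
# the first entry of input_shape is the number of rethink iterations.
model = RethinkNet(input_shape=(3, 1836), n_labels=159)
model.summary()
```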

@yangarbiter
Copy link
Owner

Have you trained the model, and has the loss converged?

@Tenyn
Author

Tenyn commented Mar 27, 2020

I trained it for 300 epochs, and the loss converged to 0.0746.

@yangarbiter
Owner

yangarbiter commented Mar 27, 2020

Which cost function are you training the model with?
If you trained RethinkNet with Hamming loss (without the reweighting), such a result is possible due to the nature of Hamming loss: when the number of positive labels is small, predicting everything as 0 already gives a low Hamming loss.
One thing to check is whether the model's Hamming loss is indeed small; you can also play with other cost functions, like the F1 score, as sketched below.
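A minimal sketch of such a check with scikit-learn, assuming `X_test` and `Y_test` hold the test features and binary label matrix (hypothetical names) and the model above, whose output contains one prediction per rethink iteration:

```python
from sklearn.metrics import hamming_loss, f1_score

# Model output shape: (n_samples, n_rethink_iterations, n_labels);
# evaluate the last rethink iteration, thresholded at 0.5.
probs = model.predict(X_test)
pred = (probs[:, -1, :] >= 0.5).astype(int)

print("Hamming loss:", hamming_loss(Y_test, pred))  # small even for an all-zero predictor
print("Micro F1:", f1_score(Y_test, pred, average='micro'))  # collapses to 0 for an all-zero predictor
```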

@Tenyn
Author

Tenyn commented Mar 27, 2020

The cost function is binary crossentropy, so I don't understand why none of the output labels are close to 1.

@yangarbiter
Owner

yangarbiter commented Mar 27, 2020

Training RethinkNet in fact uses a weighted binary crossentropy loss.

You can check the implementation here:
https://github.com/yangarbiter/multilabel-learn/blob/master/mlearn/models/rethinknet/rethinkNet.py

If you use binary crossentropy alone without the reweighting, then in a dataset where, say, 1 label is labeled 1 and the other 99 labels are labeled 0, the easiest solution for the model to learn is to predict every label as 0: that alone gives 99% accuracy on the labels.
That is why the weighting on the binary crossentropy is important.
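For illustration only, a fixed-weight variant of binary crossentropy in Keras could look like the sketch below. The actual RethinkNet implementation derives per-example weights from the chosen cost function (see the file linked above); `pos_weight` here is a hypothetical scalar.

```python
import keras.backend as K

def weighted_binary_crossentropy(pos_weight=10.0):
    """Up-weight positive labels so that predicting all zeros is no
    longer the cheapest solution. pos_weight is an illustrative scalar,
    not the reweighting scheme used by RethinkNet itself."""
    def loss(y_true, y_pred):
        # Clip to avoid log(0)
        y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
        bce = -(pos_weight * y_true * K.log(y_pred)
                + (1.0 - y_true) * K.log(1.0 - y_pred))
        return K.mean(bce, axis=-1)
    return loss

# Drop-in replacement in the compile step above:
# model.compile(loss=weighted_binary_crossentropy(10.0),
#               optimizer=Nadam(lr=0.001), metrics=['accuracy'])
```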

@Tenyn
Author

Tenyn commented Mar 27, 2020

Thanks.
I will try the weighted binary crossentropy loss.
