
question about LiLT-base #21

Closed
ZHEGG opened this issue Sep 23, 2022 · 4 comments

Comments

ZHEGG commented Sep 23, 2022

Thanks for this amazing code!
I have a question about LiLT-base: does it mean that the text stream is not combined with any pre-trained language model, and is instead trained from scratch together with the layout stream?

jpWang (Owner) commented Sep 24, 2022

Hi,
the provided LiLT-base checkpoint is trained as described in the original paper. It should be combined with an off-the-shelf plain-text pre-trained model for fine-tuning.
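As a minimal sketch of what "combined with an off-the-shelf plain-text model" means here: the layout-stream weights come from the LiLT-only checkpoint and the text-stream weights from a plain-text model such as roberta-base, merged into one state dict. The file names and the `roberta.` key prefix below are illustrative assumptions only; the repository provides its own script for this step, which should be preferred.

```python
import torch
from transformers import AutoModel

# Layout-stream (and LiLT-specific) weights from the LiLT-only checkpoint.
# Path is illustrative.
lilt_state = torch.load("lilt-only-base/pytorch_model.bin", map_location="cpu")

# Off-the-shelf plain-text model supplying the text-stream weights.
text_state = AutoModel.from_pretrained("roberta-base").state_dict()

merged = dict(lilt_state)
for key, value in text_state.items():
    # Assumed key prefix for the text stream in the combined checkpoint;
    # the actual naming in the repo's script may differ.
    merged[f"roberta.{key}"] = value

torch.save(merged, "lilt-roberta-en-base.bin")
```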

ZHEGG (Author) commented Sep 24, 2022

I went over the paper again. So LiLT-base initializes its text flow from the existing pre-trained English RoBERTa-BASE, and when fine-tuning it needs to load RoBERTa-BASE again, as the paper says to "combine LiLT-BASE with a new pre-trained RoBERTa-BASE for fine-tuning".
If I am right, why reload RoBERTa-BASE again when fine-tuning in English? It seems a little redundant.

jpWang (Owner) commented Sep 25, 2022

It's a nice question. As you said, the pre-trained LiLT can be applied directly in English without reloading the text-part weights, and it can even get a better result that way. However, we reload the English RoBERTa-BASE for a consistent comparison across different languages.
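A hedged sketch of the two initialization options being contrasted here, reusing the illustrative paths and `roberta.` key prefix from the sketch above (both are assumptions, not the repo's actual script):

```python
import torch
from transformers import AutoModel

state = torch.load("lilt-roberta-en-base.bin", map_location="cpu")

REINIT_TEXT_STREAM = True  # paper's setting, for cross-lingual consistency
if REINIT_TEXT_STREAM:
    # Option A: overwrite the text-stream weights with a fresh RoBERTa-BASE,
    # matching the paper's protocol across languages.
    fresh = AutoModel.from_pretrained("roberta-base").state_dict()
    for key, value in fresh.items():
        state[f"roberta.{key}"] = value  # assumed key prefix, as above
# Option B: skip the reload and keep the text-stream weights LiLT was
# pre-trained with (English only; per the author, this can work even better).

torch.save(state, "lilt-finetune-init.bin")
```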

ZHEGG (Author) commented Sep 25, 2022

OK, I see. Thanks for your reply!

ZHEGG closed this as completed Sep 25, 2022