
question about LiLT-base #21

Closed
ZHEGG opened this issue Sep 23, 2022 · 4 comments

Comments

ZHEGG commented Sep 23, 2022

Thanks for this amazing code!
I have a question about LiLT-base: does it mean that the text stream is not combined with any pre-trained language model, and is instead trained from scratch together with the layout stream?

jpWang (Owner) commented Sep 24, 2022

Hi,
the provided LiLT-base checkpoint is trained as described in the original paper. It should be combined with an off-the-shelf plain-text pre-trained model for fine-tuning.
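As a minimal sketch of what "combined with an off-the-shelf plain-text model" means here: the layout-stream weights come from the LiLT-only checkpoint and the text-stream weights from a plain-text model such as roberta-base, merged into one state dict. The file names and the `roberta.` key prefix below are illustrative assumptions only; the repository provides its own script for this step, which should be preferred.

```python
import torch
from transformers import AutoModel

# Layout-stream (and LiLT-specific) weights from the LiLT-only checkpoint.
# Path is illustrative.
lilt_state = torch.load("lilt-only-base/pytorch_model.bin", map_location="cpu")

# Off-the-shelf plain-text model supplying the text-stream weights.
text_state = AutoModel.from_pretrained("roberta-base").state_dict()

merged = dict(lilt_state)
for key, value in text_state.items():
    # Assumed key prefix for the text stream in the combined checkpoint;
    # the actual naming in the repo's script may differ.
    merged[f"roberta.{key}"] = value

torch.save(merged, "lilt-roberta-en-base.bin")
```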

ZHEGG (Author) commented Sep 24, 2022

I went over the paper again. So LiLT-base initializes its text flow from the existing pre-trained English RoBERTa-BASE, and when fine-tuning it needs to load RoBERTa-BASE again, as the paper says to "combine LiLT-BASE with a new pre-trained RoBERTa-BASE for fine-tuning".
If I am right, why reload RoBERTa-BASE again when fine-tuning in English? It seems a little redundant.

jpWang (Owner) commented Sep 25, 2022

It's a nice question. As you said, the pre-trained LiLT can be applied directly in English without reloading the text-part weights, and it can even get a better result that way. However, we reload the English RoBERTa-BASE for a consistent comparison across different languages.
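A hedged sketch of the two initialization options being contrasted here, reusing the illustrative paths and `roberta.` key prefix from the sketch above (both are assumptions, not the repo's actual script):

```python
import torch
from transformers import AutoModel

state = torch.load("lilt-roberta-en-base.bin", map_location="cpu")

REINIT_TEXT_STREAM = True  # paper's setting, for cross-lingual consistency
if REINIT_TEXT_STREAM:
    # Option A: overwrite the text-stream weights with a fresh RoBERTa-BASE,
    # matching the paper's protocol across languages.
    fresh = AutoModel.from_pretrained("roberta-base").state_dict()
    for key, value in fresh.items():
        state[f"roberta.{key}"] = value  # assumed key prefix, as above
# Option B: skip the reload and keep the text-stream weights LiLT was
# pre-trained with (English only; per the author, this can work even better).

torch.save(state, "lilt-finetune-init.bin")
```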

ZHEGG (Author) commented Sep 25, 2022

OK, I see. Thanks for your reply!

ZHEGG closed this as completed Sep 25, 2022