
About XLNet Parameters #6

Open · woshierniu opened this issue Dec 12, 2021 · 8 comments

woshierniu commented Dec 12, 2021

Hello, thank you very much for sharing your code and ideas.
① I encountered a problem when running only XLNet (python train.py) for NER, as follows. Can you tell me why?

Traceback (most recent call last):
File "C:/Users/94312/Desktop/ner-combining-contextual-and-global-features-master/xlnet-ner/train.py", line 201, in
torch.save(model, f"{fname}.pt")
File "C:\Users\94312\Desktop\NER-pytorch-master\venv\lib\site-packages\torch\serialization.py", line 379, in save
_save(obj, opened_zipfile, pickle_module, pickle_protocol)
File "C:\Users\94312\Desktop\NER-pytorch-master\venv\lib\site-packages\torch\serialization.py", line 484, in _save
pickler.dump(obj)
TypeError: can't pickle torch._C.ScriptFunction objects
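
A common workaround for this error (a general PyTorch pattern, not necessarily the repo's intended fix) is to save the model's state_dict instead of pickling the whole model object, since objects holding TorchScript functions cannot be pickled:

    # Sketch: save only the learned weights instead of the full model object,
    # which avoids pickling torch._C.ScriptFunction members.
    torch.save(model.state_dict(), f"{fname}.pt")

    # To restore, rebuild the model first, then load the weights:
    # model = build_model(...)  # hypothetical constructor from train.py
    # model.load_state_dict(torch.load(f"{fname}.pt"))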

② And can you tell me the optimal parameters when running only XLNet? Are they the same as the defaults in xlnet-ner/train.py? Is ''finetuning == true''?

parser.add_argument("--batch_size", type=int, default=128)
parser.add_argument("--lr", type=float, default=0.0001)
parser.add_argument("--n_epochs", type=int, default=30)

Looking forward to your reply, thank you.

woshierniu changed the title from "About XLNet Run Error" to "About XLNet Parameters" on Dec 12, 2021
honghanhh (Owner) commented Dec 12, 2021

Hi @woshierniu, thank you for your interest in NER using XLNet in combination with GCNs.

① Does the issue occur with the combined version as well, or only with XLNet? It works fine on my side, so I will double-check.

② The optimal parameters I applied when implementing XLNet:

  • Batch size: 16
  • Learning rate: 1e-5
  • Dropout: 0.2

I hope that helps. Thanks!
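
For example, using the batch-size and learning-rate flags from the argparse snippet quoted above (the dropout value has no flag in that snippet, so setting it may require editing train.py; this invocation is a sketch, not verified against the repo):

    python train.py --batch_size 16 --lr 1e-5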

woshierniu (Author) commented

Hi @honghanhh, thank you very much for your reply. I have solved the first run error, and I will try the parameters you provided.

① Can you tell me whether ''finetuning == true'' when training only XLNet? And when I only want to use XLNet for NER, do I just need to run ''train.py''?
② The NER results mentioned in your paper use fuzzy (relaxed) matching, but the latest research generally reports exact matching. Did you record exact-matching NER results for XLNet combined with the GCN?

Looking forward to your reply, thank you!

honghanhh (Owner) commented

Hi @woshierniu, sorry for my late reply.

① Yes, you can use train.py to experiment on XLNet alone with my suggested optimal parameters.
② The final result comes from combining the optimal hyperparameters below:

XLNet:
  • Batch size: 16
  • Learning rate: 1e-5
  • Dropout: 0.2

GCN:
  • Batch size: 8
  • Learning rate: 3e-4

I hope that helps!

woshierniu (Author) commented

Thank you very much for your patient reply. I will try your optimal parameters; they will be very helpful to me!

woshierniu (Author) commented

Sorry, I have another question. Is XLNet used for fine-tuning, or does it just extract features as embeddings?

honghanhh (Owner) commented Dec 23, 2021

Hi @woshierniu,

  • For the proposed model, I used XLNet as contextual embeddings in combination with global embeddings from the GCN.
  • For the XLNet standalone, I used XLNet for fine-tuning with an additional Linear layer at the end.

Please check out my paper for details on the architecture and how each model works: https://arxiv.org/abs/2112.08033

Please feel free to contact me if you have any concerns. Happy Xmas!
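
A minimal sketch of that standalone setup, assuming the Hugging Face transformers XLNetModel (the repo's actual class and variable names may differ):

    import torch.nn as nn
    from transformers import XLNetModel

    class XLNetForNER(nn.Module):
        # Hypothetical illustration: fine-tuned XLNet with a linear tagging head.
        def __init__(self, num_labels, model_name="xlnet-base-cased"):
            super().__init__()
            # All XLNet weights remain trainable, i.e. fine-tuning rather than
            # frozen feature extraction.
            self.xlnet = XLNetModel.from_pretrained(model_name)
            self.dropout = nn.Dropout(0.2)  # dropout value suggested in this thread
            self.classifier = nn.Linear(self.xlnet.config.d_model, num_labels)

        def forward(self, input_ids, attention_mask=None):
            hidden = self.xlnet(input_ids, attention_mask=attention_mask).last_hidden_state
            return self.classifier(self.dropout(hidden))  # per-token label logits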

woshierniu (Author) commented Dec 23, 2021

Happy Xmas! @honghanhh
① Thank you for your timely reply. I read your paper, and you mentioned that you used relaxed matching. I still have some questions about this: does the latest NER research (BERT, Flair, etc.) evaluate with relaxed matching or with exact matching using evaluation tools?

② For the XLNet/BERT language models in the code, does the initial input not need pretrained word vectors such as GloVe?

honghanhh (Owner) commented

① Sorry for my late reply. We used relaxed matching to better compare with other SOTA papers that shared the same evaluation approach at the time we published our paper.
② XLNet uses SentencePiece to build its tokenizer. I believe you can customize it with GloVe, but the results may not be as good as the default.
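
For reference, a minimal sketch of that SentencePiece-based tokenization, assuming the Hugging Face transformers XLNetTokenizer rather than the repo's own code:

    # Sketch: XLNet tokenizes text into SentencePiece subword pieces and maps
    # them to integer ids, so no external word vectors such as GloVe are required.
    from transformers import XLNetTokenizer

    tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
    pieces = tokenizer.tokenize("Named entity recognition")
    ids = tokenizer.convert_tokens_to_ids(pieces)
    print(pieces)  # subword pieces, e.g. ['▁Named', '▁entity', '▁recognition']
    print(ids)     # integer ids that are fed to the model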
