I've been looking through your project and was wondering: how do you handle adding new special tokens downstream, after pre-training? I see some support for HF tokenizers, but newly added special tokens would need to be accounted for by calling the `resize_token_embeddings()` method in HF. Is there an equivalent way to accomplish that here?
Hey! We currently don't have any way of handling that.
If it's a feature you'd like, feel free to start a PR.
It may be slightly more complicated than the HF method, as the embedding weights are distributed across machines in the model-parallel case, but it would involve resizing the weights here: