
Updated qlora.py to fix freezing of embedding layers #217

Merged 1 commit into artidoro:main on Jul 19, 2023

Conversation

ffohturk (Contributor)

Updated the way embeddings are frozen by changing the order of operations. In the original codebase, the model was loaded, LoRA-fied, and only then were the token embeddings resized to match the tokenizer. Resizing recreates the embedding layers, which resets their requires_grad flags to the default (True), so the embeddings end up trainable even though they should be frozen. The fix is simply a different order: first load the base model, then resize the token embeddings to match the tokenizer, then call prepare_model_for_kbit_training to freeze the model's original weights, then LoRA-fy, and then train.

Updated the way embeddings are frozen. First load the base model, then resize tokenizer, then do prepare_for_kbit_training to freeze the model's original weights, then LoRA-fy and then train.
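For readers following along, here is a minimal sketch of that corrected ordering using a generic transformers + peft setup; the base model name, quantization config, and LoRA hyperparameters below are illustrative placeholders, not the exact values used in qlora.py.

```python
# Sketch of the corrected order of operations (placeholder names/values).
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "huggyllama/llama-7b"  # placeholder base model

tokenizer = AutoTokenizer.from_pretrained(model_name)
# ... add any special tokens (e.g. a pad token) to the tokenizer here ...

# 1. Load the quantized base model.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)

# 2. Resize the token embeddings to match the (possibly extended) tokenizer
#    BEFORE freezing, since resizing recreates the embedding modules and
#    resets their requires_grad flags.
model.resize_token_embeddings(len(tokenizer))

# 3. Freeze the base model's original weights (including the freshly
#    resized embeddings).
model = prepare_model_for_kbit_training(model)

# 4. Only now wrap the model with LoRA adapters; the adapter weights are
#    the only trainable parameters left.
lora_config = LoraConfig(r=64, lora_alpha=16, lora_dropout=0.05,
                         task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)

# 5. Train as usual; the embedding layers stay frozen.
```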
@artidoro (Owner) left a comment

@ffohturk Thanks for helping streamline the code! This makes more sense and will make it easier for people to understand what is going on. I verified that the embeddings are indeed frozen before training.

artidoro merged commit 4a3e5dd into artidoro:main on Jul 19, 2023. 1 check passed.
LagPixelLOL pushed a commit to LagPixelLOL/qlora referencing this pull request on Feb 8, 2024: "Updated qlora.py to fix freezing of embedding layers"