{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":570384908,"defaultBranch":"main","name":"peft","ownerLogin":"huggingface","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2022-11-25T03:51:09.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/25720743?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1716887746.0","currentOid":""},"activityList":{"items":[{"before":"cb0bf077744d11524ec6f68d920f4cfe4ef3e8f3","after":"a0788a3f92c8220f68d2185aeef0266d6b725bfe","ref":"refs/heads/main","pushedAt":"2024-05-31T14:56:21.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"BenjaminBossan","name":"Benjamin Bossan","path":"/BenjaminBossan","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6229650?s=80&v=4"},"commit":{"message":"Refactor to make DoRA and QDoRA work with FSDP (#1806)\n\nThis PR moves all the DoRA functionality into a separate module class.\r\nEssentially, this is necessary because otherwise, the DoRA parameter\r\nlives on the lora.Linear layer as a parameter, not a separate module.\r\nSince FSDP auto wrap policy operates on the level of modules, not\r\nparameters, there is no way to modify the auto wrap policy to wrap the\r\nDoRA parameter, it must be its own module.\r\n\r\nIf not for this reason, #1797 would be preferable, since the number of\r\ncode changes is smaller overall. In this PR, there are more numerous\r\nchanges, but the majority only involves moving code around, not any\r\nactual code changes.\r\n\r\nSince we introduce a new submodule, an extra steps are required to\r\nensure that old DoRA state dicts can still be loaded correctly. This\r\ninvolves a fairly trivial extra remapping step in\r\nset_peft_model_state_dict. The test for this is performed via the new\r\nregression DoRA tests introduced in #1792.\r\n\r\nSimilarly, there is a remapping step involved in\r\nget_peft_model_state_dict to ensure that when new state dicts with DoRA\r\nare saved, they still conform to the old format.\r\n\r\nAn additional required change was to make a defensive copy of the base\r\nlayer before dequantizing its weight in order to calculate the weight\r\nnorm for DoRA. Without this defensive copy, some side-effect is\r\ntriggered in FSDP that results in\r\n\r\n> ValueError: Cannot flatten integer dtype tensors\r\n\r\neven though the compute dtype of bnb is correctly set to float.\r\n\r\nCreating a fully functioning deepcopy does currently not work with 8bit\r\nBNB but there is a fix. Once the next BNB release is out, 8bit BNB will\r\nbe tested and enabled.\r\n\r\nWhile working on this, I also noticed a small bug that dropout was not\r\ncorrectly applied when using QDoRA. 
2024-05-30 · main · pushed by Benjamin Bossan (PR merge)
MNT Remove deprecated use of load_in_8bit (#1811)

Don't pass load_in_8bit to AutoModel.from_pretrained; instead, use BitsAndBytesConfig. There was already a PR to clean this up (#1552), but a slightly later PR (#1518) re-added this usage.

Co-authored-by: Younes Belkada
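For reference, a sketch of the non-deprecated pattern this commit switches to; the model id is only a placeholder, and 8-bit loading still requires a CUDA device with bitsandbytes installed.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",              # placeholder model id
    quantization_config=bnb_config,   # instead of the deprecated load_in_8bit=True kwarg
    torch_dtype=torch.float16,
    device_map="auto",
)
```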
2024-05-30 · main · pushed by Benjamin Bossan (PR merge)
CI Make torch compile tests run on GPU (#1808)

Many of these tests require a GPU to run, so use custom runners. The code was mostly copied from existing workflows.

Co-authored-by: Younes Belkada

2024-05-28 · main · pushed by Benjamin Bossan (PR merge)
TST: Add simple BNB regression tests (#1602)

These are very basic and simplistic regression tests for bnb. Their purpose is to ensure that there is no unnoticed change in bnb that leads to different outputs. There is no check for "correctness", just that the results haven't changed. Eventually, this workflow should be improved and moved to the bnb repo.

Co-authored-by: Younes Belkada

2024-05-28 · branch fix-lora-merge-docs deleted by Benjamin Bossan

2024-05-28 · main · pushed by Younes Belkada (PR merge)
Docs / LoRA: Add more information on merge_and_unload docs (#1805)

Puts the LoRA merging diagram back and updates docs/source/developer_guides/lora.md.

Co-authored-by: Benjamin Bossan

2024-05-28 · branch fix-lora-merge-docs · pushed by Younes Belkada
Update docs/source/developer_guides/lora.md (co-authored with Benjamin Bossan)

2024-05-28 · branch fix-lora-merge-docs · pushed by Younes Belkada
push

2024-05-28 · branch fix-lora-merge-docs created by Younes Belkada
put back lora merging diagram

2024-05-27 · main · pushed by Benjamin Bossan (PR merge)
TST Add regression test for DoRA, VeRA, BOFT, LNT (#1792)

These new methods were added, but the regression tests had not been extended yet. This PR adds regression tests for them. The regression artifacts have been pushed based on PEFT v0.11.1. The new tests pass locally.
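The two regression-test entries above (#1602 and #1792) follow the same basic idea. A generic sketch of that idea, not the actual PEFT test code (check_regression is a made-up helper): compare current outputs against an artifact recorded with an earlier release.

```python
import torch


def check_regression(model, inputs, artifact_path, atol=1e-4, rtol=1e-4):
    """Compare current model outputs against outputs recorded with an earlier release."""
    with torch.inference_mode():
        current = model(**inputs).logits.float().cpu()
    expected = torch.load(artifact_path)  # tensor saved when the artifact was generated
    # No notion of "correctness" here, only that the numbers did not drift.
    torch.testing.assert_close(current, expected, atol=atol, rtol=rtol)
```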
2024-05-27 · main · pushed by Benjamin Bossan (PR merge)
FIX BOFT device error after PR 1742 (#1799)

PR #1742 introduced the feature that adapters of the same layer can be on different devices. A new method was introduced that is responsible for moving the parameters related to a specific adapter in a consistent way.

In BOFT, however, one parameter was overlooked: boft_P. This parameter is not stored inside a ParameterDict or ModuleDict, hence it was not moved. The reason is (presumably) that this parameter is shared between all BOFT adapters, as it is always identical. However, this clashes with having different adapters on different devices.

To solve this, the parameter is now moved on the fly to the correct device during the forward pass.

2024-05-23 · main · pushed by Benjamin Bossan (PR merge)
TST Install bitsandbytes for compile tests (#1796)

Also removes an outdated comment.

2024-05-23 · main · pushed by Benjamin Bossan (PR merge)
FIX Allow same layer adapters on different devices (#1742)

So far, PEFT made the assumption that all adapter weights of a specific layer are on the same device. There can be cases where it is useful to have adapters on different devices. For example, when a user loads a lot of LoRA adapters and wants to offload those not currently in use to CPU, they would not be able to do so.

With this PR, we add this possibility. To achieve this, when we update an adapter layer with a new adapter, we only move that specific adapter to the device of the base layer, without touching the other loaded adapters.

While working on this, I discovered a small bug in VeRA when adding multiple adapters, which is now also fixed.
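A hypothetical illustration of the device rule described in the entry above for #1742 (this is not PEFT's implementation; move_new_adapter and the .base_layer attribute access are assumptions): only the submodules belonging to the newly added adapter follow the base layer's device, while previously loaded adapters stay where they are.

```python
import torch.nn as nn


def move_new_adapter(layer: nn.Module, adapter_name: str) -> None:
    """Move only the submodules belonging to `adapter_name` to the base layer's device."""
    device = layer.base_layer.weight.device
    for name, module in layer.named_modules():
        # adapter submodules live in ModuleDicts keyed by adapter name, e.g. "lora_A.other"
        if name.split(".")[-1] == adapter_name:
            module.to(device)
```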
2024-05-23 · main · pushed by Sourab Mangrulkar (PR merge)
fix docs (#1793)

2024-05-22 · branch smangrul/doc-fix created by Sourab Mangrulkar
fix docs

2024-05-22 · main · pushed by Benjamin Bossan (PR merge)
FIX Use correct attribute name for HQQ in merge (#1791)

Without this fix, test_hqq_lora_model_outputs currently fails.

2024-05-22 · main · pushed by Benjamin Bossan (PR merge)
DOC TST Reproducibility of models using batch norm (#1734)

Fixes #1732.

After loading a model that was trained with PEFT on a base model with some kind of batch norm layer, the loaded model should produce the same output. Right now, this does not happen.

The reason is that during training, buffers for the running mean etc. are updated, but they are not saved when calling save_pretrained on the PeftModel instance. Normally in PEFT, we assume that the base model parameters are kept constant during training, which is not the case with batch norm. We only save the PEFT parameters and assume that when the user loads the base model, all parameters are restored exactly. That way, the information in the buffers is lost completely.

The fix is to add the batch norm layers to modules_to_save. This fix is now documented and tested.
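A small sketch of the documented workaround from the entry above, using a toy model (TinyConvNet and its module names are made up): listing the batch-norm layer in modules_to_save makes PEFT store a copy of it alongside the adapter, including its running-statistics buffers.

```python
import torch.nn as nn
from peft import LoraConfig, get_peft_model


class TinyConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3)
        self.bn = nn.BatchNorm2d(8)
        self.head = nn.Linear(8, 2)

    def forward(self, x):
        x = self.bn(self.conv(x)).mean(dim=(2, 3))
        return self.head(x)


config = LoraConfig(
    target_modules=["conv", "head"],
    modules_to_save=["bn"],  # batch norm must be saved too: its buffers change during training
)
peft_model = get_peft_model(TinyConvNet(), config)
```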
2024-05-21 · main · pushed by Benjamin Bossan (PR merge)
FIX Warning abt config.json when the base model is local. (#1668)

Fix an incorrect warning when loading a local model.

2024-05-21 · main · pushed by Benjamin Bossan (PR merge)
ENH Layer/model status shows devices now (#1743)

For each adapter, show all the devices of this adapter's parameters. Also, while working on this, found a very minor bug in VeRA: its linear layer didn't implement its own __repr__.

2024-05-17 · main · pushed by Sourab Mangrulkar (PR merge)
Add add_weighted_adapter to IA3 adapters (#1701)

Adds add_weighted_adapter to IA3 adapters, refactors to simplify the code and the tests, adds IA3 merging docs, and updates docs/source/developer_guides/model_merging.md.

Co-authored-by: Benjamin Bossan

2024-05-17 · main · pushed by Benjamin Bossan (PR merge)
TST: torch compile tests (#1725)

Right now, we don't have specific tests for torch.compile. Instead, we have a "hack" that allows running all tests with torch.compile if we set the environment variable PEFT_DEBUG_WITH_TORCH_COMPILE=1.

This is not very practical because it takes a lot of time to run all these tests with compilation enabled. Also, currently hundreds of tests are failing, which makes it impossible to understand more closely what does or does not work.

This PR removes the aforementioned "hack" and instead replaces it with a list of explicit torch.compile tests. Currently, these tests cover training/inference with a bunch of different tuner types, as well as more advanced features with LoRA (e.g. quantization, multiple adapters, etc.).

Some of these tests pass and some of them fail. This is documented now, so that users can quickly look up whether their use case would be compatible with torch.compile. This is very useful to have, because sometimes torch.compile may appear to work but actually returns the wrong result. For users, it's not immediately obvious when this happens.

The test suite is not exhaustive; there are many combinations of features that could be added. However, it should be a good starting point and can be extended later.

The test suite does not cover whether torch.compile actually accelerates the code. This may not be the case even if it works correctly (e.g. because of graph breaks). Testing this would require bigger models and more data, which is prohibitively slow to test.

Co-authored-by: Younes Belkada, Steven Liu
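Illustrative only, with a toy model rather than anything from the new test matrix: the pattern those tests exercise is compiling a PEFT-wrapped model and checking that the outputs match the uncompiled model, i.e. correctness rather than speed.

```python
import torch
import torch.nn as nn
from peft import LoraConfig, get_peft_model


class TinyMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin0 = nn.Linear(16, 32)
        self.lin1 = nn.Linear(32, 4)

    def forward(self, x):
        return self.lin1(torch.relu(self.lin0(x)))


peft_model = get_peft_model(TinyMLP(), LoraConfig(target_modules=["lin0", "lin1"]))
compiled = torch.compile(peft_model)

x = torch.randn(2, 16)
# Same numbers whether compiled or not; whether this holds for a given tuner/feature
# combination is exactly what the new tests document.
torch.testing.assert_close(compiled(x), peft_model(x))
```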
2024-05-17 · main · pushed by Benjamin Bossan (PR merge)
Bump version to 0.11.2.dev0 (#1741)

After the patch release of 0.11.1.

2024-05-17 · branch patch-release-0.11.1 created by Benjamin Bossan
Release v0.11.1

2024-05-17 · main · pushed by Benjamin Bossan (PR merge)
FIX BOFT setting env vars breaks C++ compilation (#1739)

Resolves #1738.

2024-05-16 · main · pushed by Benjamin Bossan (PR merge)
Autocast adapter weights if fp16/bf16 (#1706)

As discussed internally, we want to automatically cast the weights of the adapter to float32 when using float16. Float16 is not conducive to stable training and raises errors when used with AMP.

Previously, we had to recommend that users manually cast the weights if they loaded the base model in float16, because PEFT would choose the same dtype for the adapter as for the base weights. Forgetting this is a common source of errors, so we choose to automate this.

If this causes trouble, users can prevent the behavior by passing autocast_adapter_dtype=False to get_peft_model, PeftModel.from_pretrained, or PeftModel.load_adapter.

This PR should be reviewed carefully, as it has the potential to break existing code if something important was missed. We also need to add a note about this change in behavior to the upcoming release text.
2024-05-16 · main · pushed by Benjamin Bossan (PR merge)
ENH Save and load base model with revision (#1658)

2024-05-16 · main · pushed by Benjamin Bossan (PR merge)
Bump version to 0.11.1.dev0 (#1736)

2024-05-16 · main · pushed by Benjamin Bossan (PR merge)
Release: v0.11.0 (#1733)

2024-05-15 · main · pushed by Benjamin Bossan (PR merge)
Add PiSSA as an initialization method of LoRA (#1626)

Implements https://huggingface.co/papers/2404.02948.
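A sketch of how the new initialization is selected; the exact init_lora_weights strings are taken from the PEFT documentation for this feature rather than from the commit message itself, so treat them as an assumption.

```python
from peft import LoraConfig

config = LoraConfig(
    init_lora_weights="pissa",            # full-SVD PiSSA initialization
    # init_lora_weights="pissa_niter_4",  # faster variant using a few SVD iterations
    target_modules=["q_proj", "v_proj"],
)
```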
2024-05-14 · main · pushed by Benjamin Bossan (PR merge)
FIX Allow DoRA init on CPU when using BNB (#1724)

Resolves #1674.

For some users, it is necessary to initialize the model on CPU, even when using bitsandbytes, which eventually requires a GPU. Since DoRA requires dequantizing the BNB weights at initialization, we need to temporarily move the corresponding weights to the GPU. After dequantization, the weights are moved back to CPU.
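A generic illustration of the pattern described in the last entry (not PEFT's actual code; dequantize_fn is a stand-in for whichever bitsandbytes dequantization routine applies): move the quantized weight to the GPU just long enough to dequantize it, then return everything to CPU.

```python
import torch


def dequantize_via_gpu(quantized_module: torch.nn.Module, dequantize_fn):
    """Temporarily move a quantized module to CUDA, dequantize, then move back to CPU."""
    quantized_module.to("cuda")                  # bnb kernels need a CUDA device
    weight_fp = dequantize_fn(quantized_module)  # e.g. a bnb dequantization helper
    quantized_module.to("cpu")
    return weight_fp.to("cpu")
```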