
Could there be a bug in the FT implementation #173

Closed
drd13 opened this issue Feb 6, 2024 · 6 comments
Labels
question Further information is requested

Comments

@drd13

drd13 commented Feb 6, 2024

I've found what I think might be a bug in the implementation of the fine-tuning baseline. If this is indeed the case, the bug would yield incorrect results whenever the unlearning target is longer than one token.

Using the VSCode debugger, I found that the code in ft_main.py doesn't carry out backpropagation properly. The current version of the code passes the prompts without the targets to the model by calling model(**inputs). It then gathers the logits of all tokens in the target from the last token's logits. This maximises the probability of every token in the target immediately succeeding the prompt. The correct behaviour would be to maximise the probability of the first target token as a continuation of the prompt, then the probability of the second target token as a continuation of the first target token, and so on.

I think this issue might originate in the ROME repository, where the original code came from; I've opened an issue there but they haven't responded. Thanks for any assistance you may offer.
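
For concreteness, here is a minimal sketch of the two objectives. This is not the repository's actual code; gpt2, the example prompt, and the target string are placeholders chosen purely for illustration.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model, purely for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt, target = "The capital city of France is", " the beautiful Paris"
prompt_ids = tok(prompt, return_tensors="pt").input_ids   # (1, prompt_len)
target_ids = tok(target, return_tensors="pt").input_ids   # (1, target_len)

# Objective described above: only the prompt is fed to the model, and the
# probabilities of *all* target tokens are read off the final position's logits.
last_logits = model(prompt_ids).logits[0, -1]              # logits for the single next position
log_probs = F.log_softmax(last_logits, dim=-1)
buggy_loss = -log_probs[target_ids[0]].mean()              # every target token scored at the same position

# Standard autoregressive objective: feed prompt + target with teacher forcing,
# so each target token is predicted from the prompt and all earlier target tokens.
full_ids = torch.cat([prompt_ids, target_ids], dim=1)
labels = full_ids.clone()
labels[:, : prompt_ids.size(1)] = -100                     # ignore the loss on prompt positions
correct_loss = model(full_ids, labels=labels).loss

print(buggy_loss.item(), correct_loss.item())
```

For a one-token target the two losses coincide, which is why the problem only shows up when the target is longer than one token.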

@zxlzr zxlzr added the question Further information is requested label Feb 6, 2024
@pengzju
Collaborator

pengzju commented Feb 12, 2024

Thanks for your advice; we will modify the training paradigm of FT-L as soon as possible.

pengzju added a commit that referenced this issue Feb 12, 2024
@pengzju
Collaborator

pengzju commented Feb 12, 2024

I have updated the optimization objective of FT; you can refer to the latest version of the code.

@drd13
Author

drd13 commented Feb 12, 2024

Hello @pengzju. I'll mark the issue as closed (though I haven't double-checked the code). If you ever rerun the experiments in the survey paper with the fix, I would be interested in knowing how it changes the relative performance of fine-tuning.

Thank you very much for your rapid response and bug fix.

@drd13 drd13 closed this as completed Feb 12, 2024
@pengzju
Collaborator

pengzju commented Feb 13, 2024

Thank you for your suggestion.
In my actual testing, even with the optimization objective changed, FT-L still cannot achieve both Reliability and Locality at the same time: high reliability means that the weights of the model are completely damaged, while high locality cannot guarantee a high editing success rate. This is still consistent with the results in our paper.
😊

@drd13 drd13 mentioned this issue Feb 23, 2024
@zxlzr zxlzr reopened this Feb 23, 2024
@pengzju
Collaborator

pengzju commented Feb 25, 2024

Thank you for your suggestion. We now provide two implementations (the objective_optimization option in FT-L):

    1. prompt_last: the method from the original ROME paper (https://arxiv.org/abs/2202.05262), which computes the NLL loss on the last token of the input.
    2. target_new: the standard autoregressive method, using the cross-entropy loss over the target tokens.

You can choose the appropriate optimization objective based on your experimental settings. You are welcome to try it. 😊
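
A hypothetical usage sketch follows; the hyperparameter file path, prompt, subject, and target below are illustrative placeholders, and the snippet assumes EasyEdit's FTHyperParams/BaseEditor interface, with objective_optimization set to either prompt_last or target_new inside the FT hyperparameter YAML.

```python
# Hypothetical sketch: file path, prompt, subject, and target are illustrative,
# not taken from the repository's documentation.
from easyeditor import BaseEditor, FTHyperParams

# objective_optimization ("prompt_last" or "target_new") is chosen inside this YAML file
hparams = FTHyperParams.from_hparams("./hparams/FT/gpt2-xl.yaml")
editor = BaseEditor.from_hparams(hparams)

metrics, edited_model, _ = editor.edit(
    prompts=["The capital city of France is"],
    target_new=["Rome"],             # counterfactual target for the edit
    subject=["France"],              # subject of the edited fact
)
print(metrics)
```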

@pengzju pengzju closed this as completed Feb 25, 2024
@zxlzr
Contributor

zxlzr commented Mar 2, 2024

Thank you very much for raising the issue. We actually encountered this problem in early experiments last year, but to maintain consistency with the previous work ROME, we didn't address it at the time. As @pengzju mentioned, our current approach splits FT into two strategies:

  • prompt_last: the method from the original ROME paper (https://arxiv.org/abs/2202.05262), which computes the NLL loss on the last token of the input.

  • target_new: the standard autoregressive method, using the cross-entropy loss over the target tokens. To differentiate it and make it comparable, we refer to it as FT-M, which can achieve much better performance than FT-L.

We are planning to update the survey paper on arXiv with new experimental results soon, and we have already noted this issue in the README. Going forward, we hope everyone can use this FT-M technique as a strong knowledge editing baseline.

Best,

EasyEdit Team
