
Support custom loss functions #17

Closed
norabelrose opened this issue Feb 2, 2023 · 4 comments · Fixed by #111
@norabelrose
Member

norabelrose commented Feb 2, 2023

We all want to change the CCS loss function in various ways, so we need a flexible way of defining and specifying loss functions.

We need some sort of API, maybe a class that can be inherited from, for defining entirely new loss functions programmatically and passing them into the CCS class. We also need a small library of predefined loss functions that can be accessed by name from the command line.
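
A minimal sketch of what that API could look like, assuming a PyTorch-style codebase; the names `CcsLoss`, `LOSS_REGISTRY`, and `ConsistencyLoss` are illustrative, not the repo's actual API:

```python
import torch
from torch import Tensor

# Hypothetical names throughout (CcsLoss, LOSS_REGISTRY, ConsistencyLoss):
# this is a sketch of the proposed API, not the repo's actual code.
LOSS_REGISTRY: dict[str, type["CcsLoss"]] = {}


class CcsLoss(torch.nn.Module):
    """Base class a user subclasses to define a new loss."""

    def __init_subclass__(cls, name: str | None = None, **kwargs):
        super().__init_subclass__(**kwargs)
        # Auto-register every subclass so it can be selected by name
        # from the command line, e.g. --loss consistency.
        LOSS_REGISTRY[name or cls.__name__.lower()] = cls


class ConsistencyLoss(CcsLoss, name="consistency"):
    def forward(self, p_pos: Tensor, p_neg: Tensor) -> Tensor:
        # CCS consistency term: P(x) and P(not x) should sum to 1.
        return ((p_pos + p_neg - 1.0) ** 2).mean()
```

The command-line layer would then only need a lookup like `LOSS_REGISTRY[args.loss]()` to instantiate whichever loss was named.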

The custom losses need to be able to specify what inputs they take. For example, a conjunction/disjunction consistency loss would need hidden states from N independent propositions, while prompt-invariance losses will take M different variants of the same proposition. We'll need some sort of data collation logic to piece together the prompts required by the given loss and then extract the hidden states from the model.
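
Building on the hypothetical `CcsLoss` sketch above, one way to express this is for each loss to carry an input spec that the collation code reads; `InputSpec`, `PromptInvarianceLoss`, and `collate` are again invented names for illustration:

```python
from dataclasses import dataclass

from torch import Tensor


# Again hypothetical: a loss declares how many prompts it consumes per
# example, and the collation step reads this to build its batches.
@dataclass(frozen=True)
class InputSpec:
    num_propositions: int = 1  # N independent statements (conjunction/disjunction)
    num_variants: int = 1      # M paraphrases of one statement (prompt invariance)


class PromptInvarianceLoss(CcsLoss, name="invariance"):
    input_spec = InputSpec(num_variants=4)

    def forward(self, hiddens: Tensor) -> Tensor:
        # hiddens: (batch, M, d); penalize variance across the M variants.
        return hiddens.var(dim=1).mean()


def collate(hiddens: Tensor, spec: InputSpec) -> Tensor:
    """Reshape flat per-prompt hidden states to (batch, N, M, d)."""
    n, m = spec.num_propositions, spec.num_variants
    return hiddens.reshape(-1, n, m, hiddens.shape[-1])
```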

This is a big task which should probably be split into multiple PRs.

@FabienRoger
Collaborator

FabienRoger commented Feb 2, 2023

Additionally, would you want to train on multiple datasets at once? (either with different losses, or at least with different numbers of variants)

@lauritowal
Collaborator

lauritowal commented Feb 2, 2023

Additionally, would you want to train on multiple datasets at once? (either with different losses, or at least with different numbers of variants)

You mean train one probe on multiple datasets?

@FabienRoger
Collaborator

Exactly. It can help with:

  • having a purely consistency-based probe (because when the number of classes varies, an easy way to be consistent is to be truthlike)
  • exploring whether you can find a direction which is truthlike according to many datasets/losses simultaneously.

@FabienRoger
Collaborator

But if it's too complicated, maybe start with something that works in simpler cases?

@norabelrose norabelrose added this to the PyPI 0.2 Release milestone Feb 15, 2023
@AlexTMallen AlexTMallen self-assigned this Mar 6, 2023
@AlexTMallen AlexTMallen linked a pull request Mar 7, 2023 that will close this issue