Add option to save probabilities #289

AlexTMallen · 2023-08-24T06:48:33Z

Modifies prepare_data to return a LayerData object.
Adds an option to save the raw probabilities, alongside the texts, to the results dir.
Adds train_lr_eval.csv to the results dir.

lauritowal · 2023-08-28T19:02:55Z

elk/evaluation/evaluate.py

@@ -31,7 +31,7 @@ def execute(self, highlight_color: Color = "cyan"):
 @torch.inference_mode()
 def apply_to_layer(
 self, layer: int, devices: list[str], world_size: int
- ) -> dict[str, pd.DataFrame]:
+ ) -> tuple[dict[str, pd.DataFrame], dict]:


Instead of returning tuple[dict[str, pd.DataFrame], dict], maybe you could do something similar to what is done here: https://github.com/EleutherAI/elk/pull/259/files#diff-d13b83b80dc8fe2ae73e22669dd7a1a3167a1ae731d341fa96f03a766d877933R37
🟢

But we can also leave it for now. Once we merge our pull request, it will be changed anyway.
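
For illustration, one way to avoid a bare tuple return is a small named container. This is only a sketch: the linked change is not reproduced here, and `LayerOutput`, its field names, and `apply_to_layer_sketch` are hypothetical names that do not come from the elk codebase.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class LayerOutput:
    """Hypothetical container for what apply_to_layer returns."""

    # Evaluation tables keyed by name, e.g. "eval" or "train_lr_eval".
    # In the PR these are pandas DataFrames; typed as Any to keep the
    # sketch free of third-party imports.
    dfs: dict[str, Any]
    # Extra per-layer payload, e.g. saved log-probabilities and texts.
    extra: dict[str, Any] = field(default_factory=dict)


def apply_to_layer_sketch(layer: int) -> LayerOutput:
    # Placeholder body: a real implementation would train and evaluate here.
    rows = [{"layer": layer, "acc": None}]
    return LayerOutput(dfs={"eval": rows}, extra={"layer": layer})


out = apply_to_layer_sketch(3)
print(out.dfs["eval"], out.extra)
```

Callers then read `out.dfs` and `out.extra` by name instead of unpacking positions, so adding a third field later does not break existing call sites.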

lauritowal · 2023-08-28T19:12:12Z

elk/training/train.py

+ {
+ **meta,
+ "ensembling": mode,
+ "inlp_iter": i,


What is inlp here?

AlexTMallen · 2023-08-29

It is the iteration index of iterative nullspace projection (INLP) for the logistic regression model.
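
As a rough sketch of what such an iteration index counts: INLP repeatedly fits a linear probe, then projects the data onto the nullspace of the probe direction, so the next probe has to find a different direction. The code below is not the elk implementation. It assumes a difference-of-class-means direction as a stand-in for the trained logistic regression, and all function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_out(x, w):
    """Remove the component of x along w (project onto the nullspace of w)."""
    scale = dot(x, w) / dot(w, w)
    return [xi - scale * wi for xi, wi in zip(x, w)]


def mean_diff_direction(xs, ys):
    """Stand-in for a trained probe: difference of the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    dim = len(xs[0])
    mu_pos = [sum(x[d] for x in pos) / len(pos) for d in range(dim)]
    mu_neg = [sum(x[d] for x in neg) / len(neg) for d in range(dim)]
    return [p - n for p, n in zip(mu_pos, mu_neg)]


def inlp(xs, ys, num_iters):
    """Return the directions found, one per iteration, and the projected data."""
    directions = []
    for i in range(num_iters):  # i plays the role of "inlp_iter"
        w = mean_diff_direction(xs, ys)
        if dot(w, w) < 1e-12:  # nothing linearly separable is left
            break
        directions.append(w)
        xs = [project_out(x, w) for x in xs]
    return directions, xs


xs = [[1.0, 0.5], [2.0, 0.0], [-1.0, 0.5], [-2.0, 0.0]]
ys = [1, 1, 0, 0]
directions, projected = inlp(xs, ys, num_iters=3)
print(len(directions), projected)
```

In this toy data the classes differ only along the first axis, so the loop stops after one direction has been removed.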

lauritowal · 2023-08-28T19:18:07Z

elk/training/train.py

@@ -11,7 +11,7 @@
 from simple_parsing import subgroups


The function apply_to_layer was already a bit long. I think we should refactor it in a second pull request.

lauritowal · 2023-08-28T19:35:30Z

elk/training/train.py

+ get_logprobs(val_lr_credences, mode).detach().cpu()
+ )
+
+ row_bufs["train_lr_eval"].append(


The names are getting a bit confusing. Maybe we should have a subfolder "evals" containing:

trainset_lr.csv
validationset_lr.csv
trainset_ccs.csv
etc.

(This could also be done in a second pull request.) 🟢
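
A minimal sketch of the proposed layout, assuming each table is a list of row dicts. `save_evals` is a hypothetical helper, not a function from elk, and the row values are made-up placeholders.

```python
import csv
import tempfile
from pathlib import Path


def save_evals(out_dir: Path, tables: dict[str, list[dict]]) -> list[Path]:
    """Write each table to <out_dir>/evals/<name>.csv and return the paths."""
    evals_dir = out_dir / "evals"
    evals_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, rows in tables.items():
        if not rows:  # skip empty tables, there is no header to infer
            continue
        path = evals_dir / f"{name}.csv"
        with path.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        written.append(path)
    return written


with tempfile.TemporaryDirectory() as tmp:
    paths = save_evals(
        Path(tmp),
        {
            "trainset_lr": [{"layer": 0, "acc": 0.5}],
            "validationset_lr": [{"layer": 0, "acc": 0.5}],
            "trainset_ccs": [{"layer": 0, "acc": 0.5}],
        },
    )
    names = sorted(p.name for p in paths)
    parents = {p.parent.name for p in paths}
    print(names)
```

Keeping the split and the method in the file name lets the results dir stay flat at the top level while all evaluation tables live under one folder.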

lauritowal

I think the function apply_to_layer is getting a bit long and somewhat confusing, so we might want to open a refactoring pull request as a second step. Everything seems to work fine, though.

norabelrose · 2023-10-23T01:18:43Z

included in #292

AlexTMallen added 3 commits August 23, 2023 19:55

refactor prepare_data using LayerData class

2b66f62

save predictions to disk

cd0cd13

include missing changes from last commit

1477bcb

AlexTMallen requested review from norabelrose and lauritowal August 24, 2023 06:48

AlexTMallen added 2 commits August 25, 2023 21:09

set default LR penalty to 0.001

e88b264

switch from get_probs to get_logprobs

080174b

lauritowal reviewed Aug 28, 2023

lauritowal approved these changes Aug 28, 2023

rename probs to logprobs

e6def70

norabelrose closed this Oct 23, 2023

AlexTMallen deleted the save_preds branch November 2, 2023 18:23

AlexTMallen restored the save_preds branch November 2, 2023 18:23
