Best practice for train and validation set separation #1181
Hey @denisergashbaev, which version of DSPy are you on?

In general, when using DSPy there are four data splits to keep in mind: Train, Validation, Development, and Test. (Many optimizers take only 'train' and then internally re-split it into train and validation.) In the most general case, optimizers are free to do anything with Train and Validation, because they're either 'training' (on Train) or 'hyperparameter tuning' (on Validation). Typically, optimizers should not be given any access to Development (which you can use to tweak your algorithm), except in very low-data regimes where Validation = Development. Test is test: it's held out for final evaluation.

In practice, optimizers will generally not use Validation for direct fitting, but only for blackbox optimization. This is not guaranteed, but it's the case throughout every current optimizer. The instance you saw of BootstrapFewShot using validation is just a bug resulting from using an undocumented path in the code. (Unlike other teleprompters, BootstrapFewShot is not an optimizer; it's just a meta-prompting approach, so it shouldn't even be given a validation set. The bugfix was to remove the ability to provide a valset to it at all.) This was fixed a long time ago, though, so make sure you're on a recent version of DSPy.
Which distribution do you ultimately care about? Make sure you have that in your dev set and track progress until you're satisfied.
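A minimal sketch of producing the four splits described above from a single list of examples. The helper name and the 40/20/20/20 fractions are my own illustrative choices, not anything DSPy prescribes:

```python
import random

def four_way_split(examples, seed=0, fractions=(0.4, 0.2, 0.2, 0.2)):
    """Shuffle and split examples into Train/Validation/Development/Test.

    The 40/20/20/20 fractions are illustrative assumptions; adjust
    them to your data budget.
    """
    assert abs(sum(fractions) - 1.0) < 1e-9
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    # Cut points for the first three splits; the last split takes the rest.
    c1 = round(fractions[0] * n)
    c2 = c1 + round(fractions[1] * n)
    c3 = c2 + round(fractions[2] * n)
    return shuffled[:c1], shuffled[c1:c2], shuffled[c2:c3], shuffled[c3:]

train, val, dev, test = four_way_split(list(range(100)))
```

In a very low-data regime you could simply reuse the same slice for `val` and `dev`, as discussed above.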
Hello @okhat, thank you very much for your response!
I am using DSPy v2.4.9 and could reproduce the above behavior with it. Here is the code that I used:

```python
from dspy.teleprompt import BootstrapFewShot

bfs_optimizer = BootstrapFewShot(
    metric=metric,
    teacher_settings=teacher_settings,
    max_bootstrapped_demos=3,
    max_labeled_demos=len(train_set),
    max_rounds=1,
    max_errors=0,
)
page_data_extractor = bfs_optimizer.compile(page_data_extractor, trainset=train_set, valset=val_set)
```

If I inspect the … Also, the BootstrapFewShot documentation mentions valset explicitly as well.
Let me rephrase your answer to make sure I understand it properly. Could you please correct me if I am wrong:
What would be your estimate for a low-data regime that would necessitate validation = dev dataset? Thank you
Omar (@okhat), could you please look at @denisergashbaev's query above? I happen to have a similar one.
Hey @denisergashbaev and @Gsizm,

Thanks for the note on the docs page. I've removed the mention of valset there. BootstrapFewShotWithRandomSearch, on the other hand, is an optimizer. You can and should give it separate trainset and valset. It will build examples from the trainset and will score candidate programs on the valset.

If you have several hundred examples, I recommend using a devset != valset and not passing the devset to any optimizers. That way, you have a way to measure progress before you eventually evaluate on the held-out test set. That said, using valset == devset is often OK too, especially if your total data is less than 200-400 examples. Only two rules are crucial:
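The trainset/valset contract described above (build demos from the trainset, score whole candidate programs on the valset) amounts to blackbox optimization. Here is a toy, DSPy-free sketch of that pattern; all names and the scoring setup are hypothetical:

```python
import random

def random_search(candidates, score_on_valset, trials=10, seed=0):
    """Return the sampled candidate that scores best on the valset.

    The valset is only used to *score* candidates (blackbox
    optimization), never to fit them directly.
    """
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(trials):
        cand = rng.choice(candidates)
        s = score_on_valset(cand)
        if s > best_score:
            best, best_score = cand, s
    return best, best_score

# Toy usage: "candidate programs" are integers; the valset "metric"
# prefers values close to 7.
best, score = random_search(list(range(20)), lambda c: -abs(c - 7))
```

The key property to notice is that `score_on_valset` is treated as an opaque function: nothing in the search reads the valset examples themselves.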
I assume this is resolved; feel free to re-open if necessary. I forgot to add that DSPy 2.4.10 should certainly not have a valset argument in BootstrapFewShot; we removed that field in April, iirc. Let me know if it's still there or if you have any other thoughts or questions.
Hello! I am interested to know how we should approach the compilation step. I thought of the following but am not sure whether it is correct practice:
Now, some questions:
Thank you!