
MIPRO optimizer updates for paper release #1169

Merged: 9 commits into main, Jun 21, 2024
Conversation

XenonMolecule
Collaborator

Updates the MIPRO optimizer to add program-aware proposal and minibatching, and refactors it to the dspy.propose pattern, in line with the paper release: https://arxiv.org/abs/2406.11695

Also updates the MIPRO HotPotQA notebook and creates a ScoNe notebook. Note that neither notebook has a functioning cache yet.

Old MIPRO is preserved temporarily in mipro_optimizer_deprecated.py.

@XenonMolecule
Collaborator Author

Note that this PR makes breaking changes to the MIPRO optimizer and is not backwards compatible.

Also switch all the old test cases back
@XenonMolecule
Collaborator Author

Instead of outright breaking old MIPRO, I instead deprecated it. TODO: Make test cases compatible with new MIPRO.

New MIPRO has been renamed to MIPROv2.

@rawwerks

can't wait! this is going to be so awesome.

@mikeedjones

mikeedjones commented Jun 18, 2024

This PR includes a reversion of #920.

I made some other suggestions in #918 on how the "extend gen" logic might affect programs and how that could be better documented

Comment on lines +103 to +108
new_kwargs = {
**kwargs,
"max_tokens": max_tokens,
"n": 1,
"temperature": 0.0,
}


setting n=1 here causes a regression on issues #914 and #749
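For context, the regression stems from the retry kwargs overriding whatever `n` the caller requested. A minimal sketch of a version that preserves it (the helper name `build_retry_kwargs` is hypothetical, not part of dspy):

```python
def build_retry_kwargs(kwargs, max_tokens):
    # Hypothetical helper (not dspy API): keep the caller's requested
    # number of completions on the retry path instead of unconditionally
    # forcing n=1.
    n = kwargs.get("n", 1)
    return {
        **kwargs,
        "max_tokens": max_tokens,
        "n": n,
        # Greedy decoding only makes sense for a single completion;
        # keep the caller's temperature when n > 1 so samples differ.
        "temperature": 0.0 if n == 1 else kwargs.get("temperature", 0.7),
    }

retry_kwargs = build_retry_kwargs({"n": 5, "temperature": 0.9}, max_tokens=100)
```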

@XenonMolecule
Collaborator Author

Hey Mike, thanks for looking at this! I reverted predict.py back because it was causing a few test programs I ran to crash, not to mention the MIPRO optimizer itself. This was specifically due to the max_depth assertion. We’ve been developing off an old branch of the repo so when I brought everything up to date it seemed like an issue in the predictor logic. I’m happy to remove my changes to predict.py, but do you have suggestions for handling the case where the LM doesn’t generate all fields? Is it a matter of changing the kwargs to predict to increase the max_depth?

@mikeedjones

mikeedjones commented Jun 19, 2024

My guess is the crash would be caused by the LM skipping a field in one of the candidate generations and that skip not being picked up by the extend generation logic. Then, because the LM has moved on to another field in the signature, it would be very unlikely to go back and fill the earlier field, regardless of how many levels of recursion are allowed.

I'll do some investigation as to why - the extend generation logic should roll back to the first skipped field, so if the above is the behaviour, then it's a bug in #920's implementation.

As to why that is never seen in the old logic - I'm stumped. I tried to replicate as closely as possible the logic in there. Something else to figure out!

EDIT: Added some tests in #1172 to try and define how the extend generation logic ought to work - neither main's nor this branch's implementation passes - but I think those tests cover how I expected the logic to behave.

Looking into it a little more in #1172 (comment)
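For reference, the rollback rule described above can be sketched as a tiny selector (illustrative only, not the dspy implementation):

```python
def first_missing_field(field_names, completion):
    # Under the behaviour described above, extension should resume from
    # the FIRST field the LM skipped, not the last field it produced.
    for name in field_names:
        if not completion.get(name):
            return name
    return None  # nothing skipped; no extension needed

fields = ["rationale", "answer", "confidence"]
partial = {"rationale": "because...", "answer": "", "confidence": "high"}
resume_from = first_missing_field(fields, partial)
```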

Comment on lines 173 to 183
# if self.program_aware: # Use full program demo
# for pred_key in demo_candidates.keys(): # Go through each predictor
# for example in demo_candidates[pred_key][demo_set_i]:
# if "augmented" in example.keys():
# fields_to_use = get_signature(program.predictors()[pred_key]).fields
# example_string = create_example_string(fields_to_use, example)
# task_demos += f"{example_string}\n"
# curr_demos_num+=1
# if curr_demos_num>=max_demos:
# break
# else:


should this be left in?

Comment on lines 48 to 67
# def parse_list_of_instructions(instruction_string):
# # Convert the string representation of a list to an actual list
# try:
# instructions = json.loads(instruction_string)
# except json.JSONDecodeError as e:
# print(f"Error decoding JSON: {e}")
# return []

# return instructions
def parse_list_of_instructions(instruction_string):
# Try to convert the string representation of a list to an actual list using JSON
try:
instructions = json.loads(instruction_string)
return instructions
except json.JSONDecodeError:
pass

# If JSON decoding fails, extract strings within quotes
instructions = re.findall(r'"([^"]*)"', instruction_string)
return instructions


which of these is correct?
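For what it's worth, the kept implementation subsumes the commented-out one: it tries JSON first and only falls back to the quote-extracting regex when decoding fails. A self-contained sketch of that behaviour:

```python
import json
import re

def parse_list_of_instructions(instruction_string):
    # Mirrors the kept implementation above: JSON first, regex fallback.
    try:
        return json.loads(instruction_string)
    except json.JSONDecodeError:
        pass
    # If JSON decoding fails, extract any double-quoted strings instead.
    return re.findall(r'"([^"]*)"', instruction_string)

# Well-formed JSON list: parsed directly.
well_formed = parse_list_of_instructions('["step one", "step two"]')
# Chatty LM output that isn't valid JSON: the fallback still recovers
# the quoted instructions.
chatty = parse_list_of_instructions('Sure! Here you go: "step one", "step two"')
```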

Comment on lines 69 to 82
# def get_program_instruction_set_string(program):
# instruction_set_dict = {}
# for i, pred in enumerate(program.predictors()):
# pred_instructions = get_signature(pred).instructions
# instruction_set_dict[f"Module {i} Instruction"] = pred_instructions
# return '\n'.join(f"{key}: {value}" for key, value in instruction_set_dict.items())

def get_program_instruction_set_string(program):
instruction_list = []
for _, pred in enumerate(program.predictors()):
pred_instructions = get_signature(pred).instructions
instruction_list.append(f"\"{pred_instructions}\"")
# Joining the list into a single string that looks like a list
return f"[{', '.join(instruction_list)}]"


can one be removed?
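As context for why the kept version builds a list-shaped string: as long as no instruction itself contains a double quote, the output is valid JSON, so `parse_list_of_instructions` can round-trip it. A standalone sketch (the real function pulls instructions from `program.predictors()`):

```python
import json

def get_program_instruction_set_string(instructions):
    # Standalone mirror of the kept implementation: wrap each
    # instruction in double quotes and join into a list-shaped string.
    # Caveat: breaks if an instruction contains a double quote.
    return "[" + ", ".join(f'"{s}"' for s in instructions) + "]"

s = get_program_instruction_set_string(["Answer concisely.", "Cite sources."])
round_tripped = json.loads(s)
```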

break

# hard_examples.append(example)
#breakpoint()


remove debugging statements?

print("Bootstrapping complete!")
break

# hard_examples.append(example)


Hard examples is never appended to?


def _bootstrap_one_example(self, example, round_idx=0):
name2traces = self.name2traces
teacher = self.teacher #.deepcopy()


should this be deepcopied or not?
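The tradeoff being asked about can be shown in miniature (the `Teacher` class here is a stand-in, not dspy's): deepcopying isolates per-example mutations, while the aliased version lets bootstrap traces leak into the shared teacher.

```python
import copy

class Teacher:
    # Minimal stand-in for the teacher program; illustrative only.
    def __init__(self):
        self.demos = []

shared = Teacher()

# With deepcopy: per-example mutations stay isolated.
isolated = copy.deepcopy(shared)
isolated.demos.append("trace")

# Without deepcopy: the alias mutates shared state across examples.
aliased = shared
aliased.demos.append("trace")
```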

if total_successes > (total_attempts - min_failed_attempts):
break
# if this example failed the right number of times and is therefore 'hard', let's add it as one of our demo examples
# print(f"Total Successes: {total_successes} | Total Failures: {total_attempts-total_successes}")


I thought dspy had moved from print statements to logging?
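A logging-based replacement for the commented-out print might look like this (the logger name is hypothetical, not the one dspy actually uses):

```python
import logging

logger = logging.getLogger("dspy.teleprompt.bootstrap")

def report_progress(total_successes, total_attempts):
    # Lazy %-style formatting: the message is only rendered if a
    # handler at DEBUG level is attached.
    logger.debug(
        "Total Successes: %d | Total Failures: %d",
        total_successes,
        total_attempts - total_successes,
    )
```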

self.trainset = trainset
self.valset = valset

raise Exception("NOTE THAT THIS HAS BEEN EXPENSIVE TO RUN. CHECK IN WITH KRISTAOO BEFORE DOING A FULL RUN.")


KRISTA00 is gonna be busy!

# Joining all the field strings
return '\n'.join(output)

# def create_predictor_level_history_string(base_program, predictor_i, trial_logs, top_n):


can this be removed?


return instruction_set_history_string

# def parse_list_of_instructions(instruction_string):


can this be removed?

import sys

import numpy as np
from IPython.core.magics.code import extract_symbols
@mikeedjones commented Jun 19, 2024

Is IPython a dependency? Does it need to be added to requirements.txt?

Elsewhere, where IPython is used, it's wrapped in:

try:
    from IPython.display import display as ipython_display
except ImportError:
    ipython_display = print
from dsp.utils import EM

@XenonMolecule
Collaborator Author

XenonMolecule commented Jun 19, 2024

Thanks Mike, I addressed your comments and cleaned up the code left over from the dev setting. In particular, I also removed hard bootstrapping from this PR, since we aren't using it for this particular optimizer.

If you are looking for a specific test setting where the propose.py was failing, you can check out this notebook: 'examples/qa/hotpot/hotpotqa_with_MIPRO.ipynb'. I'll take a closer look at #1172.

@mikeedjones

Thanks! Based on what I saw with the tests I think it should be possible to inspect the cache of that notebook to see if the implemented extend gen logic is exhibiting the issue I saw in #1172.

That said, the extend generation logic from #920 doesn't pass the tests in #1172 either - and seems to be causing issues like #1108. It would be good to get the view of the author of the extend gen logic on its expected behaviour. The logic was in the initial commit, so probably @okhat?

Comment on lines 50 to 52
* prompt_model: The model used for prompt generation. When unspecified, defaults to the model set in settings (i.e., dspy.settings.configure(lm=task_model)).
* task_model: The model used for prompt generation. When unspecified, defaults to the model set in settings (i.e., dspy.settings.configure(lm=task_model)).
* teacher_settings: The settings used for the teacher model. When unspecified, defaults to the settings set in settings (i.e., dspy.settings.configure(lm=task_model)).
@mikeedjones commented Jun 20, 2024

It could be clearer which of prompt_model, task_model, and teacher_settings (which can include a model, and looks to be used only for generating few-shot demo sets?) drives which part of the optimisation. Note that the task_model description currently duplicates prompt_model's.

@XenonMolecule
Collaborator Author

Edited! Thanks for the suggestion!

@XenonMolecule XenonMolecule merged commit 015c649 into main Jun 21, 2024
4 checks passed
@XenonMolecule XenonMolecule deleted the mipro_v2 branch June 21, 2024 05:16
@mikeedjones

@XenonMolecule - did you manage to track down whether or not the reverted extend generation logic was leading to the prediction from the language model ending up in the wrong OutputField? Would be great to understand what I missed in #1172 Thank you!
