
Fix the bug of lm in predict #774

Open · wants to merge 2 commits into main

Conversation

minfawang

In l71 of predict.py, there is logic to pull the lm from the kwargs: lm = kwargs.pop("lm", self.lm) or dsp.settings.lm, but the generate call does not use it. This PR fixes the bug by running generation with that lm as the context.
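
For concreteness, a minimal sketch of what the fix amounts to is below. Everything except the lm line is an assumption about the surrounding code in predict.py's forward, not copied from the source:

# Hypothetical sketch of the fix; names like template, config, x, and
# self.stage are assumed for illustration, not taken from predict.py.
lm = kwargs.pop("lm", self.lm) or dsp.settings.lm

# Before: generate ran against whatever dsp.settings.lm was, ignoring lm.
# After: install the resolved lm for the duration of the generate call.
with dsp.settings.context(lm=lm):
    completions = dsp.generate(template, **config)(x, stage=self.stage)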

Also, a previous comment stated that it's unclear why query_only should ever be used, and this is the only place the query_only option is used. So it seems like a good choice to just remove this option and simplify the code logic.

Reference:

  1. self.lm in generate was introduced on 08/24/2023
  2. lm kwargs was introduced on 10/11/2023

@arnavsinghvi11
Collaborator

Hi @minfawang, thanks for checking up on this. The lm is actually used: the line if self.lm is None defaults to using dspy.settings.lm, and internally, when self.lm is set, we simply configure dspy.settings.lm using the context manager.
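
Roughly, the flow being described is the following; this is a paraphrase reconstructed from the thread, not the exact source of predict.py:

# Paraphrased current behavior (reconstructed, not the exact source):
if self.lm is None:
    # No per-predictor LM: generation falls through to dsp.settings.lm.
    completions = dsp.generate(template, **config)(x, stage=self.stage)
else:
    # self.lm is set: temporarily install it via the context manager.
    with dsp.settings.context(lm=self.lm, query_only=True):
        completions = dsp.generate(template, **config)(x, stage=self.stage)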

As for the query_only behavior, it should be fine to leave it as is, in case use cases arise that need only the query!

Feel free to comment otherwise if you think this PR is still needed. Thanks!

@minfawang
Author

Thanks. The major glitch to fix is how lm is taken from the kwargs. See this part: lm = kwargs.pop("lm", ...). It suggests that the API allows passing a custom lm when the forward method is called. And if you look at l76, the temperature value is overwritten by that lm's configuration. But the lm variable itself is ignored later, when generate is called.

@arnavsinghvi11
Collaborator

Hmm, I'm still a bit confused. Could you perhaps share an example that illustrates this issue? To my understanding, the current logic ensures the correct LM is loaded, and we don't need to touch query_only for now.

@minfawang
Author

Below is an example:

import dspy

lm1 = dspy.OpenAI(model='gpt-3.5-turbo-0125', api_key=OPENAI_API_KEY)
lm2 = dspy.OpenAI(model='gpt-4-0125-preview', api_key=OPENAI_API_KEY)
dspy.settings.configure(lm=lm1)  # <-- default to lm1

# Define a simple signature for basic question answering
class BasicQA(dspy.Signature):
    """Answer questions with short factoid answers."""
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

# Pass the signature to the ChainOfThought module
generate_answer = dspy.ChainOfThought(BasicQA)

# Call the predictor on a particular input.
question = 'What is the color of the sky?'
pred1 = generate_answer(question=question)          # <-- uses the default lm1
pred2 = generate_answer(question=question, lm=lm2)  # <-- lm2's temperature is used, but lm2 is ignored in generation

The query_only part is not the key issue. I'm happy to revert the query_only change, which means applying query_only only when self.lm is used, but turning it off when kwargs['lm'] or settings.lm is used. IMO the condition is a bit awkward, but I'm fine with doing it for backward compatibility.

@arnavsinghvi11
Collaborator

@minfawang This will actually work using the DSPy context manager. Feel free to reference this documentation on using multiple LMs together.

If you run with dspy.context(lm=lm2): instead of specifying it in the generate_answer call, this will resolve the issue. Let me know if that helps.
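
In terms of the example above, the suggested workaround would look like this:

# Select lm2 for this call via the context manager,
# instead of passing lm=lm2 into generate_answer.
with dspy.context(lm=lm2):
    pred2 = generate_answer(question=question)  # now answered by lm2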

@minfawang
Author

Thanks. I understand that I could use dspy.context to achieve the desired effect. But generate_answer(question=question, lm=lm2) is legal, yet it results in an inconsistency: the function will invoke lm1 while using the temperature and num_generations from lm2 (see l76-l80).
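
For reference, here is a hedged paraphrase of the cited lines, reconstructed from this discussion rather than copied from predict.py:

# Paraphrase of l71-l80 (reconstructed, may differ from the source):
lm = kwargs.pop("lm", self.lm) or dsp.settings.lm  # l71: resolves to lm2 when lm=lm2 is passed

# l76-l80: the sampling configuration is read from that lm...
temperature = config.get("temperature") or lm.kwargs.get("temperature")
num_generations = config.get("n") or lm.kwargs.get("n", 1)

# ...but lm itself is never handed to generate, so dsp.settings.lm (lm1)
# produces the completions, using lm2's temperature and num_generations.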

I think we should:

  1. either support the lm kwarg (generate_answer(..., lm=lm2)) and adopt a fix like this,
  2. or completely ignore the lm kwarg and fix l71 and l76-l80, making temperature and num_generations independent of the lm kwarg.

Are you more in favor of 2)?

@arnavsinghvi11
Collaborator

I think the change you've proposed in 1) is making more sense to me now, since we don't want to ignore the lm kwarg if it's passed in. But I'm realizing self.lm actually isn't updated anywhere except in BootstrapFinetune, where we assign predictor.lm to the fine-tuned model, which is also where query_only gets set, per the comment there.

I don't believe this is needed, but I'll confirm. If not, we can merge this PR with the changes as is. Thanks for sharing the examples - really helpful for diving deeper and isolating the use cases!

@minfawang
Author

Thanks for your patience! The plan sounds good.

@okhat
Collaborator

okhat commented Apr 17, 2024

Please don't merge this yet. It's missing some context, which I'll add ASAP.
