
Fixed issue #894 and #858 (aws_models issues) #843

Merged 5 commits on Jun 19, 2024

Conversation

@drawal1 (Contributor) commented Apr 16, 2024

- EnsembledProgram can now be loaded/saved
- Inference of programs (the forward function) is now concurrent (see the sketch below)
- Added checks for size = 0 or size > the number of programs
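
A minimal sketch of what the description above amounts to, assuming each sub-program is a callable that returns a prediction. This is illustrative only, not the actual DSPy implementation (and note that the concurrent forward was later reverted during review):

    # Illustrative sketch only: class and parameter names follow the PR
    # description, not necessarily the actual ensemble.py API.
    from concurrent.futures import ThreadPoolExecutor

    class EnsembledProgram:
        def __init__(self, programs, size=None):
            # Checks for size = 0 (or negative) and size > number of programs
            if size is not None and not (0 < size <= len(programs)):
                raise ValueError(f"size must be between 1 and {len(programs)}, got {size}")
            self.programs = programs
            self.size = size

        def forward(self, *args, **kwargs):
            programs = self.programs if self.size is None else self.programs[: self.size]
            # Run each sub-program's forward pass on its own worker thread
            with ThreadPoolExecutor(max_workers=max(len(programs), 1)) as pool:
                futures = [pool.submit(program, *args, **kwargs) for program in programs]
                return [future.result() for future in futures]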

@okhat (Collaborator) commented Apr 17, 2024

This is so cool.... I need to check the parallelization, it's a bit tricky in DSPy due to dspy.settings.

In dspy.evaluate.Evaluate we do things in a bit of a special way, which has to be applied when there's threading

@drawal1 (Contributor, Author) commented Apr 17, 2024

@okhat @arnavsinghvi11 - This PR now also includes a fix for issue #824. The fix is in teleprompt/random_search.py.

@drawal1 (Contributor, Author) commented Apr 22, 2024

@okhat - any progress on this review? I am continuing to work on intelligent ensemble routing and don't want to muddy this PR with commits related to that work.

BTW, I have been testing ensemble concurrency and so far it works like a charm.

Also, I can revert the commit for issue #824 and submit a separate PR for that, if it would expedite this PR review.

@drawal1 (Contributor, Author) commented Apr 23, 2024

> This is so cool.... I need to check the parallelization, it's a bit tricky in DSPy due to dspy.settings.
>
> In dspy.evaluate.Evaluate we do things in a bit of a special way, which has to be applied when there's threading

I think I understand. I see this code in evaluate.py:

        def wrapped_program(example_idx, example):
            # NOTE: TODO: Won't work if threads create threads!
            thread_stacks = dspy.settings.stack_by_thread

So I need to test and verify the parallelization NOT in inference, but as part of another optimization pipeline.
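
For context, a hedged sketch of that pattern: each worker thread gets its own copy of the main settings stack, keyed by thread id, before the program runs. The main_stack attribute and the cleanup step are assumptions here, and program is captured from the enclosing scope, as in Evaluate:

    import threading
    import dspy

    def wrapped_program(example_idx, example):
        # Give this worker thread its own copy of the parent's settings stack,
        # so concurrently running programs don't mutate shared dspy.settings state.
        thread_stacks = dspy.settings.stack_by_thread
        thread_id = threading.get_ident()
        if thread_id not in thread_stacks:
            thread_stacks[thread_id] = list(dspy.settings.main_stack)
        try:
            return example_idx, program(**example.inputs())
        finally:
            thread_stacks.pop(thread_id, None)  # drop this thread's copy when done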

@drawal1 (Contributor, Author) commented Apr 24, 2024

Fixed #894 in dsp/modules/aws_models.py, in class AWSMeta(AWSModel), by moving the following functionality from the init function:

        max_tokens = query_args.pop("max_tokens", None)
        if max_tokens:
            query_args["max_gen_len"] = max_tokens

to the create_body function, and by copying kwargs to base_args by value.
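
A hedged sketch of the resulting shape (the signature and the surrounding body construction are assumptions; only the moved block and the copy-by-value come from the description above):

    class AWSMeta(AWSModel):
        def create_body(self, prompt, **kwargs):
            # Copy by value so the pop below cannot mutate self.kwargs
            # (or the caller's dict) across requests.
            base_args = self.kwargs.copy()
            query_args = {**base_args, **kwargs}

            # Moved here from __init__: Meta models expect max_gen_len, so the
            # per-request max_tokens is renamed at body-creation time.
            max_tokens = query_args.pop("max_tokens", None)
            if max_tokens:
                query_args["max_gen_len"] = max_tokens

            query_args["prompt"] = prompt
            return query_args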

drawal1 changed the title from "Fixed issue #775 and refactored Ensemble optimizer" to "Fixed issue #775, #824, #894 and refactored Ensemble optimizer" on Apr 25, 2024
@drawal1 (Contributor, Author) commented Apr 25, 2024

@okhat - I have reverted the parallelization in EnsembledProgram.forward() based on your review and feedback.

The remaining fixes/improvements are targeted and should not have side effects.

      for seed in range(-3, self.num_candidate_sets):
-         if (restrict is not None) and (seed not in restrict):
+         if seed < 0 and (restrict is not None) and (seed not in restrict):
Collaborator (review comment):

why are we skipping over the zero-shot, labeled_fewshot, and unshuffled_fewshot cases for all scenarios? I think even in the conditions above when a teacher is set, it is fine to run through these preliminary candidates and have them weighted in the scores.
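
For readers outside the thread: the "preliminary candidates" correspond to the negative seeds in the loop above. The mapping below is an assumption based on that convention, not taken from this PR:

    # seed == -3  ->  zero-shot (no demonstrations)
    # seed == -2  ->  labeled few-shot (raw labeled demos, no bootstrapping)
    # seed == -1  ->  bootstrapped few-shot on the unshuffled trainset
    # seed >=  0  ->  randomly shuffled/sized bootstrapped candidate sets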

@drawal1 (Contributor, Author):

Fair enough. Reverted to the previous behavior.

@@ -96,7 +117,8 @@ def compile(self, student, *, teacher=None, trainset, valset=None, restrict=None
          program2 = program.compile(student, teacher=teacher, trainset=trainset2)

      else:
-         assert seed >= 0, seed
+         if seed < 0:
Collaborator (review comment):

can remove this based on the decision above

@drawal1 (Contributor, Author):

assert fails the Ruff check (S101: Use of assert detected). Let's keep the raise exception here; it's good practice.
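
That is, roughly (the exact message text here is illustrative):

    if seed < 0:
        raise ValueError(f"Expected a non-negative seed, but got: {seed}")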

pyproject.toml (outdated)
@@ -278,7 +284,14 @@ ignore = [
      "E731",
      # Sometimes we need List and Tuple
      "UP006",
+     # optimzer are using the 'compile' method which is a built-in
Collaborator (review comment):

spelling typo

@drawal1 (Contributor, Author):

fixed

pyproject.toml (outdated)
+     # commented-out code should be allowed
+     "ERA001",
+     # allow pickle
+     "S301",
Collaborator (review comment):

what is the purpose of these changes? I think Ruff has been working correctly so far, but just curious.

@drawal1 (Contributor, Author) commented May 6, 2024

ERA001, because Ruff complains about commented-out print() statements; I often find these useful for debugging.

Re: S301, ensemble.py's load_folder/save_folder use pickle for serialization. Ruff flags pickle as unsafe for loading because it could load arbitrary objects. I suppose we could switch to https://jsonpickle.github.io/. If jsonpickle is acceptable, I can make the change and remove the S301 exemption.
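
For reference, a minimal sketch of the jsonpickle alternative (the state dict here is hypothetical, not the actual ensemble.py structure):

    import jsonpickle

    state = {"size": 2, "programs": ["programA", "programB"]}  # hypothetical state

    serialized = jsonpickle.encode(state)     # a plain, human-readable JSON string
    restored = jsonpickle.decode(serialized)  # rebuilds the Python objects

    assert restored == state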

@drawal1 (Contributor, Author):

@arnavsinghvi11 - what's your opinion on jsonpickle? see my comment above...

@arnavsinghvi11 (Collaborator):

Thanks @drawal1 for the PR! I left a few comments. The changes to Ensemble look correct to me, but I think the ones for RandomSearch are not needed, or can be handled separately without impacting existing behavior. Let me know what you think.

There's also a merge conflict to resolve, and feel free to remove extraneous comments from the changes that do not help with following the code. Thanks!

Tagging @okhat for reference on Ensemble/RandomSearch again (but I believe the parallelization changes are removed, so we should be good there).

drawal1 reopened this on Jun 3, 2024
drawal1 changed the title from "Fixed issue #775, #824, #894 and refactored Ensemble optimizer" to "Fixed issue #894 and #858 (aws_models issues) + #824 (crash using precompiled teacher)" on Jun 3, 2024
@drawal1 (Contributor, Author) commented Jun 3, 2024

@arnavsinghvi11 - I have reverted ALL the complicated changes. Hopefully this PR is now acceptable.

@arnavsinghvi11 (Collaborator):

Hi @drawal1, thanks for the changes. Can you also revert the changes to BootstrapWithRandomSearch, since those changes are not relevant to patching #858?

drawal1 changed the title from "Fixed issue #894 and #858 (aws_models issues) + #824 (crash using precompiled teacher)" to "Fixed issue #894 and #858 (aws_models issues)" on Jun 18, 2024
@drawal1 (Contributor, Author) commented Jun 18, 2024

@arnavsinghvi11 - the BootstrapWithRandomSearch changes are now reverted.

arnavsinghvi11 merged commit 4379ead into stanfordnlp:main on Jun 19, 2024 (4 checks passed)
@arnavsinghvi11 (Collaborator):

Thanks @drawal1!
