Fixes to not have to reinstall testbeds and conda envs #109

Closed
wants to merge 1 commit into from

@aorwall aorwall commented Apr 28, 2024

Reference Issues/PRs

Partly solves #104

What does this implement/fix? Explain your changes.

This change makes it possible to reuse existing conda environments when running evaluation.

  • The conda path couldn't be set in run_evaluation.py and was therefore not used in the context_manager.
  • The output of the conda env list command wasn't parsed properly, so existing environments weren't detected.
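
To illustrate the second point, here is a minimal sketch of parsing `conda env list` output so that existing environments can be detected. It is not the code from this PR; the function name `parse_conda_env_list` and the sample output are hypothetical, and it assumes environment paths contain no spaces.

```python
def parse_conda_env_list(output: str) -> dict[str, str]:
    """Map environment names to paths from `conda env list` text output.

    Hypothetical helper for illustration; assumes paths contain no spaces.
    """
    envs = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "# conda environments:" header comments.
        if not line or line.startswith("#"):
            continue
        # The active environment is marked with a standalone "*"; drop it.
        parts = [p for p in line.split() if p != "*"]
        # Environments created outside the envs directory have a path but no name.
        if len(parts) < 2:
            continue
        name, path = parts[0], parts[-1]
        envs[name] = path
    return envs


sample = """# conda environments:
#
base                  *  /opt/conda
sympy__sympy__1.0        /opt/conda/envs/sympy__sympy__1.0
"""
existing = parse_conda_env_list(sample)
print("sympy__sympy__1.0" in existing)
```

A check like `name in existing` before creating an environment is what lets a run skip reinstallation.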

@john-b-yang
Collaborator

@aorwall thanks so much for the original contribution and patience.

While we didn't merge the original code, we accounted for this feature in the new swebench=2.0.0 release.

The docker image caching mechanism takes care of this:

  • Control how the harness caches images between runs
  • You can cache three different tiers of images: base, environment, and instance. For a run on the full SWE-bench test set, the number of images in each tier is as follows:
    • base: 1 image, the base image from which all instances are built
    • environment: 60 images with conda environments that together cover every environment used by the instances
    • instance: 2294 images (one per instance), each of which is the environment image plus an installation of the repository at the instance's base_commit
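
The tiers above can be read as nested levels: caching at a given tier also keeps every tier below it. The sketch below models that reading using the image counts quoted above. The level names (`none`, `base`, `env`, `instance`) and both helper functions are assumptions for illustration, not the harness's actual API.

```python
# Image counts per tier for the full SWE-bench test set, as quoted above.
IMAGE_COUNTS = {"base": 1, "env": 60, "instance": 2294}

# Assumed ordering of cache levels, from least to most aggressive caching.
LEVELS = ("none", "base", "env", "instance")


def should_keep(tier: str, cache_level: str) -> bool:
    """Return True if images of `tier` are kept under `cache_level`."""
    return LEVELS.index(tier) <= LEVELS.index(cache_level)


def cached_image_count(cache_level: str) -> int:
    """Count the images left on disk after a full run at `cache_level`."""
    return sum(
        count
        for tier, count in IMAGE_COUNTS.items()
        if should_keep(tier, cache_level)
    )


for level in LEVELS:
    print(level, cached_image_count(level))
```

Under these assumptions, caching at the environment tier keeps 61 images, while caching at the instance tier keeps all 2355, which is where the storage requirement comes from.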

The report has more advice on how to appropriately set the cache level.

In a nutshell, this feature has been incorporated in swebench>=2.0.0. With enough storage, instance-specific images can be cached, so the second and any later evaluation runs of SWE-bench complete very quickly.
