Improve context by providing a repo map; Add Vertex AI integration; Add Greedy action parser for more robust code block action syntax #373

Open: wants to merge 3 commits into main

@Henri-Laiho commented May 17, 2024

I've been experimenting with SWE-agent and made some improvements along the way that I think are worth considering for the project.

What does this implement?

  • Repo map implementation from Aider
  • Google Vertex AI integration
  • A greedy thought-action parser to handle files that contain triple backticks (```)

Repo map

When using the agent on an existing codebase, I found that it starts to reimplement code that already exists. To give the LLM the context it needs to avoid this, I added the repo map code from Aider.
The repo map feature adds a snippet like this to the first user message:

Here is a map of your working directory's latest state, showing files and signatures in code files:

.gitignore

readme.md

requirements.txt

sonoff_diy_api/smart_socket_mock.py:
│class SmartSocketMock:
│    def __init__(self):
│        # Initialize device state and attributes
│        self.switch_state = "off"  # Initial state of the switch
│        self.startup_state = "stay"  # Power on state
│        self.signal_strength = -70  # WiFi signal strength
│        self.pulse_state = "off"  # Inching state
│        self.pulse_width = 500  # Inching pulse width
│        self.ssid = "sonoffDiy"  # WiFi SSID
│        self.password = "20170618sn"  # WiFi password
⋮...
│    def switch(self, switch_state):
⋮...
│    def startup(self, startup_state):
⋮...
│    def signal_strength(self):
⋮...
│    def pulse(self, pulse_state, pulse_width=None):
⋮...
│    def wifi(self, ssid, password):
⋮...
│    def ota_unlock(self):
⋮...
│    def ota_flash(self, download_url, sha256sum):
⋮...
│    def info(self):
⋮...
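
To illustrate how a map like the one above can be produced, here is a minimal sketch. It is not the Aider code this PR adds, and it does not reproduce the sample output exactly: it uses only Python's standard `ast` module, handles `.py` files only, omits the elision markers and shows plain positional argument names only. The names `signatures` and `build_repo_map` are hypothetical.

```python
import ast
import os


def signatures(source: str) -> list[str]:
    """Return indented 'class X:' / 'def f(args):' lines for one Python source string."""
    lines: list[str] = []

    def visit(node: ast.AST, depth: int) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                lines.append("    " * depth + f"class {child.name}:")
                visit(child, depth + 1)  # descend to pick up methods
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # Simplification: plain positional argument names only.
                args = ", ".join(a.arg for a in child.args.args)
                lines.append("    " * depth + f"def {child.name}({args}):")

    visit(ast.parse(source), 0)
    return lines


def build_repo_map(root: str) -> str:
    """List every file under root; for parsable .py files, add their signatures."""
    out: list[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories such as .git and keep the order stable.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            sigs: list[str] = []
            if name.endswith(".py"):
                try:
                    with open(path, encoding="utf-8") as f:
                        sigs = signatures(f.read())
                except (SyntaxError, ValueError):
                    sigs = []  # unparsable file: list it by name only
            if sigs:
                out.append(rel + ":")
                out.extend("│" + s for s in sigs)
            else:
                out.append(rel)
    return "\n".join(out)
```

Calling `build_repo_map(".")` returns a string that could be appended to the first user message. A real implementation would also need to cap the map size so that large repositories do not exhaust the context window.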

Google Vertex AI integration

Added a Vertex AI integration to give access to models with a 1M-token context window.

Greedy parser

When the agent tried to write documentation in Markdown files, the action parser failed on file content containing triple-backtick (```) blocks. To fix this, I added the ThoughtActionGreedyParser class, which parses everything between the outermost backticks as a single action, since multiple actions in one output seem to be unsupported. I know that a workaround would be to use an XML-based agent, but that syntax might be too complex for smaller LLMs.
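
The greedy idea can be sketched as follows. This is an illustration under stated assumptions, not the ThoughtActionGreedyParser code from this PR: the function name `parse_greedy` is hypothetical, and the handling of an optional language tag on the opening fence line is an assumption. The fence string is built at runtime so the example itself contains no literal fence.

```python
FENCE = "`" * 3  # three backticks, built at runtime


def parse_greedy(model_response: str) -> tuple[str, str]:
    """Split a response into (thought, action) using the outermost fences.

    Everything before the first fence is the thought. Everything between
    the first and the last fence is the action, so fences nested inside
    the action (for example in Markdown being written to a file) survive.
    """
    first = model_response.find(FENCE)
    last = model_response.rfind(FENCE)
    if first == -1 or first == last:
        raise ValueError("expected an opening and a closing fence")

    thought = model_response[:first].strip()
    body = model_response[first + len(FENCE):last]

    # Assumption: the rest of the opening-fence line is an optional
    # language tag (such as "bash") and is not part of the action.
    _tag, newline, action = body.partition("\n")
    if not newline:
        action = body  # fence and action share a single line
    return thought, action.strip()
```

For a response whose action is an edit command containing its own fenced Python example, `parse_greedy` returns the whole edit command, inner fences included, as one action. The trade-off is that a response containing two separate actions is merged into one, which matches the single-action limitation described above.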

Any other comments?

I hope some of these features are of use. Let me know if you would like me to cherry-pick and polish any of them into a separate PR.

I have not had time to write tests or to make sure these features integrate smoothly with everything else.

🧡 Thanks for contributing!

@ofirpress (Member)

Hi!
Thanks for making a PR. Have you run any of these 3 new features on SWE-bench to see if they improve the accuracy?

@Henri-Laiho (Author)

> Hi! Thanks for making a PR. Have you run any of these 3 new features on SWE-bench to see if they improve the accuracy?

Hi!
No, I haven't run on SWE-bench yet. I opened the PR to share the code I added during my testing, in case it could be helpful to the project.
I will run it on SWE-bench once I have more time.
