Ability to add custom Actions that the agent can perform #394

Closed
VarunNair31 opened this issue Jun 28, 2024 · 6 comments

@VarunNair31

Ability to perform custom actions with Drivers

Solution
There should be a way to add custom actions that we want to perform. Some websites may require specific actions that are not supported by Selenium or Playwright, or actions that are available in Selenium but cannot be executed by the agent.

If there is a way already please let me know 🤗

@dhuynh95
Collaborator

Sure! Do you have any specific actions in mind, by the way? Selenium or Playwright are quite exhaustive.
Are you asking this because the agent failed in a specific scenario? We haven't implemented everything yet to have great coverage (shadow DOM and iframe will be handled soon).
By "action" do you mean an atomic action, like "Click on button" or a sequence, like "Get the next meeting on my calendar", which might be translated into "Click on Calendar", "Click on 10 am meeting"?

@VarunNair31
Author

VarunNair31 commented Jun 28, 2024

The agent failed while I was trying to execute a test case that required a double click on an element. It gave me the following error:
name ActionChains is not defined.
Selenium does have this capability; I'm just not sure whether I can add it to LaVague on my end using custom methods.
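
For reference, a minimal sketch of a double click in plain Selenium, outside LaVague. The error above is consistent with code that uses ActionChains without importing it, so this sketch imports it explicitly. The helper name double_click and its parameters are hypothetical and not part of LaVague.

```python
def double_click(driver, xpath: str) -> None:
    """Double-click the first element matching `xpath` (hypothetical helper)."""
    # Imported inside the function so the names are always defined,
    # wherever this helper is pasted.
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.common.by import By

    element = driver.find_element(By.XPATH, xpath)
    ActionChains(driver).double_click(element).perform()
```

It would be called with an existing WebDriver instance, for example double_click(driver, "//div[@id='folder']"), where the XPath is only a placeholder.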

@adeprez
Contributor

adeprez commented Jun 28, 2024

Currently, the available actions are limited to those implemented in the drivers. We haven't yet considered double-click actions. Could you share your use case with us? I wasn't aware that double-clicks were commonly used in web interfaces.

We can certainly add double-click to the list of available actions. Would you like to contribute to this enhancement? It involves adding the necessary code to the exec_code function from the Selenium Driver, and documenting its usage within the prompt template.

@VarunNair31
Author

The use case is basically to open a folder in our internal application. I would be happy to contribute, but I can't free up any time in my schedule right now. I will be sure to set some time aside and contribute.

@dhuynh95
Collaborator

dhuynh95 commented Jul 1, 2024

Is a double click truly needed versus a simple click?
Is it some kind of legacy web app?
We can do the work, but we would need a better understanding of what you want to achieve. If you are open to it, you can ping me on Discord and we can schedule a quick call to understand where you are at and help you.

@lyie28
Contributor

lyie28 commented Jul 1, 2024

Just to add details on how to modify the driver source code to add custom actions. You can do so with the following steps:

  1. Open the base.py file for your driver, e.g. lavague-integrations/drivers/lavague-drivers-selenium/lavague/drivers/selenium/base.py

  2. Define your action in the list of actions in the SELENIUM_PROMPT_TEMPLATE, below the "The actions available are:" line.

For example, you might add a new clearValue action:

Name: clearValue
Description: Focus on and clear the text of an input element with a specific xpath
Arguments:
  - xpath (string)

  3. Add an elif clause to the exec_code method to handle your new action:

elif action_name == "clearValue":
    self.clear_value(item["action"]["args"]["xpath"])

  4. Finally, add your new method for handling this action:

def clear_value(self, xpath: str):
    # Assumes the Selenium driver class exposes its WebDriver as self.driver
    # and that By is imported from selenium.webdriver.common.by.
    elem = self.driver.find_element(By.XPATH, xpath)
    elem.clear()

  5. Install your driver from your local package:

    pip install -e lavague-integrations/drivers/lavague-drivers-selenium
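
To show how the pieces above fit together, here is a minimal, self-contained sketch of the dispatch pattern. It does not use the real LaVague classes: SketchDriver and FakeInput are hypothetical stand-ins, the actions are passed as an already-parsed list, and the item["action"]["name"] key is an assumption (only item["action"]["args"]["xpath"] appears in the steps above).

```python
from typing import Any, Callable, Dict, List


class SketchDriver:
    """Hypothetical stand-in for a driver class; not the real LaVague driver."""

    def __init__(self, find_element: Callable[[str], Any]) -> None:
        # find_element maps an xpath to an element-like object.
        self.find_element = find_element

    def clear_value(self, xpath: str) -> None:
        # Handler for the custom clearValue action.
        self.find_element(xpath).clear()

    def exec_code(self, actions: List[Dict[str, Any]]) -> None:
        # Dispatch each parsed action to its handler by name.
        for item in actions:
            action_name = item["action"]["name"]  # assumed key
            args = item["action"]["args"]
            if action_name == "click":
                self.find_element(args["xpath"]).click()
            elif action_name == "clearValue":
                self.clear_value(args["xpath"])
            else:
                raise ValueError(f"Unknown action: {action_name}")


class FakeInput:
    """Element-like object used to exercise the sketch without a browser."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.clicked = False

    def click(self) -> None:
        self.clicked = True

    def clear(self) -> None:
        self.text = ""


field = FakeInput("hello")
driver = SketchDriver(lambda xpath: field)
driver.exec_code(
    [{"action": {"name": "clearValue", "args": {"xpath": "//input"}}}]
)
```

A double-click action could follow the same shape: one more entry in the prompt template, one more elif branch, and one more handler method.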

@lyie28 lyie28 closed this as completed Jul 22, 2024