Ability to add custom Actions that the agent can perform #394

VarunNair31 · 2024-06-28T06:25:06Z

Ability to perform custom actions with Drivers

Solution
There should be a way to add custom actions for the agent to perform. Some websites require specific actions that may not be supported by Selenium or Playwright, or that are available in Selenium but cannot be executed by the agent.

If there is a way already please let me know 🤗

dhuynh95 · 2024-06-28T06:28:25Z

Sure! Do you have any specific actions in mind, by the way? Selenium or Playwright are quite exhaustive.
Are you asking this because the agent failed in a specific scenario? We haven't implemented everything yet to have great coverage (shadow DOM and iframe will be handled soon).
By "action" do you mean an atomic action, like "Click on button" or a sequence, like "Get the next meeting on my calendar", which might be translated into "Click on Calendar", "Click on 10 am meeting"?

VarunNair31 · 2024-06-28T06:32:31Z

The agent failed while I was trying to execute a test case that required a double click on an element. It gave me the following error:
name ActionChains is not defined.
Selenium does have this capability; I'm just not sure whether I can add it to LaVague on my end using custom methods.
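For reference, that message is what Python prints for a NameError, which usually means the code being run referred to ActionChains without the name being imported in its namespace. The sketch below reproduces that in isolation. It assumes the generated code is run with exec, which the thread does not confirm, and the ActionChains class here is a hypothetical stand-in, not Selenium's.

```python
# Hypothetical reproduction: this ActionChains is a stand-in, not Selenium's class.
class ActionChains:
    def __init__(self, driver):
        self.driver = driver


# A line of generated code that uses the name without importing it.
generated_code = "chain = ActionChains(driver)"


def run_generated(code: str, namespace: dict) -> str:
    """Run code with exec in the given namespace and report the outcome."""
    try:
        exec(code, namespace)
    except NameError as err:
        return f"NameError: {err}"
    return "ok"


# Namespace without the name: the lookup fails with the reported message.
missing = run_generated(generated_code, {"driver": object()})

# Namespace that exposes the name: the same code runs.
present = run_generated(
    generated_code, {"driver": object(), "ActionChains": ActionChains}
)

print(missing)
print(present)
```

If that is the cause, the name has to be made available wherever the generated code runs, which is a change on the driver side rather than in the test case.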

adeprez · 2024-06-28T07:08:51Z

Currently, the available actions are limited to those implemented in the drivers. We haven't yet considered double-click actions. Could you share your use case with us? I wasn't aware that double-clicks were commonly used in web interfaces.

We can certainly add double-click to the list of available actions. Would you like to contribute to this enhancement? It involves adding the necessary code to the exec_code function from the Selenium Driver, and documenting its usage within the prompt template.

VarunNair31 · 2024-06-28T13:48:30Z

The use case is basically to open a folder in our internal application. I would be happy to contribute, but I can't free up any time in my schedule at the moment. I will make time to contribute when I can.

dhuynh95 · 2024-07-01T08:33:39Z

Is a double click truly needed versus a simple click?
Is it some kind of legacy web app?
We can provide the work, but we would need a better understanding of what you want to achieve. If you are open to it, you can ping me on Discord and we can schedule a quick call to understand where you are and help you.

lyie28 · 2024-07-01T19:10:11Z

Just to add some details on how to modify the driver source code to add custom actions, you can do so with the following steps:

Open the base.py file for your driver, e.g., lavague-integrations/drivers/lavague-drivers-selenium/lavague/drivers/selenium/base.py
Define your action in the list of actions in the SELENIUM_PROMPT_TEMPLATE, below the "The actions available are:" line.

For example, you might add a new clearValue action:

Name: clearValue
Description: Focus on and clear the text of an input element with a specific xpath
Arguments:
  - xpath (string)

Add an elif clause to the exec_code method to handle your new action.

elif action_name == "clearValue":
    self.clear_value(item["action"]["args"]["xpath"])

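Finally, add your new method for handling this action:

def clear_value(self, xpath: str):
    # Note: self.page.locator is the Playwright API. In the Selenium driver,
    # look the element up through the WebDriver instead (find_element with By.XPATH).
    elem = self.page.locator(f"xpath={xpath}").first
    elem.clear()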

Install your driver from your local package:
pip install -e lavague-integrations/drivers/lavague-drivers-selenium
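
Applied to the double-click case raised earlier in the thread, the same steps might look like the sketch below. This is not LaVague's code: MiniDriver, the doubleClick action name, the double_click method, the self.driver attribute, and the item["action"]["name"] lookup are all assumptions made for illustration, with the item shape borrowed from the clearValue snippet. The Selenium calls themselves (find_element with By.XPATH, ActionChains(...).double_click(...).perform()) are standard Selenium API.

```python
from typing import Any, Dict, List

# Hypothetical prompt-template entry, in the same format as the clearValue example.
DOUBLE_CLICK_PROMPT_ENTRY = """Name: doubleClick
Description: Double-click on an element with a specific xpath
Arguments:
  - xpath (string)
"""


class MiniDriver:
    """Hypothetical stand-in for the driver class in base.py (not LaVague's code)."""

    def __init__(self, driver: Any) -> None:
        # Assumed to be a Selenium WebDriver; the attribute name is an assumption.
        self.driver = driver

    def exec_code(self, items: List[Dict[str, Any]]) -> None:
        # Dispatch each parsed action to its handler, mirroring the elif chain.
        for item in items:
            action_name = item["action"]["name"]  # assumed key, see lead-in
            args = item["action"]["args"]
            if action_name == "clearValue":
                self.clear_value(args["xpath"])
            elif action_name == "doubleClick":
                self.double_click(args["xpath"])
            else:
                raise ValueError(f"Unknown action: {action_name}")

    def clear_value(self, xpath: str) -> None:
        # Imported lazily so the dispatch logic can be exercised without Selenium.
        from selenium.webdriver.common.by import By

        self.driver.find_element(By.XPATH, xpath).clear()

    def double_click(self, xpath: str) -> None:
        # Imported lazily for the same reason as above.
        from selenium.webdriver.common.action_chains import ActionChains
        from selenium.webdriver.common.by import By

        element = self.driver.find_element(By.XPATH, xpath)
        ActionChains(self.driver).double_click(element).perform()
```

Raising on an unknown action name is a design choice of this sketch, so that a typo in the prompt template or in the generated output fails loudly instead of being skipped.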

lyie28 closed this as completed Jul 22, 2024
