
Custom reward RL #15

Closed
lambdavi opened this issue Feb 28, 2024 · 4 comments

Comments

@lambdavi

Hello,
Great work on the simulator! However, I am encountering a problem when using an environment with a custom reward function.
I installed loco_mujoco through pip.

```python
from stable_baselines3 import PPO, DDPG, SAC
from stable_baselines3.common.env_util import make_vec_env
import numpy as np
from loco_mujoco import LocoEnv
import gymnasium as gym
import torch

# define whatever reward function you want
def my_reward_function(state, action, next_state):
    return -np.mean(action)     # here we just return the negative mean of the action

def make_env():
    return gym.make("LocoMujoco", env_name="UnitreeA1.simple", reward_type="custom",
                    reward_params=dict(reward_callback=my_reward_function))
```

Following your documentation, I encounter this error:

```
TypeError: loco_mujoco.environments.quadrupeds.unitreeA1.UnitreeA1() got multiple values for keyword argument 'reward_type' was raised from the environment creator for LocoMujoco with kwargs ({'env_name': 'UnitreeA1.simple', 'reward_type': 'custom', 'reward_params': {'reward_callback': <function my_reward_function at 0x104e27d90>}})
```
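For what it's worth, this class of `TypeError` is Python's ordinary duplicate-keyword error: it arises whenever a factory hard-codes a keyword argument that the caller also passes through `**kwargs`. A minimal reproduction, completely independent of loco_mujoco (`DemoEnv` and `make` are made-up names for illustration):

```python
class DemoEnv:
    def __init__(self, reward_type="default"):
        self.reward_type = reward_type

def make(env_cls, **kwargs):
    # the factory hard-codes reward_type, so a user-supplied
    # reward_type in kwargs collides with it
    return env_cls(reward_type="default", **kwargs)

try:
    make(DemoEnv, reward_type="custom")
except TypeError as e:
    print(e)  # ... got multiple values for keyword argument 'reward_type'
```

So presumably somewhere in the env-creation path, `reward_type` is being set internally before the user kwargs are forwarded.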

It looks like I can't override the reward function.
Do you have any suggestions, or have you encountered this problem as well and know of an easy fix?
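As a side note, the callback itself works fine outside the simulator, so the problem is purely in how the kwargs are forwarded. A quick sanity check in isolation (the shapes below are placeholders, not the real UnitreeA1 observation layout):

```python
import numpy as np

def my_reward_function(state, action, next_state):
    return -np.mean(action)

state = np.zeros(3)          # placeholder state
next_state = np.zeros(3)     # placeholder next state
action = np.array([0.5, -0.25, 0.25])

r = my_reward_function(state, action, next_state)
print(r)  # negative mean of the action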

@robfiras (Owner)

Hi @lambdavi,
there is a known issue with the Unitree A1 environment not supporting custom reward functions yet.
I will add this feature today and push it to the master branch. In the meantime, I would ask you to switch to an editable installation until the next release, which is going to happen soon! Sorry for the inconvenience, and thanks for reporting this!
I will notify you once it is added.

@lambdavi (Author)

You probably know this already, but the same behaviour extends to other envs, such as the Unitree H1.
I will switch to the editable installation. I am going to take a closer look at the simulation code; if you need help developing some non-critical feature, just let me know.

Thanks for the prompt reply and help!

@robfiras (Owner)

Yeah, with the latest release there was a problem with the custom rewards for the other robots as well. That has already been fixed; it is just not released yet. The only one missing is the Unitree A1, which requires a few more changes.
Thanks for being willing to help out!

@robfiras (Owner)

Sorry for the delay, this has been fixed now!
