This repository has been archived by the owner on Apr 25, 2023. It is now read-only.

Would it make sense to restrict the action to what's possible? #23

Open
shamoons opened this issue Apr 4, 2019 · 2 comments

Comments

shamoons commented Apr 4, 2019

If the cart is already all the way at the right edge, it can't really move any further right. Would it make sense to disallow that action, either in the random case (by sampling again) or in the network case (by choosing the action with the next-highest Q-value that the network predicts)?
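To make the proposal concrete, here is a minimal sketch of masked epsilon-greedy selection. It is not taken from this repository: the function name `select_action` and the `valid_mask` argument are hypothetical, and it assumes the environment can report which actions are currently possible.

```python
import random


def select_action(q_values, valid_mask, epsilon, rng=random):
    """Epsilon-greedy selection restricted to actions marked valid.

    q_values:   predicted Q-value per action (hypothetical network output)
    valid_mask: one boolean per action, True if the action is allowed
    epsilon:    probability of exploring with a random valid action
    """
    valid = [a for a, ok in enumerate(valid_mask) if ok]
    if not valid:
        raise ValueError("no valid action available")
    if rng.random() < epsilon:
        # Random case: sample only among valid actions,
        # which is equivalent to resampling until a valid one comes up.
        return rng.choice(valid)
    # Network case: take the highest-valued action that is still valid.
    return max(valid, key=lambda a: q_values[a])
```

For example, with `q_values=[0.1, 0.9]` and the second action masked out, the greedy branch falls back to action 0.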

@daysofthunder98

The episode terminates if the pole tilts more than 15 degrees to either side, so that experience is recorded and (hopefully) the agent learns from it.
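As a rough illustration of how a recorded terminal experience feeds back into learning, here is a sketch of the standard one-step Q-learning target. The function name `td_target` and the default `gamma` are assumptions for illustration and may not match what this repository does.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target for a single recorded transition.

    When the episode ended on this step (done=True), there is no future
    return to bootstrap from, so the target is just the reward received.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

Because the terminal target carries no future reward, actions that lead to termination end up with lower Q-values than actions that keep the episode going.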

@WorksWellWithOthers

The action shouldn't be restricted since it's the goal of the agent to learn what action to take for the most reward.
