An improvement over OpenAI gym Car-Racing-v0
This repo has improvements on the complexity for CarRacing-v0, which are focus torwards making Car-Racing
env solvable, it s also intended to make the env complex enought to make it ideal to try new more complex tasks and RL problems.
Clone this repo and pip install -e ."[Box2D]"
it is likely that you will need to install pyglet 1.3.2, use pip install pyglet==v1.3.2
for that.
The new environment has some improvements regarding the complexity of the map, the action space and the observation space, let's start with the most useful changes
The action space can be:
- Continous (the original one), with min of
[-1,0,0]
and max of[+1,+1,+1]
. - Continous allowing car going backwards (reverse), from
[-1,-1,0]
to[+1,+1,+1]
. - Discrete, the action space is
[TURN_LEFT, TURN_RIGHT, ACCELERATE, BREAK, DO_NOTHING]
The observation space, i.e. the state (the raw pixels of the game), can be:
- RGB (the original observation space) size
96x96x3
- Grayscale, size
96x96
- RGB with several contiguous frames, size
NUM_FRAMESx96x96x3
orNUM_FRAMESx96x96
for grayscale - Include or not the information panel containing representations of the speed, direction, etc., size the same size for each case.
some screenshots of some of the possible states (in real size i.e. 96x96
):
The map of the track now is configurable but still completely random, now the map can be configurable with
- Different type of road intersections and junctions, value in
{1,2}
,1
for normal original track (faster to compute and create but without any interesection).2
for track much more complex with different intersections such as t-junctions and x-intersections. Take into account that when the track is more complex, it is also slower. some images:
- Different numbers of lanes, value in
{1,2}
in bot case the total width of track will be the same in general.1
for only one lane,2
to incorporate two different lanes in order to keep track of which lane the car is and use that in the reward function, e.g. to train an agent to keep lane, or to take over, etc. Some screenshots:
- Obstacles added, the value is the number of obstacles in the track starting from
0
(i.e. no obstacles). Steping on an obstacle is-10
in the reward function. The value-10
is hard-coded, but it is easy to modify in the file. The size and position of the obstacle are random, probably in next versions they will be configurable
-
Meta information about the track. It is easy to get the positions where t-junctions and intersections (a.k.a. interesting points) are, or get positions close to those interesting points and facing them, also possible to get positions or close positions to obstacles. It is also possible to get possitions close to interesting points controlling how close one wants it to be to the interesting point, e.g.
0
would mean the position is0
tiles away of the interesting point, i.e. the position is the interesting point. In general it is possible to know- Where are the intersections
- Where are the T-junctions
- Where are the X-junctions
- The angle of the curve
- Positions near intersections, curves, or obstacles
-
Easy to modify the reward function
-
On initialization of the environment, a custom reward function can be passed
-
Configure how long the car can be outside track before restarting env
-
Now the reaward function takes into account:
- Obstacles
- The lane in which the car is.
-
Control where to position the car at any time of time. There are functions to easily place the car in random positions close to obstacles, inside or outside the track and in next versions the angle in which the car is facing (towards the track or not, or how much towards the car) will be configurable as well.
-
Control the speed of the car when placing it in a position. This function is intended to traing the agent in interesting positions and different speed to mimic when the car goes out of track or losses control.
-
Function to apply (random) force to the car, to simulate noise or malfunctioning, To be implemented in coming versions.
-
Car can go in reverse, break is
SPACE
-
During the game, calling
python gym/envs/box2d/car_racing.py
use the following keys to do certain stuff -
D
: To enter python debug mode -
Z
: Zoom in or out -
R
: Get new trackreset
-
Q
: quit, close game
and in general de controls are (while playing)
key | Action |
---|---|
D |
Set trace (debug) |
R |
Reset the map |
Z |
Zoom in/out |
G |
Shows the intersection groups |
I |
Shows intersection points |
X |
Shows X junctions |
E |
Shows end of track |
S |
Shows start of track |
T |
Take screenshot |
Q |
Exit |
key_press_function
andkey_release_fn
: It is a initialization parameter, it is very useful to, for example change a variable during executing. The most useful case is start rendering the environemt while training, to have visual feedback of how the training is going
This list is far from comprehensive. This are some improvements of the environment, this allows configures each experiments depending on the objective
- Set the car in certain position in the map:
place_agent (position)
: this function will place the car inposition
- Set intial speed of agent:
set_speed(speed)
: This function will set the initial of the car - Ger random position in the track:
get_rnd_point_in_track(border)
: returns a random point in the track with the angle equal to the tile of the track, the x position can be randomly in the x (relative) axis of the tile, border=True make sure the x position is enough to make the car fit in the track, otherwise the point can be in the extreme of the track and two wheels will be outside the track Returns: [beta, x, y]. - Get screenshot of the space in the current step:
screenshot(dest,namefile)
Name | params | Description |
---|---|---|
play |
env,discretize |
Opens a windows in which you can play this game |
set_speed |
speed |
Set the speed of the car |
place_agent |
position |
Set the agent in position |
screenshot |
dest, name |
Saves the current state/observations of the agent in dest with name name |
get_rnd_point_in_track |
border |
Gets a random position inside the track (touching or not the border) |
get_position_near_junction |
type_junction, tiles_before |
Returns a position tiles_before tiles before the type_junction junction |
get_position_outside |
distance |
Returns a position outside the track distance meters far from track |
set_press_fn |
fn | Will call this function everytime a key is press in the window of the game. Do not use it while using play |
set_release_fn |
fn | Will call this function everytime a key is release in the window of the game. Do not use it while using play |
-
Add controlls to deactivate reverse -
End episode aftermax_time_out
seconds out of track/without moving -
Implement black and white output: usegrayscale
opt in config -
Add small screenshot of what the car is seeing as a function: usescreenshot
func -
Add option to remove bottom panel from state, useshow_info_panel
in config -
Implement number of frames to return as observations (state), useframes_per_state
in config -
Hard Discretise action space -
Clip reward per episode, usemax/min_episode_reward
-
Implement the reward efficiently -
Detect change of line -
Set speed of car -
Set position of car -
Add small distance to d to joing tracks, bigger than x and greater than 50 -
Avoid change lanes after change lanes -
Do NOT compare first tile of main track with last tile ofself.track
-
Get rnd position in track -
Get correct intersections (all true positive) -
Differentiate intersections -
Add obstacles -
Get outside position (useget_position_outside(d)
) -
Fix sizes to use with cnn and different configs for states -
Get tile with certain conditions:-
With certain angle (e.g. >40º <- that is possible usinginfo['angle']
but it is not recommended, use get_rnd_tile_by_type_angle) -
In an T-junction -
In an X-junction -
With obstacle in front
-
-
Apply (random) force to the car (Not yet implemented)
-
Randomize the direction (following the track direction or backwards) in get positions functions -
Differentiate y from t junctions -
Add road lines graphically (although it is not tested with varying lanes) -
Fix joints of roads in some weird cases (most cases covered)
for more information about the environment see the original readme file of the original repo or the code, most functions have descriptions and explanaitions.