Gym environment for building simulation and control using reinforcement learning

zephframe/sinergym

 
 

Sinergym



Welcome to Sinergym!



The goal of this project is to create an environment that follows the Gymnasium interface and wraps simulation engines for building control using deep reinforcement learning.

Please help us improve by reporting your questions and issues here. It only takes two clicks using our issue templates (questions, bugs, improvements, etc.). More detailed information on how to report issues is available here. If you are thinking about contributing to Sinergym, take a look at CONTRIBUTING.md.

The main features of Sinergym are the following:

  • Support for different simulation engines. Communication between Python and EnergyPlus is established using the BCVTB middleware. Since this middleware can interact with several simulation engines, more of them (e.g. OpenModelica) could be added to the backend while maintaining the Gymnasium API.

  • Benchmark environments. Similarly to the Atari or MuJoCo environments for the RL community, we are designing a set of environments for benchmarking and testing deep RL algorithms. These environments may include different buildings, weather conditions, action/observation spaces, reward functions, etc.

  • Customizable environments. We aim to provide a package that makes it easy to modify experimental settings. Users can create their own environments by defining the building model, weather, reward, observation/action spaces and variables, environment name, etc. They can also start from one of the pre-configured environments available in Sinergym and change only some aspects of it (for example, the weather) instead of writing a complete environment definition. Some parameters directly associated with the simulator can be set as extra configuration as well, such as people occupancy, time steps per simulation hour, run period, etc.

  • Customizable components. Sinergym is easily extensible by third parties. Following the structure of the implemented classes, new custom components such as reward functions, wrappers, controllers, etc. can be created for new environments.

  • Automatic building model adaptation to user changes. Building models (IDF) are adapted to the specification of each simulation. For example, the Designdays and Location components of the IDF file are adapted to the weather file (EPW) specified for the Sinergym simulator backend without any intervention by the user beyond the environment definition. When a simulation starts, Sinergym also generates the BCVTB external interface in the IDF model and the variables.cfg file; both depend on the action and observation spaces and the variables defined. In short, Sinergym automates the whole process of model adaptation so that users only have to define what they want for their environment.

  • Automatic external interface integration for actions. Sinergym provides functionality to obtain information about the environments, such as the zones or the schedulers available in the environment model. Using that information, which can be exported to an Excel file, users can find out which controllers are available in the building and then control them from an agent through an external interface. To do so, users write an action definition that indicates, in a specific format, which default controllers they want to replace, and Sinergym takes care of the relevant internal changes in the model.

  • Stable Baselines3 integration. Some functionalities, such as callbacks, have been customized by our team to make it easy to test these environments with deep reinforcement learning algorithms. Sinergym can also be used with any other DRL library that supports the Gymnasium interface.

  • Weights & Biases tracking and visualization. One of Sinergym's objectives is to automate and facilitate the training, reproducibility and comparison of agents in simulation-based building control problems, managing and monitoring the model lifecycle from training to deployment. WandB is a platform for the machine learning lifecycle that helps us with this. It lets us register experiment hyperparameters, visualize recorded data in real time, and store artifacts with experiment outputs and the best models obtained.

  • Google Cloud integration. If you have a Google Cloud account and want to use your infrastructure with Sinergym, we provide some details on how to do it.

  • Notebook examples. Sinergym provides code in notebook format that offers use cases to help users become familiar with the tool. The notebooks are updated along with the updates and improvements of the tool itself.

  • Documentation and testing. This project is accompanied by extensive documentation, unit tests and GitHub Actions workflows, which make Sinergym easier to understand and to develop.

  • Many more!
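
As an illustration of the custom components mentioned in the list above, the following is a minimal sketch of a reward function that trades off energy use against thermal comfort. It does not use Sinergym's own reward classes: the class name `LinearComfortEnergyReward`, its parameters, the comfort band and the scaling constants are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class LinearComfortEnergyReward:
    """Hypothetical reward: weighted sum of energy use and comfort violation."""
    comfort_range: tuple = (20.0, 24.0)  # assumed comfort band in °C
    energy_weight: float = 0.5           # trade-off in [0, 1]
    lambda_energy: float = 1e-4          # scales power (W) to a comparable magnitude
    lambda_temp: float = 1.0             # scales the temperature violation (°C)

    def __call__(self, power_w: float, zone_temp_c: float) -> float:
        low, high = self.comfort_range
        # Distance to the comfort band; zero when the temperature is inside it.
        violation = max(low - zone_temp_c, 0.0, zone_temp_c - high)
        energy_term = self.lambda_energy * power_w
        comfort_term = self.lambda_temp * violation
        # Both terms are penalties, so the reward is never positive.
        return -(self.energy_weight * energy_term
                 + (1.0 - self.energy_weight) * comfort_term)


reward_fn = LinearComfortEnergyReward()
print(reward_fn(power_w=5000.0, zone_temp_c=22.0))  # inside the comfort band
print(reward_fn(power_w=5000.0, zone_temp_c=26.0))  # 2 °C above the band
```

Keeping the reward as a small callable object makes the weights explicit and easy to vary between experiments; how such a component is plugged into an actual environment is described in the documentation.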

This is a project in active development. Stay tuned for upcoming releases.



List of available environments

If you would like to see a complete and updated list of our available environments, please visit our list in the official Sinergym documentation.
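
For a quick look from Python, the registered environment IDs can also be filtered by name. The sketch below assumes that importing sinergym registers its environments with Gymnasium and that their IDs start with `Eplus`, as the ID used in the usage example suggests; `sinergym_env_ids` is a hypothetical helper, not part of Sinergym. If the packages are not installed, it falls back to a small hard-coded list so that it still runs.

```python
def sinergym_env_ids(registered_ids, prefix="Eplus"):
    """Return the sorted IDs that start with the given prefix."""
    return sorted(env_id for env_id in registered_ids
                  if env_id.startswith(prefix))


try:
    import gymnasium as gym
    import sinergym  # noqa: F401  (assumed to register the environments on import)
    ids = sinergym_env_ids(gym.envs.registry.keys())
except ImportError:
    # Fallback so the sketch still runs without the packages installed.
    ids = sinergym_env_ids(["CartPole-v1",
                            "Eplus-datacenter-mixed-continuous-stochastic-v1"])

print(ids)
```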

Installation

Please visit INSTALL.md for more information about installing Sinergym.

Usage example

If you used our Dockerfile during installation, the try_env.py file should be in your workspace as soon as you enter the container. If you installed everything directly on your local machine, place it inside the cloned repository. Either way, we assume that you have a terminal with the appropriate Python version and a working Sinergym installation.

Sinergym uses the standard Gymnasium API, so a basic loop looks like this:

import gymnasium as gym
import sinergym  # needed so that the Eplus-* environments are available

# Create the environment
env = gym.make('Eplus-datacenter-mixed-continuous-stochastic-v1')
# Initialize the episode
obs, info = env.reset()
terminated = truncated = False
R = 0.0
# Stop on either end-of-episode signal defined by the Gymnasium API
while not (terminated or truncated):
    a = env.action_space.sample()  # random action selection
    obs, reward, terminated, truncated, info = env.step(a)  # get new observation and reward
    R += reward
print('Total reward for the episode: %.4f' % R)
env.close()

Notice that a folder will be created in the working directory after creating the environment. It will contain the EnergyPlus outputs produced during the simulation.
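
To experiment with the episode loop without running EnergyPlus, the same pattern can be wrapped in a function and driven by a rule-based policy instead of random actions. The sketch below is only illustrative: `StubThermalEnv`, `run_episode` and `thermostat` are hypothetical names, and the stub merely mimics the `reset()`/`step()` signatures of the Gymnasium API with a toy temperature model.

```python
import random


class StubThermalEnv:
    """Hypothetical stand-in that mimics reset()/step() of the Gymnasium API."""

    def __init__(self, max_steps=24, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.steps = 0
        self.temp = 22.0

    def reset(self):
        self.steps = 0
        self.temp = 22.0
        return self.temp, {}

    def step(self, action):
        # The action is a heating/cooling increment in °C; noise stands in for weather.
        self.temp += action + self.rng.uniform(-0.5, 0.5)
        self.steps += 1
        reward = -abs(self.temp - 22.0)           # penalize distance to 22 °C
        terminated = False                        # this toy has no terminal state
        truncated = self.steps >= self.max_steps  # the time limit ends the episode
        return self.temp, reward, terminated, truncated, {}

    def close(self):
        pass


def run_episode(env, policy):
    """Run one episode and return (total_reward, number_of_steps)."""
    obs, info = env.reset()
    total, steps = 0.0, 0
    terminated = truncated = False
    while not (terminated or truncated):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        steps += 1
    return total, steps


def thermostat(temp):
    """Rule-based policy: push the temperature back towards 22 °C."""
    return max(-1.0, min(1.0, 22.0 - temp))


env = StubThermalEnv()
total, steps = run_episode(env, thermostat)
env.close()
print(f"Total reward over {steps} steps: {total:.4f}")
```

Because `run_episode` only relies on `reset()` and `step()`, the same function should work with a real environment created through `gym.make`, provided the policy accepts that environment's observations and returns valid actions.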

📝 For more examples and details, please visit our usage examples documentation section.

Google Cloud Platform support

For more information about this functionality, please visit our documentation here.

Projects using Sinergym

The following are some of the projects benefiting from the advantages of Sinergym:

📝 If you want to appear in this list, do not hesitate to send us a PR and include the following badge in your repository:

Citing Sinergym

If you use Sinergym in your work, please cite our paper:

@inproceedings{2021sinergym,
    title={Sinergym: A Building Simulation and Control Framework for Training Reinforcement Learning Agents}, 
    author={Jiménez-Raboso, Javier and Campoy-Nieves, Alejandro and Manjavacas-Lucas, Antonio and Gómez-Romero, Juan and Molina-Solana, Miguel},
    year={2021},
    isbn = {9781450391146},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3486611.3488729},
    doi = {10.1145/3486611.3488729},
    booktitle = {Proceedings of the 8th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation},
    pages = {319--323},
    numpages = {5},
}
