Research Article | Open Access

CARL: controllable agent with reinforcement learning for quadruped locomotion

Published: 12 August 2020

Abstract

Motion synthesis in a dynamic environment has been a long-standing problem for character animation. Methods using motion capture data tend to scale poorly in complex environments because of their heavy capture and labeling requirements. Physics-based controllers are effective in this regard, albeit less controllable. In this paper, we present CARL, a quadruped agent that can be controlled with high-level directives and react naturally to dynamic environments. Starting with an agent that can imitate individual animation clips, we use Generative Adversarial Networks to adapt high-level controls, such as speed and heading, to action distributions that correspond to the original animations. Further fine-tuning through deep reinforcement learning enables the agent to recover from unseen external perturbations while producing smooth transitions. It then becomes straightforward to create autonomous agents in dynamic environments by adding navigation modules over the entire process. We evaluate our approach by measuring the agent's ability to follow user control and provide a visual analysis of the generated motion to show its effectiveness.
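The abstract sketches a three-stage pipeline: (1) imitation of individual animation clips, (2) a GAN-trained mapping from high-level controls such as speed and heading onto the learned action distributions, and (3) DRL fine-tuning of the composed controller. As a rough orientation only, a minimal Python sketch of that control flow follows; the class names, the stub environment, and all signatures are hypothetical placeholders, not the authors' implementation.

    # Illustrative, runnable sketch of the three-stage pipeline described
    # in the abstract. All names here (StubEnv, ImitationPolicy,
    # ControlAdapter, fine_tune) are hypothetical scaffolding; the paper's
    # actual networks, losses, and training procedures are not reproduced.

    import random


    class StubEnv:
        """Placeholder physics environment (stands in for a real simulator)."""

        def reset(self):
            self.state = [0.0] * 8  # dummy proprioceptive state
            return self.state

        def step(self, action):
            # Trivially integrate the action into the state.
            self.state = [s + 0.01 * a for s, a in zip(self.state, action)]
            reward = -sum(a * a for a in action)  # dummy smoothness reward
            done = random.random() < 0.01
            return self.state, reward, done


    class ImitationPolicy:
        """Stage 1: low-level policy trained to imitate individual
        animation clips (e.g. via DeepMimic-style DRL)."""

        def act(self, state, latent):
            # A real policy would map (state, latent) to joint targets.
            return [random.uniform(-1.0, 1.0) for _ in range(12)]


    class ControlAdapter:
        """Stage 2: GAN-trained mapping from high-level directives
        (speed, heading) to latents whose induced actions stay close to
        the action distribution of the original animations."""

        def encode(self, speed, heading):
            return (speed, heading)  # stub latent


    def fine_tune(policy, adapter, env, steps=1000):
        """Stage 3: DRL fine-tuning of the composed controller so it
        recovers from perturbations while transitioning smoothly."""
        state = env.reset()
        for _ in range(steps):
            latent = adapter.encode(speed=1.5, heading=0.0)
            action = policy.act(state, latent)
            state, reward, done = env.step(action)
            if done:
                state = env.reset()


    if __name__ == "__main__":
        fine_tune(ImitationPolicy(), ControlAdapter(), StubEnv())

In the real system, each stub would be a neural network trained with its own objective: an imitation reward for the low-level policy, an adversarial loss for the control adapter, and a task reward for the fine-tuning stage.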

Supplemental Material

• Presentation video (MP4 file)
• Transcript for: Presentation video




      Published In

      cover image ACM Transactions on Graphics
      ACM Transactions on Graphics  Volume 39, Issue 4
      August 2020
      1732 pages
      ISSN:0730-0301
      EISSN:1557-7368
      DOI:10.1145/3386569
      Issue’s Table of Contents
      This work is licensed under a Creative Commons Attribution-NonCommercial International 4.0 License.

      Publisher

Association for Computing Machinery, New York, NY, United States

      Publication History

      Published: 12 August 2020
      Published in TOG Volume 39, Issue 4


      Author Tags

      1. deep reinforcement learning (DRL)
      2. generative adversarial network (GAN)
      3. locomotion
      4. motion synthesis
      5. quadruped



Cited By

• (2024) A Literature Survey on Quadruped AI Assistant: Integrating Image Processing and Natural Language Processing for Emotional Intelligence. International Journal of Advanced Research in Science, Communication and Technology, 70-82. DOI: 10.48175/IJARSCT-15313. Online publication date: 5-Feb-2024.
• (2024) VMP: Versatile Motion Priors for Robustly Tracking Motion on Physical Characters. Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 1-11. DOI: 10.1111/cgf.15175. Online publication date: 21-Aug-2024.
• (2024) ADAPT: AI-Driven Artefact Purging Technique for IMU Based Motion Capture. Computer Graphics Forum. DOI: 10.1111/cgf.15172. Online publication date: 17-Oct-2024.
• (2024) A Hierarchical Framework for Quadruped Omnidirectional Locomotion Based on Reinforcement Learning. IEEE Transactions on Automation Science and Engineering 21(4), 5367-5378. DOI: 10.1109/TASE.2023.3310945. Online publication date: Oct-2024.
• (2024) Mastering broom-like tools for object transportation animation using deep reinforcement learning. Computer Animation and Virtual Worlds 35(3). DOI: 10.1002/cav.2255. Online publication date: 14-Jun-2024.
• (2023) Development of a Real-Time Quadruped Animal Character Rig System. Journal of Digital Contents Society 24(12), 2971-2980. DOI: 10.9728/dcs.2023.24.12.2971. Online publication date: 31-Dec-2023.
• (2023) FastMimic: Model-Based Motion Imitation for Agile, Diverse and Generalizable Quadrupedal Locomotion. Robotics 12(3), 90. DOI: 10.3390/robotics12030090. Online publication date: 20-Jun-2023.
• (2023) Bidirectional GaitNet: A Bidirectional Prediction Model of Human Gait and Anatomical Conditions. ACM SIGGRAPH 2023 Conference Proceedings, 1-9. DOI: 10.1145/3588432.3591492. Online publication date: 23-Jul-2023.
• (2023) Solving Challenging Control Problems via Learning-based Motion Planning and Imitation. 2023 20th International Conference on Ubiquitous Robots (UR), 267-274. DOI: 10.1109/UR57808.2023.10202250. Online publication date: 25-Jun-2023.
• (2023) Expanding Versatility of Agile Locomotion through Policy Transitions Using Latent State Representation. 2023 IEEE International Conference on Robotics and Automation (ICRA), 5134-5140. DOI: 10.1109/ICRA48891.2023.10160776. Online publication date: 29-May-2023.
• Show More Cited By
