Journal of Bionic Engineering, 2024, Vol. 21, Issue 4: 1720-1732. doi: 10.1007/s42235-024-00517-3

Economical Quadrupedal Multi-Gait Locomotion via Gait-Heuristic Reinforcement Learning

Lang Wei¹; Jinzhou Zou¹; Xi Yu¹; Liangyu Liu¹; Jianbin Liao¹; Wei Wang¹; Tong Zhang²

  1. School of Power and Mechanical Engineering, Wuhan University, Luojiashan, Wuhan 430072, Hubei, China
  2. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Luojiashan, Wuhan 430072, Hubei, China
  • Online: 2024-07-15  Published: 2024-09-01
  • Contact: Wei Wang  E-mail: whuww@whu.edu.cn

Abstract: To balance reaching a desired velocity against minimizing energy consumption, legged animals adopt an appropriate gait pattern and transition seamlessly to another when needed. This ability makes them more versatile and efficient when traversing natural terrain, and better suited to long treks. It is equally valuable for quadruped robots to master this ability. To this end, we propose an effective gait-heuristic reinforcement learning framework in which multi-gait locomotion and smooth gait transitions emerge automatically to reach target velocities while minimizing energy consumption. We incorporate a novel trajectory generator with explicit gait information into the deep reinforcement learning framework as a memory mechanism. This allows the quadruped robot to adopt reliable and distinct gait patterns while benefiting from the warm start provided by the trajectory generator. Furthermore, we investigate the key factors contributing to the emergence of multi-gait locomotion. We tested our framework on a closed-chain quadruped robot and demonstrated that the robot can change its gait pattern, such as standing, walking, and trotting, to adopt the most energy-efficient gait at a given speed. Lastly, we deploy the learned controller on a quadruped robot and demonstrate the energy efficiency and robustness of our method.

Key words: Legged robots · Deep reinforcement learning · Central pattern generator · Quadrupedal gait
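
The two ideas the abstract relies on, explicit gait information and choosing the most energy-efficient gait at a given speed, can be illustrated with a minimal Python sketch. This is not the authors' trajectory generator or reward. The gait table (phase offsets and duty factors), the function names, and the use of the dimensionless cost of transport P / (m g v) as the efficiency measure are all assumptions chosen for illustration, following common textbook conventions.

```python
# Hypothetical gait table (not taken from the paper): per-leg phase offsets,
# expressed as a fraction of one gait cycle, and the duty factor (fraction of
# the cycle a leg spends in stance).
# Leg order: front-left, front-right, rear-left, rear-right.
GAITS = {
    "stand": {"offsets": (0.0, 0.0, 0.0, 0.0), "duty": 1.0},
    "walk": {"offsets": (0.0, 0.5, 0.75, 0.25), "duty": 0.75},
    "trot": {"offsets": (0.0, 0.5, 0.5, 0.0), "duty": 0.5},
}


def leg_phases(gait, t, period=1.0):
    """Return each leg's phase in [0, 1) at time t for the named gait."""
    offsets = GAITS[gait]["offsets"]
    return tuple((t / period + offset) % 1.0 for offset in offsets)


def in_stance(gait, t, period=1.0):
    """Return a per-leg tuple of booleans: True while the leg is in stance."""
    duty = GAITS[gait]["duty"]
    return tuple(phase < duty for phase in leg_phases(gait, t, period))


def cost_of_transport(power_w, mass_kg, speed_mps, g=9.81):
    """Dimensionless cost of transport: mean power / (mass * g * speed)."""
    if speed_mps <= 0.0:
        raise ValueError("cost of transport is undefined at zero speed")
    return power_w / (mass_kg * g * speed_mps)


def most_economical_gait(power_by_gait, mass_kg, speed_mps):
    """Pick the gait with the lowest cost of transport at one speed.

    power_by_gait maps a gait name to the mean power (in watts) measured
    while moving at speed_mps with that gait.
    """
    return min(
        power_by_gait,
        key=lambda gait: cost_of_transport(power_by_gait[gait], mass_kg, speed_mps),
    )
```

In this sketch a trot pairs the diagonal legs, so `in_stance("trot", 0.0)` reports the front-left and rear-right legs in stance, and the pattern flips half a period later. In the paper the gait choice emerges from learning; the explicit `most_economical_gait` lookup here only makes the selection criterion concrete.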
