On-line Access: 2023-06-22
Received: 2023-03-19
Revision Accepted: 2023-06-12
Yecheng SHAO, Yongbin JIN, Zhilong HUANG, Hongtao WANG, Wei YANG. A learning-based control pipeline for generic motor skills for quadruped robots[J]. Journal of Zhejiang University Science A, in press. https://doi.org/10.1631/jzus.A2300128
@article{jzusA2300128,
title="A learning-based control pipeline for generic motor skills for quadruped robots",
author="Yecheng SHAO and Yongbin JIN and Zhilong HUANG and Hongtao WANG and Wei YANG",
journal="Journal of Zhejiang University Science A",
year="in press",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/jzus.A2300128"
}
%0 Journal Article
%T A learning-based control pipeline for generic motor skills for quadruped robots
%A Yecheng SHAO
%A Yongbin JIN
%A Zhilong HUANG
%A Hongtao WANG
%A Wei YANG
%J Journal of Zhejiang University SCIENCE A
%P
%@ 1673-565X
%D in press
%I Zhejiang University Press & Springer
%R 10.1631/jzus.A2300128
TY  - JOUR
T1  - A learning-based control pipeline for generic motor skills for quadruped robots
A1  - Yecheng SHAO
A1  - Yongbin JIN
A1  - Zhilong HUANG
A1  - Hongtao WANG
A1  - Wei YANG
JO  - Journal of Zhejiang University Science A
SP  -
EP  -
SN  - 1673-565X
Y1  - in press
PB  - Zhejiang University Press & Springer
DO  - 10.1631/jzus.A2300128
ER  -
Abstract: Performing diverse motor skills with a universal controller has been a longstanding challenge for legged robots. While motion imitation-based reinforcement learning (RL) has shown remarkable performance in reproducing designed motor skills, the trained controller is only suitable for one specific type of motion. Motion synthesis has been well developed to generate a variety of different motions for character animation, but those motions only contain kinematic information and cannot be used for control. In this work, we introduce a control pipeline combining motion synthesis and motion imitation-based RL for generic motor skills. We design an animation state machine to synthesize motion from various sources and feed the generated kinematic reference trajectory to the RL controller as part of the input. With the proposed method, we show that a single policy is able to learn various motor skills simultaneously. Further, we observe that the policy can uncover the correlations underlying the reference motions to improve control performance. We analyze this ability based on the predictability of the reference trajectory, and the quantified measurements can be used to optimize the design of the controller. To demonstrate the effectiveness of our method, we deploy the trained policy on hardware, where a single control policy enables the quadruped robot to perform various learned skills, including automatic gait transitions, high kicks, and forward jumps.