Curriculum-organized Reinforcement Learning for Robotic Dual-arm Assembly
Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review
Abstract
Modern manufacturing heavily relies on robotic systems, yet collaborative assembly executed by two or more
robot arms presents several challenges. Dealing with tight manufacturing tolerances demands a flexible and efficient
methodology, empowering robots to handle tasks requiring precision and fine motor skills. As an example of such a task,
we formulate a peg-in-hole task involving two Franka Emika Panda robots, for which we employ a hierarchical control
architecture. A reinforcement learning agent plans feedback-based trajectories for the robots, and these trajectories
are transmitted to low-level impedance controllers on each robot. To facilitate the learning process, we structure the
training procedure as a reverse curriculum, incorporating domain randomization. Upon completion of the training, we evaluate
inferences from the controlled system within a simulated environment. Our analysis delves into the emerging process
characteristics resulting from the curriculum parameters. Episode length and minimum reward threshold imply process time
and its variance, which have been subject to great uncertainties in previous work.
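
The reverse curriculum with domain randomization mentioned in the abstract could look roughly like the following minimal Python sketch. It is not the authors' implementation: the class `ReverseCurriculum`, its fields (`max_offset`, `step`, `reward_threshold`), the friction range, and all numeric values are hypothetical and chosen only for illustration. It assumes the common reading of a reverse curriculum, in which episodes first start close to the goal state (peg nearly inserted) and the start distance grows once a minimum reward threshold is met.

```python
import random
from dataclasses import dataclass


@dataclass
class ReverseCurriculum:
    """Hypothetical scheduler: start near the goal, then move the start pose away."""
    max_offset: float = 0.10       # assumed largest start distance from the inserted pose (m)
    step: float = 0.01             # assumed increment per promotion (m)
    reward_threshold: float = 0.8  # assumed minimum mean reward needed to advance
    offset: float = 0.0            # current difficulty; 0.0 means the peg starts inserted

    def update(self, mean_reward: float) -> float:
        """Increase difficulty only when the agent clears the reward threshold."""
        if mean_reward >= self.reward_threshold:
            self.offset = min(self.max_offset, self.offset + self.step)
        return self.offset

    def sample_start(self, rng: random.Random) -> dict:
        """Draw a randomized episode start for the current difficulty."""
        return {
            # start distance is sampled up to the current curriculum offset
            "peg_offset": rng.uniform(0.0, self.offset),
            # domain randomization: perturb one physical parameter (assumed range)
            "friction": rng.uniform(0.5, 1.5),
        }


if __name__ == "__main__":
    rng = random.Random(0)
    curriculum = ReverseCurriculum()
    # Illustrative mean rewards from three training rounds (made-up values).
    for mean_reward in [0.9, 0.5, 0.85]:
        curriculum.update(mean_reward)
    print(curriculum.offset, curriculum.sample_start(rng))
```

In this sketch the reward threshold gates how quickly the start distance grows, which is one way curriculum parameters can shape the resulting process characteristics.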
Details
Original language | English |
---|---|
Title of host publication | Proceedings of the 4th IFSA Winter Conference on Automation, Robotics and Communications for Industry 4.0/5.0 (ARCI) |
Editors | Sergey Y. Yurish |
Publisher | IFSA Publishing, S. L. |
Pages | 8-14 |
ISBN (electronic) | 978-84-09-58219-8 |
Publication status | Published - 12 Feb 2024 |
Peer-reviewed | Yes |