diff --git a/_publications/2020data.md b/_publications/2020data.md
new file mode 100644
index 0000000..1f9ebca
--- /dev/null
+++ b/_publications/2020data.md
@@ -0,0 +1,16 @@
+---
+title: "Data Generation Method for Learning a Low-dimensional Safe Region in Safe Reinforcement Learning"
+collection: publications
+permalink: /publication/2020data
+authorlist: 'Zhehua Zhou, Ozgur S Oguz, Yi Ren, Marion Leibold, Martin Buss'
+excerpt: 'In this work, we investigate how different training data affect the safe reinforcement learning approach based on data-driven model order reduction.'
+date: 2021-09-10
+venue: 'ArXiv Preprint:2109.05077'
+# slidesurl: 'http://academicpages.github.io/files/slides1.pdf'
+paperurl: 'https://arxiv.org/abs/2109.05077'
+# citation: 'Your Name, You. (2009). "Paper Title Number 1." Journal 1. 1(1).'
+pubtype: 'preprint'
+highlight: 'no'
+---
+
+Safe reinforcement learning aims to learn a control policy while ensuring that neither the system nor the environment is damaged during the learning process. To implement safe reinforcement learning on highly nonlinear and high-dimensional dynamical systems, one possible approach is to find a low-dimensional safe region via data-driven feature extraction methods, which then provides safety estimates to the learning algorithm. As the reliability of the learned safety estimates is data-dependent, we investigate in this work how different training data affect the safe reinforcement learning approach. By balancing the learning performance against the risk of being unsafe, we propose a data generation method that combines two sampling methods to generate representative training data. The performance of the method is demonstrated with a three-link inverted pendulum example.
\ No newline at end of file