Reinforcement Learning from Human Feedback

Apr 27, 2016


Deploying versatile robot systems in diverse environments requires intuitive ways for humans to teach them new skills flexibly. In this work, we investigate different types of user feedback for teaching a real robot a new movement skill.

Figure: Setup for creating the data set

We compare two feedback types, each paired with its own optimization algorithm, for teaching the robot the cup-and-ball game of skill: star ratings on an absolute scale for single roll-outs, and preference-based feedback on pairwise comparisons. The algorithms are a variation of the covariance matrix adaptation evolution strategy (CMA-ES) and random optimization. In a user study, we examined how the feedback type influences both the user experience of interacting with the different interfaces and the performance of the learning systems.
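To make the preference-based setting concrete, the following is a minimal sketch of random optimization driven only by pairwise comparisons. It is not the implementation used in the study. The function names, the three-dimensional parameter vector, the step size, and the simulated "user" (who simply prefers the roll-out closer to a hidden target) are all assumptions made for illustration.

```python
import random


def sq_dist(x, target):
    """Squared Euclidean distance between two parameter vectors."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def simulated_preference(a, b, target):
    """Hypothetical stand-in for a human: prefer the candidate closer to
    a hidden target. Ties go to the first argument (the incumbent)."""
    return a if sq_dist(a, target) <= sq_dist(b, target) else b


def preference_random_optimization(prefer, start, step=0.5,
                                   iterations=200, seed=0):
    """Keep an incumbent, perturb it with Gaussian noise, and keep
    whichever of the pair the comparison function prefers."""
    rng = random.Random(seed)
    best = list(start)
    for _ in range(iterations):
        candidate = [x + rng.gauss(0.0, step) for x in best]
        best = prefer(best, candidate)
    return best


# Assumed toy problem: the "ideal" movement parameters are unknown to
# the optimizer and only accessible through pairwise preferences.
target = [1.0, -2.0, 0.5]
start = [0.0, 0.0, 0.0]
result = preference_random_optimization(
    lambda a, b: simulated_preference(a, b, target), start
)
```

Because the incumbent is replaced only when the comparison favors the new candidate, the result is never farther from the hidden target than the starting point. A rating-based system would instead need an absolute score for every single roll-out.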

While there is no significant difference in subjective user experience between the conditions, there is a significant difference in learning performance. The preference-based system learned the task more quickly, but this did not influence the users’ evaluation of it. In a follow-up study, we confirmed that the difference in learning performance can indeed be attributed to the human users’ performance.

HRI SAR

Data Scientist for ML and AI

My research interests include social robotics, personalization and adaptation in HCI/HRI, rehabilitation robotics, cognitive computing, and social psychology/neuroscience.
