University of Twente Student Theses


State representation learning using robotic priors in continuous action spaces for mobile robot navigation

Bijman, A.L. (2020) State representation learning using robotic priors in continuous action spaces for mobile robot navigation.

PDF, 11MB
Abstract: The recent advance of reinforcement learning algorithms has shown that these algorithms are able to solve complex problems. One of these complex problems is mobile robot navigation, which can be highly complex and diverse. To navigate environments, mobile robots must often rely on generic sensors such as cameras or lidar sensors, which provide high-dimensional data. Using this high-dimensional data directly in end-to-end reinforcement learning can be challenging, as it typically requires large amounts of data to learn a task. This is especially problematic in the context of robotics, where acquiring such data can be expensive and time-consuming. State representation learning aims to map the high-dimensional sensor data to a low-dimensional state space in order to reduce training time and the amount of required data. Various methods of learning a low-dimensional state representation have been proposed in the literature. One method is to use prior knowledge to learn such a representation. This prior knowledge is encoded using loss functions, called robotic priors, which can be used to train an encoder network implemented as an artificial neural network.

This work focuses on mobile robot navigation, where a robot learns to navigate an environment. Previous work (24), (11) that used robotic priors to learn a state representation for mobile robot navigation relied on discrete actions. This work extends the approach to a continuous action setting. The work is done using the Gazebo simulator, in which confined environments are built. A differential drive mobile robot is used for the navigation task; it observes its environment with a camera and a 360-degree lidar sensor. The Gazebo simulator is coupled with Python code using ROS middleware. The state representation learning algorithm and the reinforcement learning algorithm are implemented in Python using TensorFlow. The reinforcement learning algorithm used is DDPG, as it can work with continuous actions and is sample efficient.

To learn a state representation using robotic priors in a continuous action setting, the robotic priors had to be adapted. This work presents two ways these robotic priors can be adapted to work with continuous actions. It is shown that these priors can be used to learn a state representation that allows the DDPG algorithm to navigate various environments successfully. Furthermore, the adapted priors are easier to implement than the priors introduced in the literature and did not require extensive tuning. It is also shown that the robotic priors can be used to learn a state representation across a range of simulation environments.

To extend the generality of the framework, state representation learning with robotic priors was extended to learn a recurrent state representation. Environments can be non-Markovian, i.e. not all observations can be uniquely mapped to the state of the environment. To make the problem Markovian, such that it can be solved using reinforcement learning, memory can be added. The work of (36) introduced recurrent state representation learning using robotic priors, but relied on ground truth data to train the encoder network. In this work it is shown that the previously used priors are sufficient to learn a recurrent state representation.
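As an illustration of the kind of loss functions the abstract refers to, the sketch below shows robotic-prior losses in the spirit of the literature (temporal coherence, proportionality, causality, repeatability), with the discrete "same action" condition replaced by a Gaussian action-similarity weight as one conceivable continuous-action adaptation. The tensor layout, the kernel weighting, and the parameter action_sigma are assumptions made here for illustration; they are not taken from the thesis and its actual adaptations may differ.

# Minimal sketch (TensorFlow 2.x): robotic-prior losses with an
# action-similarity weight for continuous actions. Illustrative only.
import tensorflow as tf

def robotic_prior_losses(s_t, s_next, a_t, r_t, pair_idx, action_sigma=0.5):
    """s_t, s_next: (B, state_dim) encoder outputs at t and t+1.
    a_t: (B, action_dim) continuous actions, r_t: (B,) rewards,
    pair_idx: (P, 2) indices of sampled time-step pairs."""
    ds = s_next - s_t  # state change at each time step

    # Temporal coherence: consecutive states should change slowly.
    temporal = tf.reduce_mean(tf.reduce_sum(ds ** 2, axis=1))

    i, j = pair_idx[:, 0], pair_idx[:, 1]
    ds_i, ds_j = tf.gather(ds, i), tf.gather(ds, j)
    s_i, s_j = tf.gather(s_t, i), tf.gather(s_t, j)
    a_i, a_j = tf.gather(a_t, i), tf.gather(a_t, j)
    r_i, r_j = tf.gather(r_t, i), tf.gather(r_t, j)

    # Continuous-action adaptation (assumed): replace the discrete
    # "same action" condition with a similarity weight w in [0, 1].
    w = tf.exp(-tf.reduce_sum((a_i - a_j) ** 2, axis=1)
               / (2.0 * action_sigma ** 2))

    # Proportionality: similar actions cause state changes of similar magnitude.
    prop = tf.reduce_mean(w * (tf.norm(ds_i, axis=1) - tf.norm(ds_j, axis=1)) ** 2)

    # Causality: similar actions with different rewards should start from distant states.
    state_sim = tf.exp(-tf.reduce_sum((s_i - s_j) ** 2, axis=1))
    caus = tf.reduce_mean(w * tf.cast(tf.abs(r_i - r_j) > 1e-6, tf.float32) * state_sim)

    # Repeatability: similar actions from similar states cause similar state changes.
    rep = tf.reduce_mean(w * state_sim * tf.reduce_sum((ds_i - ds_j) ** 2, axis=1))

    return temporal + prop + caus + rep

Such a combined loss could be minimized with respect to the encoder network's parameters on batches of recorded (observation, action, reward) transitions; the encoder's output would then serve as the low-dimensional state fed to DDPG.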
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 53 electrotechnology
Programme: Electrical Engineering MSc (60353)
Link to this item: https://purl.utwente.nl/essays/89172