Automatic Controller Design via Gaussian Processes

Traditional optimal control schemes, such as LQR and MPC, rely on an accurate model of the underlying system. Modeling accuracy therefore directly affects controller success and performance. However, it is often hard to capture the global dynamics with high accuracy. To overcome this problem, we develop an active learning framework based on Bayesian optimization that automates controller design for a specific task, even in the absence of a dynamics model, using the performance observed in experiments on the physical system. We aim for information-efficient approaches, in which only a few experiments are needed to obtain improved performance. For this purpose, we learn a local system model that achieves the best performance for a given task, as opposed to learning a global model or learning a control policy directly. The reason is that a good local model, combined with well-analyzed optimal control schemes, can be used to design a controller much more efficiently [1], [2]. A flow diagram of our framework is shown in Figure 1 below:

Figure 1: An active learning framework for automatically designing a task-specific controller.

We start with a prior belief about the local (linearized) dynamics, which can be chosen randomly if no prior information about the system is available. These local dynamics, together with the performance specifications, are used to design a controller. The performance of the controller is evaluated by running it in closed-loop operation on the actual (unknown) physical plant. The Bayesian optimization (BO) algorithm uses this information to iteratively update the local dynamics model and improve the performance.
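The loop described above can be illustrated with a minimal Python sketch. It is not the implementation used in this project: the simulated plant (`A_TRUE`, `B_TRUE`), the two-parameter model in `model_from_params`, the quadratic cost weights, the RBF-kernel Gaussian process, and the lower-confidence-bound acquisition rule are all assumptions chosen to keep the example short. In a real experiment, `closed_loop_cost` would be replaced by a run on the physical system.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical "true" plant. The designer never reads it; it only runs experiments.
A_TRUE = np.array([[1.0, 0.1], [0.0, 1.0]])
B_TRUE = np.array([[0.005], [0.1]])
Q = np.eye(2)            # assumed state cost (performance specification)
R = np.array([[0.1]])    # assumed input cost

def model_from_params(theta):
    """Local linear model (A, B) described by two parameters (illustrative choice)."""
    A = np.array([[1.0, theta[0]], [0.0, 1.0]])
    B = np.array([[0.0], [theta[1]]])
    return A, B

def lqr_gain(A, B):
    """Discrete-time LQR gain K for the model (A, B), with u = -K x."""
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def closed_loop_cost(theta, horizon=50):
    """Design a controller from the model, then evaluate it on the true plant."""
    K = lqr_gain(*model_from_params(theta))
    x, cost = np.array([1.0, 0.0]), 0.0
    for _ in range(horizon):
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A_TRUE @ x + B_TRUE @ u
        if not np.isfinite(cost) or cost > 1e6:
            return 1e6   # cap the cost of unstable closed loops
    return cost

def rbf(X1, X2, length=0.2):
    """Squared-exponential kernel with unit signal variance."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """Zero-mean GP posterior mean and standard deviation at the points Xs."""
    L = np.linalg.cholesky(rbf(X, X) + noise * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(X, Xs)
    v = np.linalg.solve(L, Ks)
    var = 1.0 - (v**2).sum(0)
    return Ks.T @ alpha, np.sqrt(np.maximum(var, 1e-12))

def bayes_opt(n_init=3, n_iter=12, seed=0):
    """Search over model parameters for the lowest observed closed-loop cost."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array([0.02, 0.02]), np.array([0.5, 0.5])
    X = rng.uniform(lo, hi, size=(n_init, 2))          # random prior models
    y = np.array([np.log(closed_loop_cost(t)) for t in X])
    for _ in range(n_iter):
        cand = rng.uniform(lo, hi, size=(500, 2))      # random candidate models
        y_n = (y - y.mean()) / (y.std() + 1e-9)        # normalize observations
        mean, std = gp_posterior(X, y_n, cand)
        theta_next = cand[np.argmin(mean - 2.0 * std)] # lower confidence bound
        X = np.vstack([X, theta_next])
        y = np.append(y, np.log(closed_loop_cost(theta_next)))
    best = np.argmin(y)
    return X[best], np.exp(y[best]), np.exp(y[:n_init].min())

if __name__ == "__main__":
    theta, best_cost, init_cost = bayes_opt()
    print("best model parameters:", theta)
    print("best cost:", best_cost, "| best initial cost:", init_cost)
```

Note that the search optimizes the observed closed-loop cost rather than the prediction accuracy of the model, so the selected model need not match the true dynamics. This mirrors the idea of learning a local model that performs best for the given task.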

References

[1] Roberto Calandra, Nakul Gopalan, André Seyfarth, Jan Peters, and Marc Peter Deisenroth. Bayesian gait optimization for bipedal locomotion. In International Conference on Learning and Intelligent Optimization, pages 274–290. Springer, 2014.
[2] Alonso Marco, Philipp Hennig, Jeannette Bohg, Stefan Schaal, and Sebastian Trimpe. Automatic LQR tuning based on Gaussian process global optimization. arXiv preprint arXiv:1605.01950, 2016.

Faculty:

Claire Tomlin

Student:

Somil Bansal