2019-01442 - Post-Doctoral Research Visit F/M Intrinsically motivated multi-goal deep reinforcement learning in open virtual worlds

Contract type : Fixed-term contract

Level of qualifications required : PhD or equivalent

Function : Post-Doctoral Research Visit

About the research centre or Inria department

The Flowers team studies computational mechanisms that allow robots and humans to acquire open-ended repertoires of skills through lifelong learning. This includes the processes by which they progressively discover their bodies and their interactions with objects, tools and others. In particular, we study mechanisms of intrinsically motivated learning (also called curiosity-driven active learning), autonomous unsupervised exploration, imitation and social learning, multimodal statistical inference, embodiment, maturation and self-organization.

The team considers cognitive development as a complex dynamical system that needs to be understood through systemic thinking, leveraging tools and concepts from computational sciences (artificial intelligence, machine learning and robotics), neuroscience and psychology. From this perspective, algorithmic and robotic models are powerful scientific languages for expressing theories of cognitive development in living organisms.

Of particular interest to the Flowers team is the formation of repertoires of sensorimotor and interaction skills as well as their relation with the acquisition and evolution of languages.

The team is also working on applications of this research in three fields: adaptive human-computer interfaces, educational technologies and open-source robotics for art and education.


Supervision: Pierre-Yves Oudeyer, http://www.pyoudeyer.com (Flowers team, Inria and ENSTA ParisTech)

Duration: between 18 months and 2 years

Starting date: between April and November 2019

Applications: Send CV + letter of motivation to pierre-yves.oudeyer@inria.fr

Keywords : Deep RL, neural networks, multi-task learning, transfer learning, curiosity, intrinsic motivation, curriculum learning, Unity3D.


This postdoc project aims to develop autonomous lifelong machine learning techniques that enable virtual intelligent agents to make discoveries and acquire large repertoires of skills in open, uncertain environments. This is key for developing agents that must continuously explore and adapt their interaction skills to new or changing tasks, environments, interaction partners, and the preferences of others. The approach will leverage recent advances in curiosity-driven developmental learning (also called intrinsically motivated learning) to drive exploration in a multi-goal deep reinforcement learning framework. In particular, it will study several extensions of recent results from the Flowers lab in this area, including unsupervised learning of goal spaces using deep learning approaches (Laversanne-Finot et al., 2018) and the CURIOUS algorithm for intrinsically motivated multi-task, multi-goal deep RL (Colas et al., 2019). These algorithms will be evaluated on benchmarks involving novel virtual environments dedicated to studying exploration and curiosity (e.g. based on Unity3D ML-Agents), as well as modern open-world video games in the context of a collaboration with Ubisoft (Bordeaux).



Baranes, A., & Oudeyer, P. Y. (2013) Active learning of inverse models with intrinsically motivated goal exploration in robots. Robotics and Autonomous Systems, 61(1), 49-73.

Colas, C., Sigaud, O., and P.-Y. Oudeyer (2019) CURIOUS: Intrinsically Motivated Multi-Task, Multi-Goal Reinforcement Learning. https://arxiv.org/pdf/1810.06284.pdf

Colas, C., Sigaud, O., and P.-Y. Oudeyer (2018) GEP-PG: Decoupling exploration and exploitation in deep reinforcement learning algorithms. arXiv preprint arXiv:1802.05054.

Laversanne-Finot, A., Péré, A., and P.-Y. Oudeyer (2018) Curiosity Driven Exploration of Learned Disentangled Goal Spaces. In Proceedings of the Conference on Robot Learning (CoRL 2018).
Blog post : https://openlab-flowers.inria.fr/t/discovery-of-independently-controllable-features-through-autonomous-goal-setting/494

Péré, A., Forestier, S., Sigaud, O., and P.-Y. Oudeyer (2018) Unsupervised learning of goal spaces for intrinsically motivated goal exploration. In International Conference on Learning Representations (ICLR).



Main activities

See above.

Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage


2653 € / month (before taxes)