2020-02765 - Post-Doctoral Research Visit F/M Bio-inspired Reinforcement Learning for Problem Solving
The job description below is in English

Contract type: Fixed-term contract (CDD)

Required degree level: PhD or equivalent

Position: Postdoctoral researcher

About the centre or functional department

The Inria Bordeaux Sud-Ouest centre is one of Inria's eight centres and has around twenty research teams. The Inria centre is a major and recognized player in the field of digital sciences. It is at the heart of a rich R&D and innovation ecosystem: highly innovative SMEs, large industrial groups, competitiveness clusters, research and higher education players, laboratories of excellence, technological research institute...

Context and assets of the position

This position is offered within the framework of a partnership:

  • Associate team Meng Po, between the Mnemosyne Inria Project team and the Brainnetome Center, Chinese Academy of Sciences, Beijing


Assignment

Scientific research context:
Reinforcement learning (RL) has greatly advanced Artificial Intelligence in recent years, especially
in association with deep networks (cf. DeepRL, Gershman et al., 2020). However, despite impressive
performances for some problems, these approaches still have major weaknesses. The size of the
required learning corpuses (and thus the training time) remains prohibitively large (several orders of
magnitude larger than that of humans) and these systems lack flexibility when the rules governing
the environment change. They require long re-training to adapt to new conditions, whereas humans can
move easily and quickly from one context to another. Several solutions have recently been proposed
to overcome these limitations (Botvinick et al., 2019). Concerning excessively long learning,
Episodic Reinforcement Learning relies on strong biological and cognitive foundations (Stachenfeld
et al., 2017) to explain our ability to recall and replay past episodes to accelerate training. Defining
the optimal recall strategy remains an important topic of research today (Mattar et al., 2018).
Concerning the lack of flexibility, Meta Reinforcement Learning (Wang et al., 2018) considers our
capacity for “learning to learn” by associating non-classical contextual information with the
selection of specific behaviors. The global strategy for selecting these associations or creating new
ones is still to be determined (Domenech et al., 2015).
These mechanisms are not independent but intimately associated within our cognitive architecture
and, more specifically, within decision making and executive control. It therefore makes sense to
consider them together and to select a suitable experimental framework in which to implement and bind them.
Problem solving is a complex domain that is rarely considered in Reinforcement Learning, although it
combines several characteristics of high interest here. In particular, solving a problem generally
relies on combining a priori knowledge with trial-and-error experimentation. It also
promotes flexible behavior that mixes goal-driven strategies and stimulus-driven reactions. Interestingly
enough, problem solving also exhibits cognitive features such as creativity and meta-cognition,
which are also of high interest here.

Main activities

Work description:

The work will consist in studying associations between episodic and meta
reinforcement learning, informed by neurobiological and cognitive principles, and adapting them to
the domain of problem solving. After a thorough review of existing algorithms, extensions of episodic
models will be considered, particularly those related to replay prioritization strategies (Mattar et al.,
2018). Concerning meta-RL, the Task Set formalism (Domenech et al., 2015) will be considered,
together with its adaptation to episodic learning. Other paths of inspiration might be considered.
Another major part of the work will concern setting up the problem-solving framework,
inspired by experimental behavioral sciences. The use of serious games associated with video games
and/or unplugged activities, such as those developed in educational sciences, could be considered. This work
will be carried out in a team with a main background in bio-inspired Machine Learning, collaborating
at the international level with other major labs in this domain. The team is located on the Bordeaux
Neurocampus, with direct access to the neuroscience and medical environment. It also has relations with
laboratories in cognitive science and educational science for the problem-solving aspects.


Technical skills and level required: a good knowledge of Reinforcement Learning; a solid grounding in Artificial Intelligence

Languages: fluent English

Relational skills: interest in multidisciplinary contacts

Other valued skills: interest in neuroscience and cognitive science


  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage


Fixed-term contract

Gross monthly salary (before taxes and social security charges): €2,653.00