2019-01287 - Post-Doctoral Research Visit F/M Postdoc: High Performance Deep Reinforcement Learning
The job description below is in English.

Contract type: Fixed-term public-service contract (CDD)

Required degree: PhD or equivalent

Position: Postdoctoral researcher

About the research center or functional department

Inria, the French national institute for research in computer science and control, is dedicated to fundamental and applied research in information and communication science and technology (ICST). Inria has a workforce of 3,800 people working throughout its eight research centers established in seven regions of France.

Grenoble is the capital city of the French Alps. Combining the urban lifestyle of southern France with a unique mountain setting, it is ideally situated for outdoor activities. The Grenoble area is today an important centre of industry and science (second largest in France). Dedicated to an ambitious policy in the arts, the city is host to numerous cultural institutions. With 60,000 students (including 6,000 foreign students), Grenoble is the third largest student area in France.

Assignment

The goal of reinforcement learning is to learn a task autonomously by interacting with simulations while trying to maximize a reward (a game score, for instance).
Recently, researchers have successfully introduced deep neural networks, making it possible to address more complex problems. This is often referred to as Deep Reinforcement Learning (DRL). DRL has managed, for instance, to play many Atari games. The most visible success of DRL is probably AlphaGo Zero, which outperformed the best human players (and earlier versions of itself) after being trained without any data from human games, solely through reinforcement learning. The process requires an advanced infrastructure for the training phase. For instance, AlphaGo Zero trained for more than 70 hours using 64 GPU workers and 19 CPU parameter servers, playing 4.9 million games of self-play with 1,600 simulations for each Monte Carlo Tree Search.
The general workflow is the following. To speed up the learning process and enable a wide but thorough exploration of the parameter space, the learner neural network interacts in parallel with several actor instances. Each actor consists of a simulation of the task being learned and a neural network that interacts with this simulation using the best winning strategy it knows. Periodically, the actor neural networks are updated from the learner neural network.
This workflow has evolved through various research works combining parallelization, asynchronism and novel learning strategies (GORILA, A3C, IMPALA, ...).
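
The actor/learner workflow described above can be illustrated with a minimal sketch. It is not an implementation of GORILA, A3C or IMPALA: a toy 3-armed bandit stands in for the simulation, a table of value estimates stands in for the neural network, and a thread pool stands in for distributed workers. All names (`ARM_MEANS`, `Actor`, `Learner`, `train`) and parameter values are hypothetical and chosen only for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Assumption: a toy 3-armed bandit replaces the real task simulation.
ARM_MEANS = [0.2, 0.5, 0.8]


class Actor:
    """One simulation instance plus a local (possibly stale) policy copy."""

    def __init__(self, seed, n_arms, epsilon=0.2):
        self.rng = random.Random(seed)   # private RNG: each actor is independent
        self.values = [0.0] * n_arms     # local copy of the learner's estimates
        self.epsilon = epsilon

    def sync(self, values):
        """Receive fresh parameters from the learner."""
        self.values = list(values)

    def rollout(self, n_steps):
        """Interact with the simulation; return (action, reward) pairs."""
        experience = []
        for _ in range(n_steps):
            if self.rng.random() < self.epsilon:
                action = self.rng.randrange(len(self.values))  # explore
            else:
                action = max(range(len(self.values)),
                             key=self.values.__getitem__)      # exploit
            reward = 1.0 if self.rng.random() < ARM_MEANS[action] else 0.0
            experience.append((action, reward))
        return experience


class Learner:
    """Aggregates experience from all actors into one set of estimates."""

    def __init__(self, n_arms):
        self.values = [0.0] * n_arms
        self.counts = [0] * n_arms

    def update(self, experience):
        for action, reward in experience:
            self.counts[action] += 1
            # Incremental mean of the observed rewards for this action.
            self.values[action] += (reward - self.values[action]) / self.counts[action]


def train(n_actors=4, n_rounds=50, steps_per_rollout=20, sync_every=5):
    learner = Learner(len(ARM_MEANS))
    actors = [Actor(seed=i, n_arms=len(ARM_MEANS)) for i in range(n_actors)]
    with ThreadPoolExecutor(max_workers=n_actors) as pool:
        for round_idx in range(n_rounds):
            # Periodically push the learner's parameters to every actor.
            if round_idx % sync_every == 0:
                for actor in actors:
                    actor.sync(learner.values)
            # Actors generate experience in parallel; results come back in order.
            batches = pool.map(lambda a: a.rollout(steps_per_rollout), actors)
            for batch in batches:
                learner.update(batch)
    return learner


if __name__ == "__main__":
    trained = train()
    print("estimated values:", [round(v, 2) for v in trained.values])
```

Between two synchronizations the actors act on stale parameters, which is the trade-off between throughput and policy freshness that asynchronous schemes have to manage. In this sketch, `sync_every` is the knob that controls it.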

The goal of this postdoc is to push forward the scalability of these approaches and to propose novel learning strategies to learn more complex tasks more rapidly (multiple heterogeneous tasks at once, non-deterministic games, simulations of complex industrial or living systems).
This work will be performed in close collaboration between the SequeL Inria team, specialized in DRL (https://team.inria.fr/sequel/), and the DataMove team, specialized in HPC (https://team.inria.fr/datamove).
DataMove has developed Melissa (https://melissa-sa.github.io/), a solution to manage large ensembles of parallel simulations and aggregate their data online in a parallel server. Melissa has been used to run thousands of simulations on up to 30,000 cores. So far, Melissa has been used to compute advanced statistics, but we expect this framework to be a sound base for a DRL workflow. The SequeL team has strong activities in reinforcement learning, deep or not, ranging from theoretical aspects to applications. Among other projects, SequeL has collaborated with Mila (Montréal) to design and develop the Guesswhat?! experiment (https://guesswhat.ai/). As early as 2006, SequeL worked on Go and designed the first Go program (Crazy Stone) able to challenge a human expert player.

 

Main activities

  • Requirement: PhD in Computer Science
  • Location: Grenoble or Lille
  • Hosting teams:
    • SequeL (Inria Lille): https://team.inria.fr/sequel/
    • DataMove (Inria Grenoble): https://team.inria.fr/datamove
  • Contact: Bruno.Raffin@inria.fr and Philippe.Preux@inria.fr
  • Period: to start sometime in 2019
  • Duration: 24 months


We are looking for a candidate with a PhD in deep learning, reinforcement learning or high performance computing (a combination of these areas of expertise would be ideal) for a 24-month contract at Inria. The candidate will have the possibility to join either the SequeL team in Lille or the DataMove team in Grenoble.

The postdoc will have access to large supercomputers equipped with multiple GPUs for experiments. We expect this work to lead to international publications supported by advanced software prototypes.

Benefits

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage

Remuneration

Salary: 2 653 € gross/month.

Net monthly salary: around 2 136,39 € (medical insurance included, income tax excluded).