PhD Position F/M [Campagne DOC BMI-NF-GRA-2024] Knowledge-based reinforcement learning and knowledge evolution
Contract type : Fixed-term contract
Level of qualifications required : Graduate degree or equivalent
Function : PhD Position
About the research centre or Inria department
The Centre Inria de l’Université Grenoble Alpes brings together almost 600 people in 22 research teams and 7 research support departments.
Staff is present on three campuses in Grenoble, in close collaboration with other research and higher education institutions (Université Grenoble Alpes, CNRS, CEA, INRAE, …), but also with key economic players in the area.
The Centre Inria de l’Université Grenoble Alpes is active in the fields of high-performance computing, verification and embedded systems, modeling of the environment at multiple levels, and data science and artificial intelligence. The center is a top-level scientific institute with an extensive network of international collaborations in Europe and the rest of the world.
Context
Doctoral school: MSTII, Université Grenoble Alpes.
Advisors: Jérôme Euzenat (Jerome:Euzenat#inria:fr) and Jérôme David (Jerome:David#univ-grenoble-alpes.fr).
Group: The work will be carried out in the mOeX team, common to Inria and LIG. mOeX is dedicated to studying knowledge evolution through adaptation. It gathers researchers who have taken an active part, over the past 15 years, in the development of the semantic web, and more specifically of ontology matching and data interlinking.
Assignment
Cultural knowledge evolution and multiagent reinforcement learning share some of their prominent features. Putting explicit knowledge at the heart of the reinforcement process may contribute to better explanation and transfer.
Cultural knowledge evolution deals with the evolution of knowledge representation in a group of agents. For that purpose, cooperating agents adapt their knowledge to the situations they are exposed to and to the feedback they receive from others. This framework has been considered in the context of evolving natural languages [Steels, 2012]. We have applied it to ontology alignment repair, i.e. the improvement of incorrect alignments [Euzenat, 2017], and to ontology evolution [Bourahla et al., 2021]. We have shown that it converges towards successful communication by improving the intrinsic quality of knowledge.
Reinforcement learning is a learning mechanism that adapts the decision-making process of agents so as to maximise the reward that the environment provides for the actions they perform [Sutton and Barto, 1998]. Many multi-agent versions of reinforcement learning have also been proposed, depending on the agent attitude (cooperative, competitive) and the task structure (homogeneous, heterogeneous) [Bučoniu et al., 2010].
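As an illustration of the reward-maximising loop described above, here is a minimal single-agent sketch using tabular Q-learning, one standard reinforcement learning algorithm. The corridor environment, the function name q_learning and all parameter values are assumptions made for this example; they do not come from the position description.

```python
import random

def q_learning(n_states=5, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor: states 0..n_states-1,
    actions 0 (left) and 1 (right), reward 1 only when the last state is reached."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        while state != n_states - 1:
            # Epsilon-greedy choice: explore with probability epsilon (or on ties),
            # otherwise take the action with the highest estimated value.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == n_states - 1 else 0.0
            # Move the estimate towards reward + discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()
# Greedy action in every non-terminal state (1 means "right").
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
```

Here the agent's "knowledge" is only a table of numbers: the learned policy is correct, but nothing in q_table explains why an action is chosen.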
From an external perspective, the two approaches operate in a similar manner: agents perceive their environment, perform an action, receive reward or punishment, and adapt their behaviour accordingly. However, a look into the inner mechanisms of cultural knowledge evolution reveals important differences: the emphasis on knowledge quality instead of reward maximisation, the lack of probabilistic or even gradual interpretation, and even the absence of explicit choice in action or adaptation. Hence these two knowledge acquisition techniques are close enough to suggest replacing one by the other and different enough to cross-fertilise.
This thesis position aims at further exploring the commonalities and differences between experimental cultural knowledge evolution and reinforcement learning. In particular, its purpose is to study which features of one technique may be fruitful in the context of the other and which may not.
For that purpose, one research direction is the introduction of knowledge-based reinforcement learning. In knowledge-based reinforcement learning, the decision-making process (the choice of the action to be performed) is driven by accumulated explicit knowledge. Thus the adaptation performed after reward or punishment will have to directly affect this knowledge. This has the advantage of making it possible to explain the decisions made by agents. It will also allow for explicit knowledge exchange among them [Leno da Silva et al., 2018].
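One possible reading of this idea can be sketched as follows. This is only a toy illustration under strong assumptions, not the method to be developed in the thesis: the RuleAgent class, its rule format (observation to action) and the traffic-light environment are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class RuleAgent:
    """Hypothetical agent whose policy is a set of explicit rules
    (observation -> action) rather than a table of numeric values."""
    actions: tuple
    rules: dict = field(default_factory=dict)

    def decide(self, observation):
        # Use the stored rule if there is one, otherwise the first action.
        return self.rules.get(observation, self.actions[0])

    def explain(self, observation):
        # The decision can be justified by quoting the rule that produced it.
        if observation in self.rules:
            return f"rule: if {observation} then {self.rules[observation]}"
        return f"no rule for {observation}; default action {self.actions[0]}"

    def adapt(self, observation, action, rewarded):
        # Reward confirms the rule; punishment replaces it by the next candidate action.
        if rewarded:
            self.rules[observation] = action
        else:
            index = (self.actions.index(action) + 1) % len(self.actions)
            self.rules[observation] = self.actions[index]

# Assumed toy environment: the action that is rewarded for each observation.
correct = {"red": "stop", "green": "go", "orange": "slow"}
agent = RuleAgent(actions=("go", "slow", "stop"))
for _ in range(3):  # three passes suffice here because there are three actions
    for obs, expected in correct.items():
        act = agent.decide(obs)
        agent.adapt(obs, act, rewarded=(act == expected))
```

In this sketch, adaptation rewrites the rules themselves, so agent.explain("red") can quote the rule behind a decision, and the rules dictionary could be passed to another agent as is.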
This promotes a less utilitarian view of knowledge, in which the evaluation of the performance of the system is disconnected from reward maximisation and depends instead on the quality of the acquired knowledge. Of course, these two aspects need to remain related (the acquired knowledge must be relevant to the environment). This separation between knowledge and reward is useful when agents have to change environment or use their knowledge to perform various tasks.
Another use of reinforcement mechanisms relevant to cultural knowledge evolution is related to the motivation for agents to explore unknown knowledge territories [Colas et al., 2019]. By associating an intrinsic reward with newly acquired knowledge, agents are able to improve the coverage of their knowledge in a way that is not guided by the environment. Complementing cultural knowledge evolution with exploration motivation should make agents more active in their understanding of the environment and in knowledge acquisition.
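A very simple form of intrinsic reward can be sketched as a novelty bonus. This is a deliberately simplified stand-in, assumed for illustration only: it is not the mechanism of [Colas et al., 2019], and the function name, the set-based knowledge store and the bonus value are hypothetical.

```python
def intrinsic_reward(knowledge, fact, bonus=1.0):
    """Return a bonus if `fact` is new to the agent and record it;
    return 0.0 if the agent already knew it."""
    if fact in knowledge:
        return 0.0
    knowledge.add(fact)
    return bonus

knowledge = set()
observations = ["a", "b", "a", "c", "b"]
# The reward depends only on what the agent already knows, not on the environment.
rewards = [intrinsic_reward(knowledge, obs) for obs in observations]
```

Only the first encounter with each fact is rewarded, which pushes the agent towards situations it has not yet covered.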
These problems may be treated both theoretically and experimentally.
This work is part of an ambitious program towards what we call cultural knowledge evolution, partly funded by the MIAI Knowledge communication and evolution chair.
References:
[Bourahla et al., 2021] Yasser Bourahla, Manuel Atencia, Jérôme Euzenat, Knowledge improvement and diversity under interaction-driven adaptation of learned ontologies, Proc. 20th AAMAS, London (UK), pp242–250, 2021 https://moex.inria.fr/files/papers/bourahla2021a.pdf
[Bučoniu et al., 2010] Lucian Bučoniu, Robert Babuška, Bart De Schutter, Multi-agent reinforcement learning: an overview, Chapter 7 of D. Srinivasan and L.C. Jain, eds., Innovations in Multi-Agent Systems and Applications – 1, Springer, Berlin (DE), pp183–221, 2010 http://www.dcsc.tudelft.nl/~bdeschutter/pub/rep/10_003.pdf
[Colas et al., 2019] Cédric Colas, Pierre-Yves Oudeyer, Olivier Sigaud, Pierre Fournier, Mohamed Chetouani, Curious: Intrinsically motivated modular multi-goal reinforcement learning, Proc. 36th ICML, Long Beach (CA US), pp1331–1340, 2019 http://proceedings.mlr.press/v97/colas19a/colas19a.pdf
[Euzenat, 2017] Jérôme Euzenat, Communication-driven ontology alignment repair and expansion, Proc. 26th IJCAI, Melbourne (AU), pp185–191, 2017 https://moex.inria.fr/files/papers/euzenat2017a.pdf
[Leno da Silva et al., 2018] Felipe Leno Da Silva, Matthew Taylor, Anna Helena Reali Costa, Autonomously reusing knowledge in multiagent reinforcement learning, Proc. 27th IJCAI, pp5487–5493, 2018 https://www.ijcai.org/proceedings/2018/0774.pdf
[Steels, 2012] Luc Steels (ed.), Experiments in cultural language evolution, John Benjamins, Amsterdam (NL), 2012
[Sutton and Barto, 1998] Richard Sutton, Andrew Barto, Reinforcement learning: an introduction, The MIT Press, Cambridge (MA US), 1998 (2nd ed. 2018) http://incompleteideas.net/book/RLbook2020.pdf
Links:
- MIAI Knowledge communication and evolution: https://moex.inria.fr/cooperation/miai/
- mOeX web site: https://moex.inria.fr
- Lazy lavender: https://gitlab.inria.fr/moex/lazylav
Main activities
- Analyse the state of the art
- Formalise the problem
- Develop software
- Propose & design experiments
- Write scientific reports & articles
Skills
Desired skills:
- Curiosity and openness.
- Interaction with other researchers.
- Autonomy in research.
- Taste for experimentation.
- Knowledge of multi-agent simulation and/or reinforcement learning is not required but is a plus.
- Capacity for innovation.
Benefits package
- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking (90 days / year) and flexible organization of working hours (except for internships)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage, subject to conditions
Remuneration
1st and 2nd year: 2,100 euros gross salary / month
3rd year: 2,190 euros gross salary / month
General Information
- Theme/Domain : Data and Knowledge Representation and Processing; Statistics (Big data) (BAP E)
- Town/city : Montbonnot
- Inria Center : Centre Inria de l'Université Grenoble Alpes
- Starting date : 2024-10-01
- Duration of contract : 3 years
- Deadline to apply : 2024-04-30
Warning : you must enter your e-mail address in order to save your application to Inria. Applications must be submitted online on the Inria website. Processing of applications sent from other channels is not guaranteed.
Instructions to apply
CV, cover letter, Master's grades, a letter of recommendation from the Master's course supervisor (or equivalent) and, possibly, a letter of recommendation from the Master's supervisor.
Defence Security :
This position is likely to be situated in a restricted area (ZRR), as defined in Decree No. 2011-1425 relating to the protection of national scientific and technical potential (PPST). Authorisation to enter an area is granted by the director of the unit, following a favourable Ministerial decision, as defined in the decree of 3 July 2012 relating to the PPST. An unfavourable Ministerial decision in respect of a position situated in a ZRR would result in the cancellation of the appointment.
Recruitment Policy :
As part of Inria's diversity policy, all positions are accessible to people with disabilities.
Contacts
- Inria Team : MOEX
- PhD Supervisor : David Jérôme / jerome.david@inria.fr
About Inria
Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 200 agile project teams, generally run jointly with academic partners, include more than 3,500 scientists and engineers working to meet the challenges of digital technology, often at the interface with other disciplines. The Institute also employs numerous talents in over forty different professions. 900 research support staff contribute to the preparation and development of scientific and entrepreneurial projects that have a worldwide impact.