PhD Position F/M Study of the estimation and control principle for Markov decision processes (IDP 2024)
Contract type: Fixed-term contract (CDD)
Required degree level: Master's degree (Bac + 5) or equivalent
Position: PhD student
Context and advantages of the position
The Inria team Astral is a joint Inria-Naval Group project team, Naval Group being a French industrial group specializing in naval defense construction. With this thesis, we aim to carry out exploratory and preparatory theoretical studies that could have an impact on the work carried out with Naval Group, although there is no guarantee of direct applications in the short or medium term.
Assignment
Markov decision processes are non-diffusive stochastic processes whose defining parameters (jump rates, transition measures, flows) depend on a variable that one can act upon, in the hope of steering the process toward a given goal. In practice, these processes may also depend on parameters that are unknown a priori and whose values one may want to estimate. If the process is to be controlled in real time, the estimation must also be carried out in real time, and decisions must adapt to the current parameter estimate. This is where the principle of estimation and control comes into play: it links the choice of estimators to that of strategies, and vice versa.
The principle of estimation and control for discrete-time Markov decision processes has been the subject of numerous studies. These studies show that minimum contrast estimators make it possible to estimate the parameters of the observed process while controlling it, through the construction of asymptotically optimal policies, at least for the discounted total reward criterion (and also when the time horizon is finite).
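To make the idea concrete, here is a minimal sketch of the estimate-then-control loop on a toy problem. Everything in it is an assumption made for illustration and is not taken from the cited references: the two-state, two-action model `p_next1`, the reward, the parameter grid `THETA_GRID`, and the choice of the averaged negative log-likelihood as the contrast function (one standard example of a contrast). At each step the controller minimizes the contrast over the grid, then applies the policy that would be optimal for the discounted criterion if the current estimate were the true parameter.

```python
import math
import random

STATES = (0, 1)
ACTIONS = (0, 1)
GAMMA = 0.9                                  # discount factor
THETA_GRID = [i / 20 for i in range(1, 20)]  # candidate parameter values in (0, 1)


def p_next1(s, a, theta):
    """Hypothetical model: probability of jumping to state 1 (same for both s)."""
    return theta if a == 0 else 1.0 - theta


def reward(s, a):
    """Hypothetical reward: being in state 1 pays 1, state 0 pays 0."""
    return float(s)


def contrast(theta, counts):
    """Averaged negative log-likelihood of the observed transitions."""
    n = sum(counts.values())
    total = 0.0
    for (s, a, s_next), c in counts.items():
        p1 = p_next1(s, a, theta)
        total -= c * math.log(p1 if s_next == 1 else 1.0 - p1)
    return total / n


def greedy_policy(theta, n_iter=50):
    """Value iteration for the discounted criterion, model parameter theta."""
    v = [0.0, 0.0]
    for _ in range(n_iter):
        q = [[reward(s, a) + GAMMA * ((1.0 - p_next1(s, a, theta)) * v[0]
                                      + p_next1(s, a, theta) * v[1])
              for a in ACTIONS] for s in STATES]
        v = [max(q[s]) for s in STATES]
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in STATES]


def run(true_theta, n_steps=1000, seed=0):
    """Alternate estimation and control along one simulated trajectory."""
    rng = random.Random(seed)
    counts = {}
    s, theta_hat = 0, 0.5                    # arbitrary initial state and guess
    for _ in range(n_steps):
        a = greedy_policy(theta_hat)[s]      # act as if theta_hat were true
        s_next = 1 if rng.random() < p_next1(s, a, true_theta) else 0
        counts[(s, a, s_next)] = counts.get((s, a, s_next), 0) + 1
        # Minimum contrast estimate over the grid, updated after each transition.
        theta_hat = min(THETA_GRID, key=lambda th: contrast(th, counts))
        s = s_next
    return theta_hat, greedy_policy(theta_hat)


if __name__ == "__main__":
    theta_hat, policy = run(true_theta=0.8)
    print("estimate:", theta_hat, "policy per state:", policy)
```

In this toy model both actions reveal information about the unknown parameter, so the estimate keeps improving whatever the controller does. In general this need not hold, which is one reason why the choice of estimators and the choice of strategies cannot be made independently.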
Depending on the candidate's profile, theoretical or practical aspects should be developed in this area of research.
Main activities
Here are a few theoretical research directions that could be pursued:
- the assumptions made in [H12] about the characteristics of the process need to be weakened to cover a wider range of situations.
- the asymptotic properties of the proposed estimators need to be studied in greater depth.
- a central limit theorem has not yet been obtained for these estimators and thus deserves to be studied.
- the work initiated in [Maigret79] on the large deviations principle deserves to be explored further and extended to the context of Markov decision processes.
In this vein, we have recently obtained results that extend the principle of estimation and control to the framework of continuous-time Markov decision processes; see [CG23, CDG23]. The research program outlined above for discrete time is also to be developed in this technically more demanding setting, where the difficulty is notably due to the presence of forced jumps at the boundary.
During this thesis, the practical aspect should not be neglected: the numerical implementation of the studied estimators and of the resulting optimal policies will illustrate their properties. This is an important point that will demonstrate the usefulness of the theoretical studies and of the methods developed. In this context, one can look at various classic problems related to target tracking, as explained in [Zhang17], which can be modelled using Markov decision processes with adaptation.
[CG23] Costa, O., & Dufour, F. (2023). Adaptive discounted control for piecewise deterministic Markov processes. Journal of Mathematical Analysis and Applications, 127517.
[CDG23] Costa, O., Dufour, F., & Génadot, A. (2023). Minimum Contrast Estimators for Piecewise Deterministic Markov Processes. Submitted.
[Maigret79] Maigret, N. (1979). Majorations de Chernoff et statistique séquentielle pour des chaînes de Markov récurrentes au sens de Doeblin. Astérisque, 68, 125-142.
[H12] Hernández-Lerma, O. (2012). Adaptive Markov control processes (Vol. 79). Springer Science & Business Media.
[Zhang17] Zhang, H., Dufour, F., Anselmi, J., Laneuville, D., & Nègre, A. (2017). Piecewise optimal trajectories of observer for bearings-only tracking by quantization. In 2017 20th International Conference on Information Fusion (Fusion) (pp. 1-7). IEEE.
Skills
The candidate should have a solid background in probability theory, in particular in the theory of Markov processes. Prior coursework in control theory (deterministic or stochastic) would be a plus. The ability to develop numerical examples is also expected.
Benefits
- Subsidized meals
- Partial reimbursement of public transport costs
- Possibility of teleworking and flexible organization of working hours
- Professional equipment available (videoconferencing, loan of computer equipment, etc.)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage
Remuneration
- 2100€ / month (before taxes) during the first 2 years,
- 2190€ / month (before taxes) during the third year.
General information
- Theme/Domain: Stochastic approaches
Statistics (Big data) (BAP E)
- City: Talence
- Inria Center: Centre Inria de l'université de Bordeaux
- Desired start date: 2024-10-01
- Contract duration: 3 years
- Application deadline: 2024-05-03
Warning: Applications must be submitted online via the Inria website. The processing of applications sent through other channels is not guaranteed.
Application instructions
Please send:
- CV
- Cover letter
- Master's grades and ranking
- Letter(s) of support
Defense security:
This position may be assigned to a restricted-access area (zone à régime restrictif, ZRR), as defined in Decree No. 2011-1425 on the protection of the nation's scientific and technical potential (PPST). Authorization to access such an area is granted by the head of the institution, following a favorable ministerial opinion, as defined in the order of 3 July 2012 relating to the PPST. An unfavorable ministerial opinion for a position assigned to a ZRR would result in the cancellation of the recruitment.
Recruitment policy:
As part of its diversity policy, all Inria positions are accessible to people with disabilities.
Contacts
- Inria team: ASTRAL
- PhD supervisor:
Genadot Alexandre / alexandre.genadot@inria.fr
About Inria
Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 215 agile project teams, generally run jointly with academic partners, involve more than 3,900 scientists in meeting the challenges of digital technology, often at the interface with other disciplines. The institute draws on a wide range of talent in more than forty different professions. 900 research and innovation support staff help scientific and entrepreneurial projects with worldwide impact to emerge and grow. Inria works with many companies and has supported the creation of more than 200 start-ups. The institute thus strives to meet the challenges of the digital transformation of science, society and the economy.