PhD Position F/M PhD Position - Procedural reasoning data generation for advanced logics and planning
Type de contrat : CDD
Niveau de diplôme exigé : Bac + 5 ou équivalent
Fonction : Doctorant
A propos du centre ou de la direction fonctionnelle
Created in 2008, the Inria center at the University of Lille employs 360 people, including 305 scientists in 16 research teams. Recognized for its strong involvement in the socio-economic development of the Hauts-De-France region, the Inria center at the University of Lille maintains a close relationship with large companies and SMEs. By fostering synergies between researchers and industry, Inria contributes to the transfer of skills and expertise in the field of digital technologies, and provides access to the best of European and international research for the benefit of innovation and businesses, particularly in the region.
For over 10 years, the Inria center at the University of Lille has been at the heart of Lille's university and scientific ecosystem, as well as at the heart of Frenchtech, with a technology showroom based on avenue de Bretagne in Lille, on the EuraTechnologies site of economic excellence dedicated to information and communication technologies (ICT).
Contexte et atouts du poste
Benchmarks for Language Models
Hosting team: Inria Lille, CRIStAL (UMR 9189) Supervisor: Damien Sileo Duration: 36 months Project: TACTICS (PEPR)
Context
Language models have become surprisingly capable reasoners, but progress is bottlenecked by data. Both training (SFT, RLVR) and evaluation rely on problem sets with known answers, and the supply of high-quality, uncontaminated, difficulty-calibrated reasoning problems is running thin. Hand-curated benchmarks saturate quickly, leak into pretraining corpora, and cannot be regenerated at will. Web-scraped reasoning data carries licensing baggage and offers no correctness guarantees.
Procedural generation offers a way out. By coupling a problem generator with a symbolic solver, one can produce an effectively unbounded stream of fresh instances, each shipped with a certified solution and a tunable difficulty knob. The same pipeline serves three purposes at once: pretraining and SFT data with reasoning traces, verifiable rewards for RL, and contamination-free evaluation suites. Our group has been building this line of work through the Reasoning Core library, which already covers PDDL planning over randomized domains, full first-order logic, CFG parsing, causal inference over Bayesian networks, and equation solving.
Thesis topic
The PhD will extend this infrastructure toward harder formalisms and richer reasoning regimes. The core technical question is how to turn a solver plus a formalism into a good procedural generator: one whose output distribution is broad rather than templated, whose difficulty scales smoothly, and whose problems remain well-posed and human-readable as complexity grows.
Several directions are open and will be shaped with the candidate:
- Advanced logics and planning. Hierarchical planning (HTN/HDDL), multi-agent and epistemic planning, temporal logics, and modal logics for reasoning about knowledge and belief. Each formalism has mature solvers but no scalable generator that exposes their full expressive range.
- From solver to generator. Most solvers are built to consume problems, not produce them. Sampling instances that are non-trivial, non-degenerate, solvable within a budget, and distributionally diverse is a research problem in itself, and it gets harder as the formalism grows.
- Verbalization and alignment with human priors. A formal instance is only useful if its natural-language rendering is faithful, varied, and not artificially easy or hard. This involves grammar-based verbalization, controlling surface form independently of logical content, and understanding how phrasing interacts with model inductive biases.
- Pushing scalability where solvers struggle. At high difficulty, exact solvers time out. The thesis will explore approximate verification, compositional generation, and incremental construction strategies that preserve correctness guarantees while extending the reachable difficulty range.
Practical
- Location: Inria Lille, France
- Funding: full PhD scholarship (PEPR TACTICS), plus compute budget for frontier model inference
- Start date: flexible, ideally October 2026
Mission confiée
The work will contribute directly to the Reasoning Core https://github.com/sileod/reasoning-core ecosystem and benefit from its existing infrastructure (gramforge grammar framework, containerized solvers, parallel generation pipeline). Evaluation will run on two fronts: zero-shot probing of frontier models to measure where current systems break, and supervised fine-tuning of small models to measure whether the generated data actually instills reasoning capabilities. Both regimes inform what makes a generator useful, and they often disagree in instructive ways.
Principales activités
Study related work
Propose new approaches
Evaluate them, iterate
Write papers
Compétences
Languages :
- Good written English; French not required
- Solid Python; comfort with formal methods, logic, or symbolic AI is a strong plus
Good relational skills
Avantages
- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking and flexible organization of working hours
- Professional equipment available (videoconferencing, loan of computer equipment, etc.)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage
Rémunération
2 300 € Monthly Gross Salary
Informations générales
- Thème/Domaine :
Représentation et traitement des données et des connaissances
Statistiques (Big data) (BAP E) - Ville : Villeneuve d'Ascq
- Centre Inria : Centre Inria de l'Université de Lille
- Date de prise de fonction souhaitée : 2026-10-01
- Durée de contrat : 3 ans
- Date limite pour postuler : 2026-06-06
Attention: Les candidatures doivent être déposées en ligne sur le site Inria. Le traitement des candidatures adressées par d'autres canaux n'est pas garanti.
Consignes pour postuler
Application: CV, transcripts, and a short statement of interest to damien.sileo@inria.fr
Sécurité défense :
Ce poste est susceptible d’être affecté dans une zone à régime restrictif (ZRR), telle que définie dans le décret n°2011-1425 relatif à la protection du potentiel scientifique et technique de la nation (PPST). L’autorisation d’accès à une zone est délivrée par le chef d’établissement, après avis ministériel favorable, tel que défini dans l’arrêté du 03 juillet 2012, relatif à la PPST. Un avis ministériel défavorable pour un poste affecté dans une ZRR aurait pour conséquence l’annulation du recrutement.
Politique de recrutement :
Dans le cadre de sa politique diversité, tous les postes Inria sont accessibles aux personnes en situation de handicap.
Contacts
- Équipe Inria : MAGNET
-
Directeur de thèse :
Sileo Damien / damien.sileo@inria.fr
L'essentiel pour réussir
- Master's degree in CS, ML, NLP, or a closely related area
- Computer science/math background (logicians/theoretical/verification CS people welcome)
- Familiarity with at least one of: automated planning, theorem proving, formal grammars
- Interest in working at the boundary between symbolic systems and neural models
- Good coding proficiency
- Good mastery and interest in formal methods (LLM enthousiasm is not enough)
- Scientific mindset
A propos d'Inria
Inria est l’institut national de recherche dédié aux sciences et technologies du numérique. Il emploie 2600 personnes. Ses 215 équipes-projets agiles, en général communes avec des partenaires académiques, impliquent plus de 3900 scientifiques pour relever les défis du numérique, souvent à l’interface d’autres disciplines. L’institut fait appel à de nombreux talents dans plus d’une quarantaine de métiers différents. 900 personnels d’appui à la recherche et à l’innovation contribuent à faire émerger et grandir des projets scientifiques ou entrepreneuriaux qui impactent le monde. Inria travaille avec de nombreuses entreprises et a accompagné la création de plus de 200 start-up. L'institut s'efforce ainsi de répondre aux enjeux de la transformation numérique de la science, de la société et de l'économie.