Embedded Machine Learning Programming (with potential PhD continuation)
Contract type: Internship
Level of qualifications required: Bac + 4 or equivalent
Function: Research intern
Context and assets of the position
This internship takes place in the context of a collaboration with the ASTRA joint Inria/Valeo team and with Google DeepMind on the topic of modeling and efficient implementation of complex Machine Learning applications onto embedded platforms.
Scientific context:
Conventional machine learning (ML) frameworks offer a tensor-centric view of the design and implementation of deep neural networks. But ML models do not stand by themselves as pure tensor functions. ML applications typically interact with an environment and often operate on streams of data collected and processed over time. For instance, the reactive control of a self-driving car operates on streams of data coming from sensors and streams of commands sent to actuators.

Training algorithms themselves embed a model into a reactive loop, itself decomposed into epochs and (mini-)batches allowing the efficient scheduling of computations and I/O, parameter updates, etc. The same applies to reinforcement learning (RL) agents. Returning to the automated driving example, stateful behavior is essential to taking into account previously inferred facts, such as speed limits or whether the current lane is a left turn, long after the acquisition of the sensor inputs. Other examples of ML components embedded into stateful reactive feedback loops include model-predictive maintenance, control, and digital twins.

ML models themselves involve stateful constructs in the form of recurrent neural network (RNN) layers. When generating optimized code, even matrix products and convolutions in feedforward networks can be folded over time, using (stateful) buffering to reduce the memory footprint. In distributed settings, the efficient implementation of large models involves pipelined communications and computations, which amounts to locally recovering a streaming execution pattern.
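As a toy illustration of the folding mentioned above (plain Python, all names hypothetical, unrelated to any actual MLR toolchain), a sliding-window product over a stream can be computed with a bounded stateful buffer instead of a materialized input tensor:

```python
from collections import deque

def streaming_conv1d(stream, kernel):
    """Fold a 1-D sliding-window product over time: instead of
    materializing the whole input as a tensor, keep only the last
    len(kernel) samples in a bounded (stateful) buffer and emit one
    output per fully covered window."""
    window = deque(maxlen=len(kernel))
    for x in stream:
        window.append(x)
        if len(window) == len(kernel):
            # Dot product of the kernel with the current window.
            yield sum(k * v for k, v in zip(kernel, window))

# Memory footprint: len(kernel) samples, independent of stream length.
outputs = list(streaming_conv1d([1, 2, 3, 4, 5], [1, 0, -1]))  # [-2, -2, -2]
```

The point of the sketch is the memory bound: the buffer size depends only on the operator, not on the length of the stream.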
Considering this broad range of scenarios, we observe that existing ML frameworks inadequately capture reactive aspects, raising barriers between differentiable models and the associated control, optimization, and input/output code. These barriers worsen the gap between ML research and system capabilities, particularly in the area of control automation where embedded ML engineering relies on undisclosed, ad-hoc implementations.
In previous work, we proposed a reactive language named MLR, which integrates ML-specific constructs (such as bidirectional recurrences and tensor operations) and activities (such as automatic differentiation). We have also shown that, for applications without bidirectional recurrences, reactivity does not penalize performance.
Assignment
The objective of this internship, and of the potential PhD follow-up, is to advance on either or both of the MLR language design and MLR compilation fronts.
- On the language design (syntax and semantics) side, of particular interest is the introduction of iterators allowing the seamless conversion of iterations performed in time, on streams, into iterations performed in space, on tensors. Such transformations are needed both at a high level, e.g. to introduce a "batch" dimension into a computation, and at a low level, e.g. to specify how a large tensor operation is decomposed for execution on hardware.
- On the compilation side, the key difficulty is the handling of bidirectional recurrences. Classical reactive formalisms such as Lustre can be compiled into very efficient, statically scheduled code running in constant memory, without buffering. By comparison, ML-specific bidirectional recurrences implicitly require buffering and dynamic scheduling (like the tape-based methods used during training). Replacing this implicit buffering with explicit, efficient, and bounded buffering under a mostly static schedule has the potential to significantly improve the performance and predictability of the generated code.
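The time/space conversion described in the first point can be sketched as follows (plain Python, hypothetical names; MLR's actual iterators would be language constructs, not library functions):

```python
def time_to_space(stream, batch):
    """Reify iterations performed in time (over a stream) as iterations
    performed in space (over a tensor axis): group consecutive stream
    elements into fixed-size batches."""
    buf = []
    for x in stream:
        buf.append(x)
        if len(buf) == batch:
            yield buf  # one "tensor" with a batch dimension of size `batch`
            buf = []

def space_to_time(batches):
    """Inverse conversion: flatten the batch dimension back into time."""
    for b in batches:
        yield from b

batches = list(time_to_space(range(6), 3))   # [[0, 1, 2], [3, 4, 5]]
restored = list(space_to_time(batches))      # [0, 1, 2, 3, 4, 5]
```

The two directions correspond to the high-level use (introducing a batch dimension) and the low-level one (streaming a large tensor operation piece by piece).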
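The buffering issue in the second point can be sketched as follows (plain Python, hypothetical; cutting recurrences at chunk boundaries is a simplification a real compiler would not make): a unidirectional recurrence runs in constant memory, whereas a bidirectional one needs the future of the stream, so it is given an explicit buffer bounded to `chunk` elements.

```python
def forward_states(xs, f, init):
    """Unidirectional recurrence over a finite sequence: one state
    variable suffices, as in statically scheduled Lustre-style code."""
    out, s = [], init
    for x in xs:
        s = f(s, x)
        out.append(s)
    return out

def bidirectional_chunked(stream, f_fwd, f_bwd, init_fwd, init_bwd, chunk):
    """Bidirectional recurrence over a stream with explicit, bounded
    buffering: accumulate `chunk` inputs, run the forward and backward
    passes over the buffer, emit (fwd, bwd) state pairs, discard the
    buffer. Memory is bounded by `chunk`, not by the stream length."""
    buf = []
    for x in stream:
        buf.append(x)
        if len(buf) == chunk:
            fwd = forward_states(buf, f_fwd, init_fwd)
            bwd = list(reversed(
                forward_states(list(reversed(buf)), f_bwd, init_bwd)))
            yield from zip(fwd, bwd)
            buf = []

# Both directions compute running sums; chunks of 2 bound the buffer.
pairs = list(bidirectional_chunked([1, 2, 3, 4],
                                   lambda s, x: s + x, lambda s, x: s + x,
                                   0, 0, 2))
```

An implicit tape, by contrast, would buffer the entire history; the research question is how much of this explicit, bounded schedule a compiler can derive statically.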
In both cases, the internship will start with the analysis and MLR modeling of a complex ML application that will serve as the main use case: a Reinforcement Learning-based Autonomous Driving (AD) application from the automotive domain.
The internship will involve regular interactions with:
- Google DeepMind for the language design and compilation work.
- Our automotive partners (the ASTRA team and Valeo) for the evaluation of MLR on the AD use case.
Contact: More information on the internship offer can be obtained by contacting dumitru.potop@inria.fr
Main activities:
- State of the art analysis
- Use case modeling and evaluation
- Proposal of language extensions and compilation methods
- Participating in the writing of a research paper
Skills
Languages: Proficiency in either French or English is required.
Benefits
- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking
- Professional equipment available (videoconferencing, loan of computer equipment, etc.)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage
General information
- City: Paris
- Inria center: Centre Inria de Paris
- Desired start date: 2025-03-01
- Contract duration: 6 months
- Application deadline: 2024-11-30
Attention: Applications must be submitted online via the Inria website. Processing of applications sent through other channels is not guaranteed.
Instructions for applying
Defence and security:
This position may be assigned to a restricted-access area (zone à régime restrictif, ZRR), as defined in decree no. 2011-1425 on the protection of the nation's scientific and technical potential (PPST). Authorization to access such an area is granted by the head of the institution, following a favorable ministerial opinion, as defined in the order of 3 July 2012 on the PPST. An unfavorable ministerial opinion for a position assigned to a ZRR would result in the cancellation of the recruitment.
Recruitment policy:
As part of its diversity policy, all Inria positions are open to people with disabilities.
Contacts
- Inria team: AT-PRO AE
Recruiter:
Potop-Butucaru Dumitru / Dumitru.Potop_Butucaru@inria.fr
The keys to success
We are seeking a student who is highly motivated to do research at the intersection of Machine Learning, programming languages, and embedded systems.
About Inria
Inria is the national research institute dedicated to digital science and technology. It employs 2,600 people. Its 215 agile project teams, generally shared with academic partners, involve more than 3,900 scientists in meeting the challenges of the digital world, often at the interface with other disciplines. The institute draws on a wide range of talents across more than forty different professions. 900 research and innovation support staff help scientific and entrepreneurial projects with worldwide impact emerge and grow. Inria works with many companies and has supported the creation of more than 200 start-ups. The institute thus strives to meet the challenges of the digital transformation of science, society and the economy.