Embedded Machine Learning Programming (with potential PhD continuation)
Contract type : Internship
Level of qualifications required : Master's or equivalent
Function : Research internship
Context
This internship takes place in the context of a collaboration with the ASTRA joint Inria/Valeo team and with Google DeepMind on the topic of modeling and efficient implementation of complex Machine Learning applications onto embedded platforms.
Scientific context:
Conventional machine learning (ML) frameworks offer a tensor-centric view of the design and implementation of deep neural networks. But ML models do not stand by themselves as pure tensor functions. ML applications typically interact with an environment and often operate on streams of data collected and processed over time. For instance, the reactive control of a self-driving car operates on streams of data coming from sensors and commands sent to actuators. Training algorithms themselves embed a model into a reactive loop, itself decomposed into epochs and (mini-)batches allowing the efficient scheduling of computations and I/O, parameter updates, etc. The same applies to reinforcement learning (RL) agents. Returning to the automated driving example, stateful behavior is essential to taking into account previously inferred facts, such as speed limits or whether the current lane is a left-turn lane, long after the acquisition of the sensor inputs. Other examples of ML components embedded into stateful reactive feedback loops include predictive maintenance, model-predictive control, and digital twins. ML models themselves involve stateful constructs in the form of recurrent neural network (RNN) layers. When generating optimized code, even matrix products and convolutions in feedforward networks can be folded over time, using (stateful) buffering to reduce the memory footprint. In distributed settings, the efficient implementation of large models involves pipelined communications and computations, which amounts to locally recovering a streaming execution pattern.
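As an illustration of the folding-over-time idea, a convolution over a stream can be computed with a bounded, stateful buffer instead of materializing the full input tensor. The sketch below is a minimal Python illustration, not MLR code; the function name and streaming interface are hypothetical.

```python
from collections import deque

def stream_conv1d(stream, kernel):
    """Fold a 1-D convolution over time: instead of storing the whole
    input tensor, keep only the last len(kernel) samples in a bounded
    (stateful) buffer, reducing the memory footprint to a constant."""
    window = deque(maxlen=len(kernel))
    for x in stream:
        window.append(x)  # stateful buffering of recent inputs
        if len(window) == len(kernel):
            # emit one output sample per fully buffered window
            yield sum(w * k for w, k in zip(window, kernel))

# A width-3 box filter applied to a stream of samples:
print(list(stream_conv1d([1, 2, 3, 4, 5], [1, 1, 1])))  # [6, 9, 12]
```

The memory footprint is bounded by the kernel length regardless of how long the stream runs, which is the essence of the streaming execution pattern mentioned above.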
Considering this broad range of scenarios, we observe that existing ML frameworks inadequately capture reactive aspects, raising barriers between differentiable models and the associated control, optimization, and input/output code. These barriers worsen the gap between ML research and system capabilities, particularly in the area of control automation where embedded ML engineering relies on undisclosed, ad-hoc implementations.
In previous work, we proposed a reactive language, named MLR, integrating ML-specific constructs (such as bidirectional recurrences and tensor operations) and activities (such as automatic differentiation). We have also shown that, for applications without bidirectional recurrences, reactiveness does not penalize performance.
Assignment
The objective of this internship, and of the potential PhD follow-up, is to advance on either or both of the MLR language design and compilation fronts.
- On the language design (syntax and semantics) side, of particular interest is the introduction of iterators allowing the seamless conversion of iterations performed in time, on streams, into iterations performed in space, on tensors. Such transformations are needed both at a high level, e.g. to introduce a "batch" dimension into a computation, and at a low level, e.g. to specify how a large tensor operation is decomposed for execution on hardware.
- On the compilation side, the key difficulty is the handling of bidirectional recurrences. Classical reactive formalisms such as Lustre can be compiled into very efficient, statically scheduled code running in constant memory, without buffering. By comparison, the ML-specific bidirectional recurrences implicitly require buffering and dynamic scheduling (like the tape-based methods used during training). Replacing this implicit buffering with explicit, efficient, and bounded buffering under a mostly static scheduling has the potential to significantly improve the performance and predictability of the generated code.
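On the first (language design) point, the time-to-space conversion can be pictured as a pair of iterators turning a stream of samples into a stream of batch tensors and back. The Python sketch below only illustrates the intended semantics; the iterator names are hypothetical and not part of MLR.

```python
def to_space(stream, n):
    """Iteration in time -> iteration in space: group n consecutive
    stream samples into one size-n tensor (here, a plain list),
    e.g. to introduce a 'batch' dimension into a computation."""
    batch = []
    for x in stream:
        batch.append(x)
        if len(batch) == n:
            yield batch
            batch = []

def to_time(batches):
    """Iteration in space -> iteration in time: flatten tensors back
    into a stream of samples, e.g. when a large tensor operation is
    decomposed for execution on hardware."""
    for b in batches:
        yield from b

# Round trip: batching then flattening recovers the original stream.
print(list(to_time(to_space(range(6), 2))))  # [0, 1, 2, 3, 4, 5]
```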
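On the second (compilation) point, the contrast between forward and bidirectional recurrences can be sketched in plain Python; this illustrates the scheduling issue only, not MLR or its compiler. A forward recurrence runs online in constant memory, while a backward recurrence forces buffering of the whole input, like a tape.

```python
def forward_scan(stream, f, init):
    """Forward recurrence s[t] = f(s[t-1], x[t]): each output depends
    only on the past, so it runs online in constant memory, as in
    statically scheduled Lustre-style code."""
    s = init
    for x in stream:
        s = f(s, x)
        yield s

def backward_scan(stream, f, init):
    """Backward recurrence s[t] = f(s[t+1], x[t]): outputs depend on
    the future, so the whole input must be buffered (a 'tape')
    before the first output can be produced."""
    tape = list(stream)  # implicit, unbounded buffering
    out = []
    s = init
    for x in reversed(tape):
        s = f(s, x)
        out.append(s)
    return list(reversed(out))

add = lambda s, x: s + x
print(list(forward_scan([1, 2, 3], add, 0)))  # [1, 3, 6]
print(backward_scan([1, 2, 3], add, 0))       # [6, 5, 3]
```

Making the tape in `backward_scan` explicit, bounded, and mostly statically scheduled is the kind of transformation the compilation work targets.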
In both cases, the internship will start with the analysis and MLR modeling of a complex ML application that will be used as main use case: a Reinforcement Learning-based Autonomous Driving (AD) application from the automotive domain.
The internship will involve regular interactions with:
- Google DeepMind for the language design and compilation work.
- Our automotive partners (the ASTRA team and Valeo) for the evaluation of MLR on the AD use case.
Contact: More information on the internship offer can be obtained by contacting dumitru.potop@inria.fr
Main activities
- State-of-the-art analysis
- Use case modeling and evaluation
- Proposal of language extensions and compilation methods
- Participating in the writing of a research paper
Skills
Languages : Proficiency in either French or English is required.
Benefits package
- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking
- Professional equipment available (videoconferencing, loan of computer equipment, etc.)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage
General Information
- Town/city : Paris
- Inria Center : Centre Inria de Paris
- Starting date : 2025-03-01
- Duration of contract : 6 months
- Deadline to apply : 2024-11-30
Warning : you must enter your e-mail address in order to save your application to Inria. Applications must be submitted online on the Inria website. Processing of applications sent from other channels is not guaranteed.
Instruction to apply
Defence Security :
This position is likely to be situated in a restricted area (ZRR), as defined in Decree No. 2011-1425 relating to the protection of national scientific and technical potential (PPST). Authorisation to enter an area is granted by the director of the unit, following a favourable Ministerial decision, as defined in the decree of 3 July 2012 relating to the PPST. An unfavourable Ministerial decision in respect of a position situated in a ZRR would result in the cancellation of the appointment.
Recruitment Policy :
As part of its diversity policy, all Inria positions are accessible to people with disabilities.
Contacts
- Inria Team : AT-PRO AE
Recruiter :
Dumitru Potop-Butucaru / Dumitru.Potop_Butucaru@inria.fr
The keys to success
We are seeking a student who is highly motivated to do research at the intersection of Machine Learning, programming languages, and embedded systems.
About Inria
Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 200 agile project teams, generally run jointly with academic partners, include more than 3,500 scientists and engineers working to meet the challenges of digital technology, often at the interface with other disciplines. The Institute also employs numerous talents in over forty different professions. 900 research support staff contribute to the preparation and development of scientific and entrepreneurial projects that have a worldwide impact.