2023-05774 - Post-Doctoral Research Visit F/M Data models and computational cost estimates for European AI sovereignty

Contract type : Fixed-term contract

Renewable contract : Yes

Level of qualifications required : PhD or equivalent

Function : Post-Doctoral Research Visit

Level of experience : Recently graduated

Context

This post-doctoral position is part of the European project EICACS on standardization for collaborative air combat, financed by the European Defense Fund, in which Inria is responsible for the work package focusing on European sovereignty of AI. This large-scale project, coordinated by Dassault Aviation, involves partners from industry and academia in 10 European countries.

The work coordinated by Inria aims to inventory the tools, methods and public libraries of AI, in order to conduct a risk analysis of sovereignty from a technical point of view, quantify the shortfall when a sovereign technology performs worse than its non-sovereign equivalent, and develop recommendations for the R&D work necessary to improve the sovereignty of the AI landscape at the European level.

The ambition is to answer questions such as:

  • If a library is open source but maintained by a non-European company (e.g. TensorFlow), does using it in our European systems still create a foreign dependency? Is it easy to substitute another library a posteriori? Could a political decision by the company maintaining the solution prevent us from developing certain functions in the future? If a change in the conditions of use makes it impossible to use future updates of the library in our systems, what risks (functional or security) do we face?
  • What features do European libraries lack to support the implementation of the latest state-of-the-art methods? What open source developments (abstraction layers, bindings...) would be necessary to substitute them more easily for non-sovereign libraries?
  • If a resource (a dataset, a pre-trained model) is created or distributed by a non-European actor, how can we ensure that it has not been maliciously modified by its author (data poisoning, backdoor...)?
  • If for these reasons one chooses not to use a certain pre-trained model that is a standard in the community (e.g. BERT), what loss of performance will there be? Do some functions become impossible to develop? What is the cost of reproducing the same model at the European level? How should the models to be reproduced be prioritized?
  • How can we make the most of the resources already existing at the European level? What common data models should be used to assemble these individual resources into larger unified resources, and how should future developments be federated?
  • Does the use of certain tools or libraries require the use of components (hardware architecture, specific graphics card) that are only produced in a certain non-European country? What is the loss of performance in case of replacement by another component? Could it be compensated by more hardware investments (and at what cost)?

Since the scope of the study is particularly wide, answers to these questions will be developed by focusing the analysis on a set of functions and libraries identified as priorities (on the basis of use cases provided by the partners).

This post-doctoral fellowship will focus more specifically on aspects related to data models and computational cost estimates. The post-doctoral fellow will work under the supervision of Lauriane Aufrant (AI researcher at Inria Defense & Security), and in close collaboration with the academic and industrial partners of the project.

Assignment

The post-doc will start with a series of bibliographic studies, aiming to identify the set of relevant methods to be considered for risk analysis (those that meet the functional needs and technical constraints of the use cases, to be determined in coordination with the industrial partners), as well as the existing tools and libraries necessary or useful to implement these methods. Particular emphasis will be put on identifying frameworks that already provide an abstraction and interoperability layer for the models, and on characterizing their capabilities.

The work will then consist in revisiting the different tools and methods of this list, in order to carry out a sovereignty analysis based on experimental evidence: for example, reproducing a large pre-trained model on a smaller scale, in order to estimate the cost of reproducing the complete model (computational cost but also environmental, human, financial cost...).
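
As a purely illustrative sketch of such an extrapolation (not a method prescribed by the project), the Python snippet below scales the GPU-hours measured on a small reproduction up to a full-size model. It assumes the common rule of thumb that training a dense transformer costs about 6 FLOPs per parameter per token, and that hardware utilization is the same at both scales. All names (`Run`, `extrapolate_gpu_hours`, `cost_estimate`) and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Run:
    params: float  # number of trainable parameters
    tokens: float  # number of training tokens


def training_flops(run: Run) -> float:
    # Rule-of-thumb assumption: ~6 FLOPs per parameter per training token.
    return 6.0 * run.params * run.tokens


def extrapolate_gpu_hours(small: Run, small_gpu_hours: float, full: Run) -> float:
    # Assumes identical hardware and utilization at both scales,
    # so GPU-hours grow in proportion to training FLOPs.
    return small_gpu_hours * training_flops(full) / training_flops(small)


def cost_estimate(gpu_hours: float, gpu_power_kw: float, pue: float,
                  price_per_gpu_hour: float) -> dict:
    # pue: data-center power usage effectiveness (overhead factor >= 1).
    return {
        "gpu_hours": gpu_hours,
        "energy_kwh": gpu_hours * gpu_power_kw * pue,
        "financial_cost": gpu_hours * price_per_gpu_hour,
    }


# Hypothetical numbers: a small run measured at 2 GPU-hours,
# extrapolated to a model with 10x the parameters and 10x the tokens.
small = Run(params=1e7, tokens=1e9)
full = Run(params=1e8, tokens=1e10)
hours = extrapolate_gpu_hours(small, small_gpu_hours=2.0, full=full)
estimate = cost_estimate(hours, gpu_power_kw=0.5, pue=1.5, price_per_gpu_hour=2.0)
print(estimate)
```

Such a linear extrapolation ignores human cost, failed runs and hyperparameter search, so it would at best give a lower bound to be refined experimentally.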


Finally, it will be necessary to formulate concrete proposals for actions to make the European AI landscape more sovereign, for example by proposing formalisms (new data models, abstractions unifying different methods, etc.) making sovereign components more interoperable with their equivalents.

The results of these studies will be disseminated through scientific publications, in particular reviews, position papers, reproduction studies, definitions of new formalisms, etc.

Main activities

  • Bibliographic research
  • Needs analysis
  • Experiments and computational cost estimates
  • Drafting of recommendations
  • Publication of scientific articles

Skills

  • PhD in artificial intelligence, or about to obtain one
  • Theoretical and practical knowledge of deep learning, but also of other machine learning and symbolic AI methods
  • Good programming skills, comfortable with rapid implementation of experiments
  • Ability to efficiently perform a state-of-the-art review on various topics
  • Willingness to diversify skills and knowledge by exploring multiple areas of AI

Fluency in French and English.

Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training