Contract type: Fixed-term contract (CDD)
Required degree level: Master's degree (Bac + 5) or equivalent
Position: PhD student
Desired experience level: Recent graduate
About the research centre or functional department
The Inria Rennes - Bretagne Atlantique Centre is one of Inria's eight centres and has more than thirty research teams. The centre is a major and recognized player in the field of digital sciences. It is at the heart of a rich R&D and innovation ecosystem: highly innovative SMEs, large industrial groups, competitiveness clusters, research and higher education players, laboratories of excellence, a technological research institute, etc.
Context and advantages of the position
Financing Project
This PhD will be done in the context of the ACROSS EuroHPC project (2021-2023), focused on enabling efficient execution of complex workflows combining simulation, analytics and learning across hybrid infrastructures (HPC/cloud/edge).
Assignment
Introduction
As Artificial Intelligence has recently gained unprecedented momentum in a rapidly increasing number of application areas, Deep Neural Networks (DNNs) are becoming a pervasive tool across a large range of domains, including autonomous driving, industrial automation, and pharmaceutical research, to name just a few.
As these neural network architectures and their training data become more and more complex, so do the infrastructures needed to execute them fast enough. Hyperparameter setting and tuning, training, inference, and dataset handling all put growing pressure on the underlying compute infrastructure and call for novel approaches at all levels of the workflow, including the algorithmic level, the middleware and deployment level, and the resource optimization level.
Thesis proposal
In this thesis we focus on the middleware and deployment level. Understanding the end-to-end performance of complex AI workloads deployed on a digital continuum that may include hybrid resources (HPC systems, clouds, edge devices) is challenging. It comes down to reconciling many, typically conflicting, constraints with low-level infrastructure design choices. One important challenge is to enable accurate, reproducible experimental investigation of the relevant behaviors of a given application workflow under representative settings of the physical infrastructure. This includes automated experiment configuration at scale based on a set of previously identified deployment scenarios, experiment execution on large testbeds (e.g., Grid’5000 [G5K]), metrics collection and analysis, and management of experimental artifacts to ensure repeatability, replicability and reproducibility.
Main activities
To address these challenges, we will define an experimental framework and a methodology leveraging the E2Clab approach [E2Clab2020, Ros2020] initiated in the KerData team at Inria, and extend it to cover the complete computing continuum. In particular, E2Clab will be extended with features such as GPU virtualization, containerization or support for microservice architectures. Our goal is to enable reproducible experimentation with complex AI workloads across hybrid infrastructures and to help optimize deployment strategies depending on multiple factors, including the application characteristics, the target performance metrics and the features of the available execution hardware. We aim to answer questions such as: How do the various possible deployment options of complex AI workflows on the available underlying infrastructure impact performance metrics? How can this infrastructure be best leveraged in practice, potentially through seamless integration of supercomputers, clouds, and fog/edge systems?
The main expected outcomes are: (1) an experimental, reproducibility-oriented methodology and its validation in practice through novel insights it can enable (e.g., through the experimentation of alternative scheduling strategies), and (2) an associated underlying software framework for experiment deployment, monitoring, and execution at scale on various relevant scalable infrastructures.
International visibility and mobility
The thesis will be conducted in collaboration with several partners including DFKI (René Schubotz) and the University of Düsseldorf (Michael Schöttner).
References
[Ros2020] Daniel Rosendo, Pedro Silva, et al. (2020) E2Clab: Exploring the Computing Continuum through Repeatable, Replicable and Reproducible Edge-to-Cloud Experiments. Cluster 2020 - IEEE International Conference on Cluster Computing, Sep 2020, Kobe, Japan.
[E2Clab2020] The E2Clab project: https://team.inria.fr/kerdata/e2clab/.
[G5K] The Grid’5000 experimental testbed: https://www.grid5000.fr/w/Grid5000:Home.
Skills
- Strong knowledge of computer networks and distributed systems
- Knowledge of storage and (distributed) file systems
- Ability and motivation to conduct high-quality research, including publishing the results in relevant venues
- Strong programming skills (e.g., C/C++, Java, Python)
- Work experience in Big Data management, cloud computing or HPC is an advantage
Benefits
- Subsidised catering service
- Partially-reimbursed public transport
Remuneration
Monthly gross salary of 1982 euros for the first and second years and 2085 euros for the third year
General information
- Theme/Domain: Distributed and High Performance Computing, Systems & networks (BAP E)
- City: Rennes
- Inria centre: CRI Rennes - Bretagne Atlantique
- Desired start date: 2021-01-01
- Contract duration: 3 years
- Application deadline: 2021-03-03
Contacts
- Inria team: KERDATA
- PhD supervisor: Costan Alexandru / Alexandru.Costan@irisa.fr
Keys to success
- An excellent Master's degree in computer science or equivalent (e.g., engineering)
- Very good communication skills in oral and written English
- Open-mindedness, strong integration skills and team spirit
About Inria
Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 200 agile project teams, generally run jointly with academic partners, involve more than 3,500 scientists in meeting the challenges of digital technology, often at the interface with other disciplines. The institute draws on many talents in more than forty different professions. 900 research and innovation support staff help scientific and entrepreneurial projects with worldwide impact to emerge and grow. Inria works with many companies and has supported the creation of more than 180 start-ups. The institute thus strives to meet the challenges of the digital transformation of science, society and the economy.
Instructions to apply
Please submit online: your resume, cover letter and, if available, letters of recommendation.
For more information, please contact costan.alexandru@inria.fr.
Defence security:
This position is likely to be assigned to a restricted area (zone à régime restrictif, ZRR), as defined in Decree No. 2011-1425 relating to the protection of the nation's scientific and technical potential (PPST). Authorisation to access such an area is granted by the head of the institution, following a favourable ministerial opinion, as defined in the order of 3 July 2012 relating to the PPST. An unfavourable ministerial opinion for a position assigned to a ZRR would result in the cancellation of the recruitment.
Recruitment policy:
As part of its diversity policy, all Inria positions are accessible to people with disabilities.
Warning: Applications must be submitted online via the Inria website. Processing of applications sent through other channels is not guaranteed.