2018-01037 - Postdoctoral Researcher / Converged Storage for Joint HPC and Big Data Processing

Contract type: Fixed-term public-sector contract (CDD de la fonction publique)

Required degree level: PhD or equivalent

Position: Postdoctoral researcher

About the research centre or functional department

The Inria Rennes - Bretagne Atlantique centre is one of Inria's eight centres and hosts more than thirty research teams. The centre is a major, recognized player in the field of digital sciences. It sits at the heart of a rich R&D and innovation ecosystem: highly innovative SMEs, large industrial groups, competitiveness clusters, research and higher-education institutions, laboratories of excellence, and a technological research institute.

Context and advantages of the position

The proposed position is located within the KerData (http://www.irisa.fr/kerdata/doku.php?id=kerdata) research team at Inria Rennes. Led by Gabriel Antoniu, the KerData team focuses on scalable Big Data storage and processing on clouds and post-Petascale platforms, in line with the current needs and requirements of data-intensive applications.

This work will be supervised by Alexandru Costan and Gabriel Antoniu. It will be carried out in the context of the Frameworks work package of the HPC-BigData IPL, in which the KerData team (Rennes) and the Zenith team (Montpellier) collaborate. It will use a concrete application scenario made available by the Pl@ntNet application, one of the reference applications of the HPC-Big Data IPL. This work will complement the work plan of a research master internship (M2 level) proposed by the KerData team for the summer of 2019, which focuses on HPC-Big Data convergence at the processing level (combining in situ/in transit processing with stream-based processing).

The postdoc hired on this position will co-advise the master intern.

Assignment

In the High Performance Computing (HPC) area, the need to get fast and relevant insights from massive amounts of data generated by extreme-scale computations led to the emergence of in situ and in transit processing approaches. These allow data to be visualized and processed interactively, in real time, as they are produced, in contrast to traditional offline analysis. In the Big Data area, however, the search for real-time analysis took shape through a different approach: stream-based processing, which consists of processing an unbounded flow of small data items that are generated by many data sources and arrive at high rates.


This illustrates how tools and cultures from HPC and Big Data Analytics (BDA) have evolved in divergent directions: essentially, they were motivated by different optimization criteria. Recently, however, a new converging processing paradigm has started to be explored, which couples state-of-the-art BDA tools with HPC simulations in order to predict and react to different situations in real time. This new hybrid HPC/BDA paradigm is illustrated in Figure 1. There are several use cases for this type of system. For instance, for traffic flow optimization, HPC simulations can be used to predict an optimized theoretical distribution of cars on a city’s road network, and their results can be merged with predictions based on historical data to obtain more accurate predictions. Another example is connected car maintenance, where HPC simulations can be used to predict the stress on car parts and, similarly, their results can be combined with BDA predictions calculated from historical and real-time data.


To define such a system, it is necessary to address several issues related to the integration of these two very heterogeneous worlds. In this postdoc subject, we focus on storage issues.
HPC and BDA applications have different data storage requirements. HPC applications are usually run on supercomputers and, hence, their file systems must support massive concurrent accesses from processes and tasks. BDA applications, on the other hand, commonly follow a “write once, read many” model, meaning that their file systems must be optimized for many parallel read operations. In this context, HPC applications commonly use parallel file systems such as Lustre, PVFS or OrangeFS, while BDA applications use distributed key-value stores like DynamoDB or blob-based storage such as Ceph.


A first storage approach for hybrid HPC/BDA systems would be to simply use separate native solutions for HPC and BDA frameworks, namely parallel file systems and key-value/blob-based storage systems, respectively. In the state of the art, however, there are also hybrid approaches, i.e. file systems or converged blob-based storage systems that can be used on both HPC and BDA systems, such as HDFS and Tyr, respectively. Such approaches, which have shown interesting preliminary performance results, open the possibility of using the same data storage strategy for HPC and BDA frameworks.


Our objective will be to analyze the available storage solutions, identify and group different extreme-scale scenarios and, for each of them, evaluate the performance of data storage systems in terms of metrics such as processing latency (time taken to calculate a prediction, cf. Figure 1) and data throughput (observed data throughput from HPC and BDA frameworks to the storage systems, cf. Figure 1).

Main activities

Main activities:

  • Analyze and characterize existing storage solutions
  • Evaluate the performance of state-of-the-art storage solutions through large-scale experiments
  • Design the architecture of a unified framework for HPC-Big Data processing and implement it
  • Evaluate the unified framework at large scale with real applications available within the HPC-Big Data Inria Project Lab (IPL)
  • Write research papers and present them at leading international and national venues

Additional activities:

  • Co-supervise master interns
  • Participate in meetings with partners in the HPC-Big Data IPL and related collaborative projects

 

Skills

Applicants should have a doctoral degree in computer science and a strong background in operating systems and distributed computing. Excellent programming skills and experience with distributed experimental platforms are appreciated. Knowledge of C# is helpful.

 

Benefits

  • Subsidized meals
  • Partial reimbursement of public transport costs

Remuneration

Gross monthly salary of 2653 euros