PhD Position F/M Compilation of a DSL based on transducers to SIMD optimized programs

The job description below is in English

Contract type: Fixed-term contract

Required degree level: Master's degree (five years of higher education) or equivalent

Position: PhD student

About the centre or functional department

The Inria research centre in Lyon is the 9th Inria research centre, formally created in January 2022. It brings together approximately 320 people in 19 research teams and research support services.

Its staff are distributed in Villeurbanne, Lyon Gerland, and Saint-Etienne.

The Lyon centre is active in the fields of software, distributed and high-performance computing, embedded systems, quantum computing and privacy in the digital world, but also in digital health and computational biology.

Context and assets of the position

This PhD is part of the larger Shannon meets Cray (SxC) project, which aims to facilitate the writing of SIMD programs and to pave the way for future auto-vectorization methods for stream processing.

This PhD will be co-advised by Gabriel Radanne (Inria Lyon) and Charles Paperman (Université de Lille) and can be based in Lyon, with visits to Lille during the PhD.

Assignment

Efficient Data Processing

Streaming data processing is a crucial approach that focuses on traversing data to extract pertinent information. Applications range from network packet manipulation to DNA analysis. Modern data-processing tools depend heavily on efficient implementations that harness hardware acceleration to achieve high performance. This acceleration can sometimes be achieved through automatic compilation, but it frequently requires expert developers to craft optimizations by hand.

One critical facet of this optimization process is SIMD optimization, where data is packed into chunks and processed with minimal branching in the code, often using bitvector operations. These optimizations are at the core of numerous well-known software applications, such as regular expression matching in tools like ripgrep, JSON parsing in libraries like simdjson, and even fundamental operations like string encoding and decoding (Unicode parsing). Developing these optimizations requires a broad skill set and is a testament to the expertise of programmers worldwide.
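To give a flavour of this style of code, here is a minimal sketch in Python of the "pack into chunks, avoid branches, use bitvector operations" idea. It is only an illustration: a plain Python integer stands in for a 64-bit register, and the names match_mask and match_positions are hypothetical, not taken from ripgrep, simdjson or any other tool mentioned above. Real implementations would use hardware SIMD instructions on wider registers.

```python
# Illustrative sketch only: a Python integer stands in for a 64-bit register
# holding 8 byte lanes. All names here are hypothetical.
LOW = 0x0101010101010101    # byte 0x01 repeated in all 8 lanes
LOW7 = 0x7F7F7F7F7F7F7F7F   # byte 0x7F repeated in all 8 lanes
HIGH = 0x8080808080808080   # byte 0x80 repeated in all 8 lanes


def match_mask(chunk: bytes, target: int) -> int:
    """Return a word whose lane i has bit 0x80 set iff chunk[i] == target.

    `target` is assumed to be a byte value in the range 0..255.
    """
    if len(chunk) != 8:
        raise ValueError("this sketch processes exactly 8 bytes at a time")
    word = int.from_bytes(chunk, "little")   # pack 8 bytes into one word
    diff = word ^ (LOW * target)             # matching lanes become 0x00
    nonzero = ((diff & LOW7) + LOW7) | diff  # 0x80 set in every non-zero lane
    return ~nonzero & HIGH                   # 0x80 set in every zero lane


def match_positions(chunk: bytes, target: int) -> list[int]:
    """Decode the mask into byte offsets (for display only)."""
    mask = match_mask(chunk, target)
    return [i for i in range(8) if (mask >> (8 * i + 7)) & 1]


print(match_positions(b'a"b"cd"e', ord('"')))  # [1, 3, 6]
```

The comparison of all eight bytes against the target happens in a fixed sequence of arithmetic and bitwise operations, with no per-byte conditional; only the decoding step, which exists for display, loops over the lanes.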

Exploring a Restricted Programming Language

During this PhD, we will explore the design and implementation of a specialized programming language for stream processing and its compilation to efficient SIMD code. The techniques will take inspiration from real software designs (such as rsonpath) and will be based on abstract automata theory and logical approaches. Initially, we will focus on a class of limited expressivity, named LTL, whose theoretical properties are well understood.

Following this initial exploration, depending on the interest of the student, we will focus on one of the following:

  • Extending our design to other language constructs that admit efficient vectorised implementations
  • Formally studying the link between our language and finite state transducers and their expressivity
  • Outputting real-world efficient code for existing instruction sets
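As a purely illustrative sketch of how a logical operator might map to bitvector code, consider the following Python function. It assumes that LTL here refers to linear temporal logic and uses its past-time "once" operator (true at a position if the property held at that position or earlier) as a hypothetical example; the posting does not specify which operators the language will contain. The function name once and the chunk layout (bit i of a mask is position i of the chunk, least significant bit first) are assumptions made for this example.

```python
# Illustrative sketch only: evaluates a hypothetical "once" operator over a
# stream of bit masks, one mask per chunk of `width` positions.
def once(masks, width=64):
    """For each chunk, set every bit at or after the first set bit seen so far."""
    full = (1 << width) - 1
    carry = 0          # 1 if the property already held in an earlier chunk
    result = []
    for m in masks:
        # m | -m sets all bits from the lowest set bit of m upward;
        # -carry is all ones when carry == 1 and zero otherwise.
        r = (m | -m | -carry) & full
        result.append(r)
        carry = r >> (width - 1)   # the top bit tells the next chunk what happened
    return result


# With 4-bit chunks: nothing seen, then a hit at position 2, then an empty chunk.
print([bin(r) for r in once([0b0000, 0b0100, 0b0000], width=4)])
# ['0b0', '0b1100', '0b1111']
```

Each chunk is handled by a constant number of bitwise operations, and the only state carried between chunks is a single bit, which is the kind of structure that lends itself to vectorised stream processing.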

Main activities

Tasks

  • Play with handcrafted efficient SIMD code
  • Design a small subset of the language and a toy compilation scheme
  • Benchmark the resulting code on concrete examples
  • Optionally: Study algebraically the class of functions designed
  • Optionally: Expand the language with more operations

 

Skills

Candidate profile

The candidate should ideally be familiar with formal approaches in programming language design, notably type systems, semantics, and logic. From a practical point of view, basic experience in software development and in the use of collaborative tools such as git is expected. This PhD relies strongly on the idea that practical implementations should have strong theoretical foundations and that further refinements of the theory should draw inspiration from the practical side. We expect the candidate to agree with this philosophy.

Benefits

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage

Remuneration

Gross salary: 2200 euros per month