Offer no. 2023-06916

PhD Position F/M Dynamically Configurable Deep Neural Network Hardware Accelerators

The offer description below is in English

Contract type: Fixed-term contract (CDD)

Required degree level: Master's degree (Bac + 5) or equivalent

Role: PhD student

About the centre or functional department

The Inria Centre at Rennes University is one of Inria’s eight centres and has more than thirty research teams. The Inria Centre is a major and recognized player in the field of digital sciences. It is at the heart of a rich R&D and innovation ecosystem: highly innovative SMEs, large industrial groups, competitiveness clusters, research and higher education players, laboratories of excellence, a technological research institute, etc.

Context and assets of the position

Fully funded 3-year Ph.D. thesis. Talented last-year Master's students may start as interns and continue after graduation as Ph.D. researchers.

Context & background

State-of-the-art Deep Neural Network (DNN) models [1] usually require large amounts of training data and contain a tremendous number of parameters, leading to high overall requirements in computation, memory, and thus energy. In recent years, this has given rise to approaches for reducing these requirements, such as pruning [2], quantization [2], [3], or neural architecture search (NAS) [4]. Many hardware optimizations have been proposed to accelerate DNN inference on different architectures, but they mainly consider static scenarios, which may not be efficient in realistic cases. For many applications, the performance requirements of a DNN model deployed on a given hardware platform are not static but evolve dynamically as its operating conditions and environment change. However, state-of-the-art DNN approaches do not address this variability and these dynamics at both the HW and SW levels.

Runtime-configurable hardware accelerators are necessary in a variety of scenarios. One example is edge computing, which involves processing data near the source of data collection instead of sending it to a centralized cloud or data center. In edge computing scenarios, power and computing resources are often limited, and configurable hardware accelerators can help optimize the energy efficiency of AI models, leading to longer battery life and reduced energy consumption.

In contrast to conventional AI accelerators (e.g., GPUs, Google TPU), runtime-configurable HW accelerators for DNNs [5] [6] [7] [8] [9] [10] can change different parameters of the computations they perform (e.g., data type and precision, dataflow architecture) at runtime. This approach allows for more efficient use of hardware resources and improved energy efficiency for AI applications by supporting operations at various precision levels, often configurable at runtime, and sparse weight storage techniques.

Hardware architecture design space exploration is usually based on high-level models of hardware operations and treats the design task as a multi-dimensional optimization problem [11]. For DNN hardware accelerators, previous attempts at architectural design space exploration exist [12]. Nevertheless, all the resulting designs delegate runtime adaptation to higher levels and do not provide models to control it efficiently. Thus, an efficient and accurate approach is needed to model the impact of different configurations of DNN HW accelerator parameters on attributes of interest (e.g., inference latency and accuracy, energy consumption). This is of paramount importance for building a runtime controller able to fine-tune HW (and SW) parameters at runtime to attain specific accuracy, inference time, and energy objectives.

This thesis is part of a broader research project, whose focus is to propose an original interdisciplinary approach (relying on complementary expertise in artificial intelligence, hardware architecture design, and control theory) that allows DNN models to be dynamically configured at runtime on a given reconfigurable hardware accelerator architecture, depending on the external environment. Such reconfigurable AI systems will require far fewer resources on average and thus allow for substantial energy savings in many applications.

This thesis will focus on HW aspects. AI accelerators need to support different configurations of various
The major challenge is to predict the impact of different HW configurations on attributes of interest, such as the final DNN accuracy, energy, and performance. To do so, creating effective models to predict such impact is critical.


Assigned mission
Ph.D. thesis goal

The goals of this thesis are:

1. Perform an extensive design space exploration to accurately characterize the effect of different configurable HW parameters on various attributes of interest (e.g., inference latency and accuracy, energy consumption).
2. Synthesize the characterization results in a surrogate model that accurately and quickly predicts the attributes of interest, even for configurations that were not explicitly characterized.
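
To give a rough idea of what a surrogate model in the second goal might look like, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the method the project will use: the two parameters (weight bit-width and number of processing elements), the latency values, and the function name `predict_latency` are all hypothetical, and the predictor is a simple inverse-distance-weighted average over the nearest characterized configurations.

```python
from math import dist  # Euclidean distance, available since Python 3.8

# Hypothetical characterization data: (weight bit-width, number of processing
# elements) -> inference latency in ms. The values are made up for
# illustration and do not come from any real accelerator.
CHARACTERIZED = {
    (4, 16): 9.0,
    (4, 64): 3.0,
    (8, 16): 16.0,
    (8, 64): 5.0,
    (16, 16): 30.0,
    (16, 64): 9.0,
}


def predict_latency(config, data=CHARACTERIZED, k=3):
    """Estimate latency for a configuration that may not have been characterized.

    Returns the stored value when the configuration is known; otherwise an
    inverse-distance-weighted average of the k nearest characterized points.
    Parameters are used unnormalized here, which a real model would revisit.
    """
    if config in data:
        return data[config]
    nearest = sorted(data, key=lambda c: dist(c, config))[:k]
    weights = [1.0 / dist(c, config) for c in nearest]
    return sum(w * data[c] for w, c in zip(weights, nearest)) / sum(weights)


# Query a configuration that was never synthesized or simulated.
estimate = predict_latency((8, 32))
```

Such a predictor answers in microseconds, whereas characterizing the same configuration would require a full synthesis and simulation run; the open research question is how to make the prediction accurate enough.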

Main activities

In more detail, the Ph.D. researcher will first focus on assessing the effect of different sets of HW parameters on the attributes of interest, such as inference latency, throughput, and energy consumption. The assessment will determine how to combine the HW parameters to provide different trade-offs among these attributes. Assessing the effect of all possible combinations of DNN HW accelerator parameters requires highly time-consuming HW syntheses and simulations. Therefore, the first challenge will be to propose practical approaches to decompose the main problem into smaller ones and to design and develop efficient and accurate solutions independently of the final application. The second challenge will be to propose innovative ways to adapt the assessment to the final DNN application.

Then, the student will focus on modeling the effect of HW parameters on the attributes of interest. A combination of HW and SW parameters constitutes an operating configuration of the system. Simulating and enumerating all possible operating configurations in the runtime controller might not be feasible and is certainly not scalable. Therefore, the third challenge will be to identify the most significant HW parameters and to design and develop predictive models (e.g., statistical or machine learning models) of the HW-level attributes of interest.
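
The scalability concern and the notion of trade-offs described above can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the parameter names and value ranges in `PARAMS`, and the helper names `enumerate_configs` and `pareto_front`, are hypothetical and do not describe any particular accelerator.

```python
from itertools import product

# Hypothetical configurable parameters and their allowed values.
PARAMS = {
    "weight_bits": [4, 8, 16],
    "activation_bits": [4, 8, 16],
    "num_pes": [16, 32, 64],
    "dataflow": ["weight_stationary", "output_stationary"],
}


def enumerate_configs(params):
    """List every operating configuration as a dict (Cartesian product)."""
    names = list(params)
    return [dict(zip(names, values)) for values in product(*params.values())]


def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    def dominates(a, b):
        # a dominates b if it is no worse on every objective and not identical.
        return all(x <= y for x, y in zip(a, b)) and a != b

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Four small parameters already yield 3 * 3 * 3 * 2 = 54 configurations,
# each of which would need its own synthesis and simulation run.
configs = enumerate_configs(PARAMS)

# Made-up (latency, energy) pairs: (3, 3) is dominated by (2, 2) and dropped.
front = pareto_front([(1, 5), (2, 2), (3, 3), (5, 1)])
```

The configuration count is the product of the number of values per parameter, so each additional parameter multiplies the characterization cost. This is why identifying the most significant parameters, and predicting rather than measuring the rest, matters.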

The Ph.D. researcher will meet regularly with other project participants to collaborate closely, will attend regular workshops within the research project framework to share advancements and findings with the partners, and will be expected to publish and present scientific research results at prestigious venues.
We seek highly motivated and passionate candidates. Autonomy is a highly appreciated quality.

**Required Skills:**

**HW design:** VHDL/Verilog, HW synthesis flow (design, simulation, synthesis, and deployment through commercial tools for FPGA or ASIC)

**Experience in Computer architecture:** Instruction Set Architecture (ISA), Microarchitecture, and Systems design. Knowledge about hardware architectures of Neural Network accelerators is a plus.

**SW Programming/Scripting:** C/C++, Python, Linux scripting

**Experience with Deep Neural Network development frameworks:** PyTorch/TensorFlow

Experience with **High-Level Synthesis (HLS)** and related tools (e.g., Vivado/Vitis HLS or Siemens Catapult) is a plus.

**Required degree:**

Candidates must have a Master's degree (or equivalent) in **Computer Engineering** or **Electronic Engineering**.

**Talented last-year Master's students may start as 6-month interns and continue as Ph.D. researchers after graduation.**

**Environment:**

The Ph.D. thesis will be carried out in collaboration with LIRIS (INSA Lyon), GIPSA-lab (Grenoble Alpes University), and Inria Lyon in the context of the ANR Project RADYAL. The Ph.D. candidate will be supervised by the TARAN team ([https://team.inria.fr/taran/](https://team.inria.fr/taran/)) at the Inria Centre at Rennes University, IRISA laboratory, in France. The Ph.D. salary will follow standard French rates.

**Ph.D. direction team:**

Supervisor: Olivier Sentieys / Olivier.Sentieys@irisa.fr
Co-supervisor: Marcello Traiola / marcello.traiola@inria.fr

**Benefits**

- Subsidized meals
- Partial reimbursement of public transport costs
- Possibility of teleworking (90 days per year) and flexible organization of working hours
- Partial payment of insurance costs

**Remuneration**

Monthly gross salary of 2082 euros for the first and second years and 2190 euros for the third year.

**General information**

- **Theme/Domain:** Architecture, languages and compilation
  Systems & networks (BAP E)
- **City:** Rennes
- **Inria Centre:** Inria Centre at Rennes University
- **Desired start date:** 2024-02-01
- **Contract duration:** 3 years
- **Application deadline:** 2024-01-14

**Contacts**

- **Inria team:** TARAN
  Thesis supervisor: Olivier Sentieys / Olivier.Sentieys@irisa.fr

**About Inria**

Inria is the French national research institute dedicated to digital science and technology. It employs 2600 people. Its 215 agile project teams, generally run jointly with academic partners, involve more than 3900 scientists in meeting the challenges of digital technology, often at the interface with other disciplines. The institute draws on many talents in more than forty different professions. 900 research and innovation support staff help scientific and entrepreneurial projects with worldwide impact to emerge and grow. Inria works with many companies and has supported the creation of more than 200 start-ups. The institute thus strives to meet the challenges of the digital transformation of science, society, and the economy.

**Warning**: Applications must be submitted online on the Inria website. Processing of applications sent through other channels is not guaranteed.

**Instructions for applying**

Please submit online: your resume, cover letter, and letters of recommendation, if any.

For more information, please contact marcello.traiola@inria.fr

**Defence security**:
This position is likely to be located in a restricted-access area (zone à régime restrictif, ZRR), as defined in Decree No. 2011-1425 on the protection of the nation's scientific and technical potential (PPST). Authorisation to access such an area is granted by the head of the institution, following a favourable ministerial opinion, as defined in the order of 3 July 2012 relating to the PPST. An unfavourable ministerial opinion for a position located in a ZRR would result in the cancellation of the recruitment.

**Recruitment policy**:
As part of its diversity policy, all Inria positions are accessible to people with disabilities.