2018-00363 - PhD Position : "Design-space exploration of fault-tolerant multicores"

Level of qualifications required: Graduate degree or equivalent
Other valued qualifications: Master
Function: PhD Position
Level of experience: Recently graduated

About the research centre or Inria department
The Cairn project-team researches new architectures, algorithms and design methods for flexible and energy-efficient domain-specific systems-on-chip (SoCs). As the performance and energy-efficiency requirements of SoCs keep increasing, they become difficult to meet using programmable processors alone. To address this issue, we advocate the use of reconfigurable hardware, i.e. hardware structures whose organization may change before or even during execution. Such reconfigurable SoCs offer high performance at a low energy cost, while preserving a high level of flexibility.

The group studies these SoCs from three angles: (i) the invention and design of new reconfigurable platforms, with an emphasis on flexible arithmetic operator design, dynamic reconfiguration management and low-power consumption; (ii) the development of the corresponding design flows (compilation and synthesis tools) to enable their automatic design from high-level specifications; (iii) the interaction between algorithms and architectures, especially for our main application domains (wireless communications, wireless sensor networks and digital security). The team was created in 2008 and is a "reconfiguration" of the former R2D2 research team from Irisa.

Context
The consumer market has shifted towards multicore architectures, since the clock speed of single-core processors could not be increased further due to power consumption and heat dissipation limits [6]. Multicores provide Space, Weight and Power (SWaP) reductions and massive computing capabilities compared with single-core processors, and they can integrate diverse applications on the same platform [1]. However, the reduction of transistor size with technologies at 28nm and below has made multicores more and more sensitive to environmental effects [2], such as ionizing, particle and high-energy electromagnetic radiation, extreme weather conditions, high temperature peaks and electromagnetic interference. Such stimuli disturb the normal functionality of the system and create faults during its operation [3]. Reliability has therefore become an essential aspect of multicore architectures. Several fault-tolerant approaches have been proposed in the literature to improve system reliability. However, no general solution provides the required reliability at low cost for every problem under study. The most promising fault-tolerant method depends on the faults that actually occur during execution, on the application and on the platform of each problem.

Assignment
Assignments: 3-year PhD Thesis

This PhD focuses on fault-tolerant multicore architectures and has two main goals: 1) to gain insight into the impact of faults on multicore architectures, in order to model the impact of single (SEU, SET) and multiple (MBU) errors at different levels of abstraction, and 2) to design and develop a novel method to explore the design space of the most promising fault-tolerant techniques.

Main activities
During the first part of this thesis, we will study the impact of faults on the basic components of a multicore architecture, i.e. the memory, the core and the interconnect. The study will use a shared-memory multicore built from RISC-V cores that are specified at the C level through high-level synthesis and designed in a 28nm technology. To achieve this, we need to develop models that describe the faulty behavior of these components by raising the abstraction of existing fault models from the gate level up to the architecture level.
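As a rough illustration of what an architecture-level fault model can look like, the sketch below flips bits in a modelled memory word. It is only a minimal example written for this posting, not the project's model: the 32-bit word width, the adjacent-bit MBU pattern and all function names are assumptions.

```python
import random

WORD_BITS = 32  # assumed word width, for illustration only


def inject_seu(word, bit):
    """Model a single-event upset: flip exactly one bit of a stored word."""
    return word ^ (1 << bit)


def inject_mbu(word, first_bit, width):
    """Model a multiple-bit upset as `width` adjacent flipped bits (assumed pattern)."""
    mask = ((1 << width) - 1) << first_bit
    return (word ^ mask) & ((1 << WORD_BITS) - 1)


def random_seu(memory, rng):
    """Return a copy of `memory` with one random bit flipped, plus the fault site."""
    addr = rng.randrange(len(memory))
    bit = rng.randrange(WORD_BITS)
    faulty = list(memory)
    faulty[addr] = inject_seu(memory[addr], bit)
    return faulty, addr, bit


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a fault campaign is reproducible
    memory = [0] * 8
    faulty, addr, bit = random_seu(memory, rng)
    print(f"flipped bit {bit} of word {addr}: {faulty[addr]:#010x}")
```

Running an application model against many such corrupted memories is one simple way to observe how a low-level fault shows up at a higher level of abstraction.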

During the second part, we will define the set of relevant fault-tolerant techniques within our domain and organize them into a binary classification scheme. Each class will be characterized with respect to the reliability that it can offer and the overhead that it imposes on the design (performance, area, energy). The possible fault scenarios, based on the abstract models developed during the first part, will be mapped to the corresponding fault-tolerant classes. In the next step, we will define a novel design-space exploration methodology and design the corresponding tools in order to explore the fault-tolerant design options efficiently. The methodology will be based on pruning methods over the binary classification and on optimization strategies. Its result is the set of the most promising fault-tolerant approaches, under given fault scenarios and platform characteristics, that reduce the system cost while providing reliability and real-time guarantees. A RISC-V multicore architecture will be used to evaluate the proposed methodology.
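To give a feel for this kind of exploration, the sketch below prunes candidate techniques by Pareto dominance and then picks the cheapest one that meets a reliability target. Pareto pruning is a generic stand-in chosen for illustration, not the methodology this thesis will develop. The candidate names A to D, all numbers and the unweighted cost sum are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technique:
    name: str
    reliability: float  # higher is better
    performance: float  # overheads relative to an unprotected design, lower is better
    area: float
    energy: float


def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    no_worse = (a.reliability >= b.reliability and a.performance <= b.performance
                and a.area <= b.area and a.energy <= b.energy)
    better = (a.reliability > b.reliability or a.performance < b.performance
              or a.area < b.area or a.energy < b.energy)
    return no_worse and better


def pareto_front(candidates):
    """Prune every candidate that some other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


def cheapest_meeting(candidates, target):
    """Lowest summed overhead among candidates reaching `target`, or None."""
    feasible = [c for c in candidates if c.reliability >= target]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c.performance + c.area + c.energy)


# Hypothetical placeholder data, not measurements of any real technique.
CANDIDATES = [
    Technique("A", 0.90, 1.0, 1.0, 1.0),
    Technique("B", 0.99, 1.1, 1.5, 1.4),
    Technique("C", 0.95, 1.2, 1.6, 1.5),  # dominated by B
    Technique("D", 0.999, 1.3, 3.0, 2.8),
]

if __name__ == "__main__":
    front = pareto_front(CANDIDATES)
    print([t.name for t in front])
    print(cheapest_meeting(front, target=0.98))
```

A real exploration would also have to account for fault scenarios and real-time constraints, which this sketch leaves out.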

This thesis is funded by a project involving INRIA, ONERA, and Temento Systems.

References

Skills
The student is expected to develop techniques for design space exploration of computer architectures and fault tolerance. We also expect to have prototype implementations of the developed techniques on FPGA and ASIC. The designs will primarily be done through High-Level Synthesis tools.

Desired skills include:
- Computer architecture, hardware design, VLSI circuit design.
- Basic knowledge of compilers and fault tolerance.
- Familiarity with the C/C++ language or other languages.
- Familiarity with FPGA design and/or High-Level Synthesis.

Most importantly, we seek highly motivated and active students.

Benefits package
- Subsidised catering service
- Partially-reimbursed public transport
- Social security
- Paid leave
- Flexible working hours
- Sports facilities

Remuneration
Monthly gross salary amounting to 1982 euros for the first and second years and 2085 euros for the third year

General Information
- Theme/Domain: Architecture, Languages and Compilation
- IT Infrastructure (BAP E)
- Town/City: Rennes
- Inria Center: CRI Rennes - Bretagne Atlantique
- Starting date: 2018-04-01
- Duration of contract: 3 years
- Deadline to apply: 2018-06-30

Contacts
- Inria Team: CAIRN
- Recruiter: Sentieys Olivier / olivier.sentieys@irisa.fr

About Inria
Inria, the French National Institute for computer science and applied mathematics, promotes “scientific excellence for technology transfer and society”. Graduates from the world’s top universities, Inria’s 2,700 employees rise to the challenges of digital sciences. With its open, agile model, Inria is able to explore original approaches with its partners in industry and academia and provide an efficient response to the multidisciplinary and application challenges of the digital transformation. Inria is the source of many innovations that add value and create jobs.

Conditions for application
Please submit online: your resume, cover letter and letters of recommendation.

For further information, please contact Olivier Sentieys (olivier.sentieys@irisa.fr)

Defence Security:
This position is likely to be situated in a restricted area (ZRR), as defined in Decree No. 2011-1425 relating to the protection of national scientific and technical potential (PPST). Authorisation to enter an area is granted by the director of the unit, following a favourable Ministerial decision, as defined in the decree of 3 July 2012 relating to the PPST. An unfavourable Ministerial decision in respect of a position situated in a ZRR would result in the cancellation of the appointment.

Recruitment Policy:
As part of its diversity policy, all Inria positions are accessible to people with disabilities.

Warning: you must enter your e-mail address in order to save your application to Inria. Applications must be submitted online on the Inria website. Processing of applications sent from other channels is not guaranteed.