2022-05221 - PhD Position F/M Management of mutable data over P2P storage

Contract type: Fixed-term contract

Level of qualifications required: Graduate degree or equivalent

Function: PhD Position

Context and advantages of the position

This PhD thesis takes place within a collaboration between Hive and the Inria teams Coast and Myriads. The PhD student will be based at Inria Nancy - Grand Est and will visit the Myriads team at the Inria Center of the University of Rennes and the Hive offices in Cannes.


About Hive:

Hive intends to play the role of a next-generation cloud provider in the context of Web 3.0. Hive aims to exploit the unused capacity of computers to offer the general public a greener and more sovereign alternative to existing clouds, one where the true power lies in the hands of the users. It relies on distributed peer-to-peer networks, end-to-end encryption of data, and blockchain technology.

About Inria Nancy - Grand Est:

The Inria Nancy - Grand Est center is one of Inria's eight centers and has twenty project teams, located in Nancy, Strasbourg and Saarbrücken. It employs more than 400 people, both scientists and research and innovation support staff, of 45 different nationalities. The Inria Center is a major and recognized player in the field of digital sciences. It is at the heart of a rich R&D and innovation ecosystem: highly innovative SMEs, large industrial groups, competitiveness clusters, research and higher education players, laboratories of excellence, technological research institutes, etc.

About Inria Center of the University of Rennes:

The Inria Center of the University of Rennes is one of Inria's eight centers and has more than thirty research teams. The Inria Center is a major and recognized player in the field of digital sciences. It is at the heart of a rich R&D and innovation ecosystem: highly innovative SMEs, large industrial groups, competitiveness clusters, research and higher education players, laboratories of excellence, technological research institutes, etc.

 

Assignment

For availability and performance reasons, data is replicated. Several users must be able to concurrently update replicas of the same data without losing their modifications. Hive's solution relies on IPFS [1] (https://ipfs.io/), and mutable data is supported through the mutable file system API of IPFS. However, there is no support for merging concurrent changes.

Data replication also raises the question of replica placement. Depending on the nature and use of the data, placement is a critical issue. The performance of many data-driven applications improves when data is local. Data locality can be approached in two ways: either taking the placement of data as an input and placing tasks accordingly, or placing data so as to improve the future jobs that will use it. Replica placement also matters for data consistency. In highly distributed environments, maintaining consistency between replicas has a cost that depends on the distance between them.
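As a rough illustration of how replica distance can drive consistency cost, the sketch below picks, among all candidate placements of k replicas, the one with the lowest total pairwise latency. The cost model (sum of pairwise latencies), the node names and the latency values are assumptions made up for this example; they are not Hive's model, and exhaustive search is only practical for small node sets.

```python
from itertools import combinations


def sync_cost(placement, latency):
    """Assumed cost model: sum of pairwise latencies between replicas."""
    return sum(latency[frozenset(pair)] for pair in combinations(placement, 2))


def best_placement(nodes, k, latency):
    """Exhaustively search all k-node placements and keep the cheapest one."""
    return min(combinations(nodes, k), key=lambda p: sync_cost(p, latency))


# Hypothetical nodes and symmetric latencies (arbitrary units).
nodes = ["n1", "n2", "n3", "n4"]
latency = {
    frozenset({"n1", "n2"}): 10,
    frozenset({"n1", "n3"}): 40,
    frozenset({"n1", "n4"}): 25,
    frozenset({"n2", "n3"}): 35,
    frozenset({"n2", "n4"}): 20,
    frozenset({"n3", "n4"}): 15,
}

placement = best_placement(nodes, 3, latency)
cost = sync_cost(placement, latency)
```

A realistic placement algorithm would also have to account for data locality with respect to the tasks that use the data, which this sketch ignores.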

In this Ph.D. thesis, we plan to propose a replication mechanism over sharded encrypted data that merges concurrent changes and optimizes the cost of this merging through suitable replica placement. We propose using CRDTs (Conflict-free Replicated Data Types) [2, 3] as the replication mechanism, as they are suitable for end-to-end encryption in a peer-to-peer environment where data is decrypted only on the receiver side and conflicts can be resolved locally. There is therefore no need to decrypt data in transit, as is the case in centralised architectures where servers require unencrypted data in order to perform merging. The challenge in this approach is to develop CRDTs on sharded data stored on IPFS.
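To illustrate why conflicts can be resolved locally at the receiver, here is a minimal sketch of one of the simplest state-based CRDTs, a grow-only set, whose merge is a set union. The class name and the replica names are chosen for this example only; encryption and IPFS storage are deliberately left out, and the merge is assumed to run on each peer after it has decrypted the received state.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GSet:
    """Grow-only set: a state-based CRDT whose merge is set union."""
    items: frozenset = field(default_factory=frozenset)

    def add(self, item):
        # Local update: return a new state that also contains the item.
        return GSet(self.items | {item})

    def merge(self, other):
        # Union is commutative, associative and idempotent, so replicas
        # converge whatever the delivery order or duplication.
        return GSet(self.items | other.items)


# Two replicas start from the same state and update it concurrently.
base = GSet().add("a")
alice = base.add("b")
bob = base.add("c")

# Each replica merges the other's state locally; no server is involved.
merged_at_alice = alice.merge(bob)
merged_at_bob = bob.merge(alice)
```

Both replicas end up with the same set containing "a", "b" and "c", and merging the same state a second time changes nothing.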

References:

[1] J. Benet. “IPFS - Content Addressed, Versioned, P2P File System”. In: CoRR abs/1407.3561 (2014). doi: 10.48550/arXiv.1407.3561. arXiv: 1407.3561.

[2] M. Shapiro, N. M. Preguiça, C. Baquero, and M. Zawirski. “Conflict-Free Replicated Data Types”. In: 13th International Symposium on Stabilization, Safety, and Security of Distributed Systems, SSS 2011. Oct. 2011, pp. 386–400. doi: 10.1007/978-3-642-24550-3_29.

[3] L. André, S. Martin, G. Oster, and C.-L. Ignat. “Supporting adaptable granularity of changes for massive-scale collaborative editing”. In: Proceedings of the International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2013). Austin, Texas, USA, Oct. 2013.

 

Main activities

The two main CRDT-based replication mechanisms, operation-based and state-based, each have advantages and disadvantages. On the one hand, operation-based synchronisation is more efficient, since only small updates need to be shipped, but it requires exactly-once, causally ordered broadcast. On the other hand, state-based synchronisation requires only reliable broadcast, but it induces communication overhead by shipping the whole state. The first step will be to study these replication mechanisms and evaluate their suitability in the context of Hive.
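The trade-off can be sketched with two toy counters. The class names are hypothetical and the example only demonstrates the exactly-once requirement (increments commute, so causal ordering does not matter here): a duplicated operation corrupts the operation-based replica, whereas a duplicated state is harmless for the state-based one, at the price of shipping the whole state.

```python
class OpCounter:
    """Operation-based counter: replicas exchange individual increments."""

    def __init__(self):
        self.value = 0

    def increment(self, amount=1):
        self.value += amount
        return ("inc", amount)  # small operation to broadcast

    def apply(self, op):
        kind, amount = op
        if kind == "inc":
            self.value += amount  # not idempotent: duplicates count twice


class StateCounter:
    """State-based grow-only counter: replicas exchange their full state."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # one entry per replica

    def increment(self, amount=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount
        return dict(self.counts)  # whole state to broadcast

    def merge(self, state):
        # Entry-wise maximum is idempotent, commutative and associative.
        for rid, n in state.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    @property
    def value(self):
        return sum(self.counts.values())


# Operation-based: the same operation is delivered twice by mistake.
op_a, op_b = OpCounter(), OpCounter()
op = op_a.increment()
op_b.apply(op)
op_b.apply(op)  # duplicate delivery makes op_b diverge from op_a

# State-based: the same state is delivered twice without harm.
st_a, st_b = StateCounter("a"), StateCounter("b")
state = st_a.increment()
st_b.merge(state)
st_b.merge(state)
```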

The next step will be to study the feasibility of combining replica placement algorithms with CRDT-based replication mechanisms. The combination of replica placement algorithms with CRDT-based consistency maintenance algorithms will have to be evaluated in terms of time and space complexity and network communication costs.

Benefits

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off under the RTT scheme (reduction of working hours, full-time basis) + possibility of exceptional leave (e.g. sick children, moving home)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports benefits (Association de gestion des œuvres sociales d'Inria)
  • Access to vocational training
  • Social security coverage

Remuneration

1982,00€ gross monthly for the first two years (1594,00€ net)

2085,00€ gross monthly in the 3rd year (1677,00€ net)