This PhD thesis will be supervised by Claudia-Lavinia Ignat, researcher at the Inria center of Lorraine University, and co-supervised by Léo Joubert, assistant professor at Université de Rouen Normandie.


Large-scale collaborative systems, where a large number of users collaborate to carry out a shared task, are attracting much attention from industry and academia. CSCW studies [1,2] showed that awareness of the behavior of other team members is an important way to compensate for the lack of direct communication. By allowing each member to be aware of what other members are doing, trust can be built in the team [3]. Trust is defined as an individual’s willingness to become vulnerable to the actions of others with the expectation that others will follow through on their commitments [4]. Trust is even more crucial in open collaborative systems such as Wikipedia, in which members usually do not know each other personally. However, it is difficult for end users to manually assess the level of trust in each partner, that is, the credibility value that a user can attribute to another user based on their past interactions. This thesis aims to study the problem of trust evaluation and seeks to design a computational trust model dedicated to collaborative systems.

We are particularly interested in the case of Wikipedia, a collaborative online encyclopedia, because it provides us with a huge database produced by a large number of contributors.

On the platform, users can submit revisions of articles to improve their content. The objective of Wikipedia is to ensure the quality and neutrality of the platform's documents.

We have already studied how the collaborative interaction of one user affects the trust assessed by the other users in the trust game [5] and in contract-based multi-synchronous collaboration [6]. In the trust game [7], the interaction consisted of a money transaction between two users, while in contract-based multi-synchronous collaboration the computation of trust was based on the adherence to or violation of contracts shared between two users. In the context of the trust game we also showed (i) that presenting a trust score to users encourages collaboration between them in a meaningful way, at a level similar to displaying participants' nicknames; and (ii) that users conform to the trust score in their decision-making regarding monetary exchange [8]. The results therefore suggest that a trust model can be deployed in collaborative systems in order to assist users. However, in Wikipedia, users do not interact directly, but through the article to which they contribute. It is therefore difficult to determine how one user’s edits might influence another user’s edits.

The scientific literature usually assesses the quality of a contribution by its lifetime on a page: the longer the contributed content remains present, the higher its quality is assumed to be. The problem with this measure is that it excludes from the quality judgment both the mutual trust that contributors may have in each other, and the fact that the Wikipedia rules justifying the deletion of contributions may apply differently from one page to another.

To address this issue, we want to calculate a Wikipedia user's trust level from their past contributions, so that this trust level can predict the quality of the user's future contributions. The trust metric proposed in [5, 6], which predicts the behavior of users from their past interactions and takes into account fluctuations in user behavior, could be applied by treating user contributions to revisions of Wikipedia articles as the interactions between users. The main challenge is to define the quality of a user's contributions. For this we plan to study existing metrics based on the length of contributions (for example the number of characters added) and on the longevity of contributions (edit longevity, for example how long a contribution persists in the article).
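As an illustration of the two families of metrics mentioned above, the following minimal sketch computes a contribution length and a coarse edit longevity. It is only one possible reading, not an existing metric from the literature: the function names `added_tokens`, `contribution_length` and `edit_longevity` are hypothetical, revisions are assumed to be plain strings, and contributions are compared as multisets of whitespace-separated tokens, which ignores token positions.

```python
from collections import Counter


def added_tokens(parent: str, revision: str) -> Counter:
    """Multiset of tokens that occur more often in `revision` than in `parent`."""
    return Counter(revision.split()) - Counter(parent.split())


def contribution_length(parent: str, revision: str) -> int:
    """Length-based metric (assumed form): characters in the added tokens."""
    added = added_tokens(parent, revision)
    return sum(len(token) * count for token, count in added.items())


def edit_longevity(parent: str, revision: str, later_revisions: list[str]) -> float:
    """Longevity-based metric (assumed form): mean fraction of the added
    tokens that are still present in each later revision."""
    added = added_tokens(parent, revision)
    total = sum(added.values())
    if total == 0 or not later_revisions:
        return 0.0
    fractions = []
    for text in later_revisions:
        # Multiset intersection: added tokens still found in the later text.
        surviving = added & Counter(text.split())
        fractions.append(sum(surviving.values()) / total)
    return sum(fractions) / len(fractions)


if __name__ == "__main__":
    parent = "the cat sat"
    revision = "the black cat sat quietly"
    later = ["the black cat sat", "the cat sat down"]
    print(contribution_length(parent, revision))
    print(edit_longevity(parent, revision, later))
```

A real measurement would work on diffs that track token positions and would weight survival by elapsed time or number of subsequent revisions; the bag-of-tokens comparison is used here only to keep the sketch short.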

Our approach relies on the use of a distance (for example the Levenshtein distance) between the different versions of the document. We would like to compute a measure of longevity based on a semantic distance, using BERT [9, 11] and SMART [10] models, and compare it with existing measures. Wikipedia provides a dataset containing articles that have been manually assessed for quality by experts [12][13]. We therefore wish to validate our algorithms for measuring the quality of user contributions on this data.
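For the syntactic case, the distance between two versions can be sketched as follows. This is a minimal illustration, assuming that revisions are available as plain strings and that the distance is computed at character level; the normalization by the longer length is an assumption made here so that values are comparable across revisions of different sizes, and the function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level Levenshtein distance using two rows of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Distance scaled to [0, 1] by the length of the longer version."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


if __name__ == "__main__":
    old = "Wikipedia is a free encyclopedia."
    new = "Wikipedia is a free online encyclopedia."
    print(levenshtein(old, new), normalized_distance(old, new))
```

The quadratic cost of this computation is likely to matter for long articles, so a practical pipeline would probably compare revisions at token or sentence level. A semantic variant would replace `levenshtein` with a distance between sentence embeddings, which is not shown here.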

In addition to analysing the quality of user edits, we plan to analyse user interactions on talk pages, which will provide an additional measure of the trust between users.

One of the gaps this project will fill is to take into account the legitimacy of Wikipedia rules when measuring the level of trust that contributors place in a peer, in the context of a page. Indeed, Wikipedia rules are widely used by contributors to settle disagreements on the collaborative writing of a page [14]. Moreover, the legitimacy of the rules influences whether individual trajectories are deployed [15]. To take this into account when specifying the trust game, we might introduce parameters linked to the global state of the wiki and of a page.

We also aim to quantify the number of edits needed to profile a contributor. Whereas common statistical wisdom recommends a large number of contributions to stabilize a profile, some research has shown that the early edits of contributors can already be meaningful for establishing their profiles [16,17].


Main activities

  • Study the existing trust metrics in collaborative systems
  • Study existing work on article quality in Wikipedia and on the legitimacy of rules
  • Propose a metric for the quality of user contributions based on the length and longevity of contributions (using both syntactic and semantic distances)
  • Adapt the trust metric proposed in [5] to Wikipedia, treating contributions to article revisions as the user interactions of the trust game
  • Perform measurements using the Wikipedia dataset


  • Engineering and/or Master 2 degree in Computer science / Applied mathematics / Cognitive science
  • Theoretical expertise: collaborative systems 

  • Good collaborative and networking skills, excellent written and oral communication in English
  • Good programming skills
  • Strong analytical skills

