PhD Position F/M Misinformation trajectories: detecting and tracing disinformation across heterogeneous data sources

The job description below is in English

Contract type: Fixed-term contract (CDD)

Required degree level: Master's degree (Bac + 5) or equivalent

Role: PhD student

About the centre or functional department

The Inria Saclay-Île-de-France Research Centre was established in 2008. It has developed as part of the Saclay site in partnership with Paris-Saclay University and with the Institut Polytechnique de Paris since 2021.

The centre has 39 project teams, 27 of which operate jointly with Paris-Saclay University and the Institut Polytechnique de Paris. Its activities involve over 600 scientists and research and innovation support staff, representing 54 nationalities.

Context and advantages of the position

As information is increasingly consumed and disseminated online, it becomes important to verify the factual information available. In particular, public figures, such as politicians and elected officials, make public statements that have far-reaching implications. Fact-checking is therefore critical for containing fake news and misinformation.

Adding to the inherent difficulty of fact-checking, false claims often resurface after some time.

Reports from fact-checkers confirm that a lot of effort is wasted re-reviewing previously fact-checked claims. Often, linguistic transformations such as paraphrasing or introducing sarcasm are used to reformulate the same claim after some time. In this work, the student will identify, model, and detect these transformations from both linguistic and computational perspectives. We intend to develop a framework that uses NLP techniques to identify claims that resurface repeatedly and, further, to model and compute the provenance of claims, capturing the transformations each claim has gone through. To work on this topic, we seek PhD candidates with strong backgrounds in computer science, NLP, and data management.

The PhD student will be part of the vibrant CEDAR team and will conduct her research in the team's highly collaborative research environment; she will be supervised by Prof. Ioana Manolescu (senior Inria researcher, part-time professor at Ecole Polytechnique), Prof. Oana Balalau (Inria researcher and part-time assistant professor at Ecole Polytechnique), and Garima Gaur (postdoc in CEDAR).

Assignment

The student will start the project by getting familiar with the tools and techniques used for data acquisition and data cleaning, mainly in the context of fact-checks and social media posts. In parallel, the student is expected to get familiar with the core concepts of natural language processing, data provenance, and data modeling. In addition to these basics, the student will conduct a literature review of existing work on claim retrieval and claim provenance modeling. Eventually, the student will work towards designing and developing a complete approach for streaming content analysis which, on the one hand, given an input claim, identifies past fact-checks (FCs, for short) and, on the other hand, tracks the evolution and origin of claims across media content of many forms (short social media messages such as tweets, longer ones such as Facebook posts, blogs, news-propagating websites, etc.).

For more details about the proposed research topic, candidates can refer to this document. Feel free to write to us with any queries or for further clarification about the work.

Main activities

The core activities of the PhD student encompass:

  • Reviewing literature — reading and discussing papers
  • Devising, implementing, and evaluating solutions proposed in a collaborative research environment
  • Writing and presenting work in scientific venues.

Skills

Technical skills:

  • At least a basic understanding of NLP and data modeling concepts
  • Coding experience with ML libraries (preferably in Python)

Other skills:

  • Experience with reading and comprehending technical papers.
  • Proficiency in spoken and written English; similar proficiency in French is a plus.

 

Benefits

  • Subsidized meals
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training

Remuneration

1st and 2nd year: 2100€ gross/month

3rd year: 2158€ gross/month