2018-00604 - PhD Position: Semantic segmentation with minimal supervision

Contract type : Public service fixed-term contract

Level of qualifications required : Graduate degree or equivalent

Function : PhD Position

Level of experience : Up to 3 years

About the research centre or Inria department

The Inria Rennes - Bretagne Atlantique Centre is one of Inria's eight centres and has more than thirty research teams. The Inria Centre is a major and recognized player in the field of digital sciences. It is at the heart of a rich R&D and innovation ecosystem: highly innovative SMEs, large industrial groups, competitiveness clusters, research and higher education players, laboratories of excellence, technological research institute, etc.

Team presentation : https://www.inria.fr/en/teams/linkmedia

The challenge that multimedia faces today is that of context awareness, i.e., describing documents in the context in which they appear (context of a collection, social context, etc.). Following this line of thought, the seminal idea of LinkMedia is that of content-based media linking, with the ultimate goal of enabling better multimedia applications and new innovative services. Taking a content-based perspective, we seek to create explicit links at different levels to better reflect the context: links at the signal level, e.g., with repeating patterns; links at a semantic level, e.g., to follow topics or stories; links at a paradigmatic level, e.g., to have further details or comments on a topic. LinkMedia investigates a number of key issues related to multimedia collections with explicit links: Can we discover what characterizes a collection and makes it coherent? Are there repeating motifs that create natural links and that deserve characterization and semantic interpretation? How can links be created explicitly from pairwise distances? What structure should a linked collection have? How do we explain the semantics of a link? How can explicit links be used to improve information retrieval? To improve user experience? Addressing such questions, our goal is to lay down scientific foundations for collection structuring by means of explicit links and to study new usages and content processing techniques induced by structured context-aware collections.


The PhD will be supervised by Dr Miaojing Shi and Dr Yannis Avrithis. Work will be carried out within Inria team LinkMedia. The team specializes in multimedia content processing for analytics, gathering specialists from different fields: natural language processing, image processing and computer vision, data mining, databases.


The goal of this PhD is to study semantic segmentation in images or video with minimal supervision. The task will be placed in a setting where only image-level annotation is provided [KL16]. Initially, additional supervision such as clicks [BRF16], strokes [VC17], or bounding boxes [RPK17] may also be assumed. Towards the end of the PhD, the student is expected to work with datasets of mixed levels of supervision, including a harder, semi-supervised setting where there are only a few image-level labels as well as a large number of unlabeled images.

Several ideas can be investigated in the context of deep learning. For instance, generative adversarial learning can be employed either to augment the dataset [SSS17] or to bridge the gap between predicted segmentations and their ground truth [LCC16]. Recurrent neural networks (RNNs) can be applied to video segmentation, in particular to localize and segment semantic parts across nearby frames [TAS17]. On unstructured image datasets, ideas like deep metric learning [FWR17] and random-walk label propagation [VC17] can be extended across pairs or groups of images. Cross-category transfer learning [XWL18] can be a further extension.



Keywords : semantic segmentation, minimal supervision, deep architectures, adversarial learning, recurrent networks, metric learning



[BRF16] A. Bearman, O. Russakovsky, V. Ferrari and F.-F. Li. What's the Point: Semantic Segmentation with Point Supervision. ECCV 2016.

[FWR17] A. Fathi, Z. Wojna, V. Rathod, P. Wang, H. Song, S. Guadarrama and K.P. Murphy. Semantic Instance Segmentation via Deep Metric Learning. arXiv 2017.

[KL16] A. Kolesnikov and C. H. Lampert. Seed, Expand and Constrain: Three Principles for Weakly-Supervised Image Segmentation. ECCV 2016.

[LCC16] P. Luc, C. Couprie, S. Chintala and J. Verbeek. Semantic Segmentation using Adversarial Networks. NIPS Workshop on Adversarial Training 2016.

[RPK17] R. Hu, P. Dollar, K. He, T. Darrell and R. Girshick. Learning to Segment Every Thing. arXiv 2017.

[SSS17] N. Souly, C. Spampinato and M. Shah. Semi and Weakly Supervised Semantic Segmentation Using Generative Adversarial Network. arXiv 2017.

[TAS17] P. Tokmakov, K. Alahari and C. Schmid. Learning Video Object Segmentation with Visual Memory. ICCV 2017.

[VC17] P. Vernaza and M. Chandraker. Learning Random-Walk Label Propagation for Weakly-Supervised Semantic Segmentation. CVPR 2017.

[XWL18] H. Xiao, Y. Wei, Y. Liu, M. Zhang and J. Feng. Transferable Semi-supervised Semantic Segmentation. AAAI 2018.

Main activities

Not applicable.


The candidate should ideally have a degree in Computer Science, Applied Mathematics or Electrical Engineering; a solid mathematical background and programming skills; fluency in English; and, preferably, prior experience in computer vision, machine learning or data mining.

Benefits package

  • Subsidised catering service
  • Partially-reimbursed public transport
  • Social security
  • Sports facilities


Gross salary : 2653 euros