Master Internship - Revisiting PCA with norm-ratio sparsity penalties
Contract type: Internship
Required level of education: Bac + 4 or equivalent
Function: Research intern
Desired level of experience: Recent graduate
About the centre or functional department
The Inria Saclay-Île-de-France Research Centre was established in 2008. It has developed as part of the Saclay site in partnership with Paris-Saclay University and with the Institut Polytechnique de Paris.
The centre has 40 project teams, 32 of which operate jointly with Paris-Saclay University and the Institut Polytechnique de Paris. Its activities involve over 600 people (scientists and research and innovation support staff) of 44 different nationalities.
Context and assets of the position
In the context of the ERC MAJORIS project, and in collaboration with IFPEN, the aim of this internship is to investigate the problem of sparse principal component analysis (PCA) with norm-ratio sparsifying penalties.
Subject:
Principal component analysis (PCA) is a workhorse of linear dimensionality reduction [Jol02]. It is widely applied in exploratory data analysis, visualization, and data preprocessing.
Principal components are usually linear combinations of all input variables. For high-dimensional data, this may involve input variables that contribute very little to the interpretation. It is therefore desirable to find the few directions in space that best explain the observations. Sparse PCA overcomes this drawback by adding sparsity constraints, so that each linear combination involves only a few input variables [CR24,ZX18]. One such formulation relies, as in the lasso, on an absolute-norm (l1) penalty/regularization. In [MBPS10], the matrix factorization problem is posed as:
minimize_{\alpha} || X - D \alpha ||^2_F + \lambda|| \alpha ||_{1,1}
where X = [x_1,...,x_n] is the matrix of data vectors, D is a square matrix from a suitable basis set, ||.||_F denotes the Frobenius norm, ||.||_{1,1} denotes the sum of the magnitudes of the matrix entries, and \lambda is a positive penalty weight.
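As an illustration, the l1-penalized problem above can be minimized with a standard proximal gradient (ISTA) scheme. The following is a minimal sketch, not the online algorithm of [MBPS10]: the function names, the random orthonormal D, the synthetic data, and the value of the penalty weight are all assumptions chosen for the example.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_{1,1}: entrywise shrinkage toward zero."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def objective(X, D, alpha, lam):
    """Value of ||X - D alpha||_F^2 + lam * ||alpha||_{1,1}."""
    return np.linalg.norm(X - D @ alpha, "fro") ** 2 + lam * np.abs(alpha).sum()

def ista(X, D, lam, n_iter=200):
    """Proximal gradient (ISTA) iterations for the l1-penalized problem."""
    # The gradient of ||X - D alpha||_F^2 is Lipschitz with constant 2 * ||D||_2^2.
    step = 1.0 / (2.0 * np.linalg.norm(D, 2) ** 2)
    alpha = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ alpha - X)
        alpha = soft_threshold(alpha - step * grad, step * lam)
    return alpha

# Synthetic example (assumption): orthonormal D and codes with two active rows.
rng = np.random.default_rng(0)
D = np.linalg.qr(rng.standard_normal((8, 8)))[0]
alpha_true = np.zeros((8, 5))
alpha_true[[1, 4], :] = rng.standard_normal((2, 5))
X = D @ alpha_true + 0.01 * rng.standard_normal((8, 5))

lam = 0.1
alpha_hat = ista(X, D, lam)
print("objective at zero:     ", objective(X, D, np.zeros_like(alpha_hat), lam))
print("objective at alpha_hat:", objective(X, D, alpha_hat, lam))
```

When D is orthonormal, as in this toy setting, the minimizer has the closed form soft_threshold(D.T @ X, lam / 2), which gives a simple check of the iterations. For a general D, the same loop applies but needs several iterations to converge.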
A penalty such as ||.||_{1,1} is 1-homogeneous: multiplying the matrix by a positive factor multiplies the penalty by the same factor. It therefore only weakly emulates the count of non-zero entries of a matrix, which is scale-invariant, i.e. 0-homogeneous.
Recently, the SOOT/SPOQ family of penalties has been developed in our research group as smooth emulations of the scale-invariant lp/lq norm ratios. These ratios have long been used as stopping criteria, penalties, or "continuous" sparsity count estimators [HR09]. They have been used successfully for the restoration, deconvolution, and source separation of sparse signals [CCDP20,RPD+15].
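The difference between a 1-homogeneous penalty and a scale-invariant one can be checked numerically. The sketch below uses the plain (non-smoothed) l1/l2 ratio on hypothetical toy vectors. It is not the SOOT or SPOQ penalty itself, which are smoothed variants, and the helper names are assumptions made for the example.

```python
import math

def l1_norm(x):
    """Sum of absolute values (1-homogeneous)."""
    return sum(abs(v) for v in x)

def l2_norm(x):
    """Euclidean norm (1-homogeneous)."""
    return math.sqrt(sum(v * v for v in x))

def l0_count(x):
    """Number of non-zero entries (scale-invariant)."""
    return sum(1 for v in x if v != 0)

def l1_over_l2(x):
    """Plain l1/l2 norm ratio (scale-invariant), defined for non-zero x."""
    return l1_norm(x) / l2_norm(x)

x = [0.0, 3.0, 0.0, -4.0, 0.0]       # toy sparse vector (assumption)
scaled = [10.0 * v for v in x]       # same support, ten times larger
dense = [1.0, 1.0, 1.0, 1.0, 1.0]    # no zero entry
spike = [0.0, 0.0, 2.0, 0.0, 0.0]    # a single non-zero entry

# The l1 norm grows with the scale, the count and the ratio do not.
print(l1_norm(x), l1_norm(scaled))
print(l0_count(x), l0_count(scaled))
print(l1_over_l2(x), l1_over_l2(scaled))

# The ratio ranges from 1 (one non-zero entry) to sqrt(n) (equal magnitudes).
print(l1_over_l2(spike), l1_over_l2(dense), math.sqrt(len(dense)))
```

A smaller ratio thus indicates a sparser vector regardless of its amplitude, which is the behaviour the l1 norm alone cannot provide.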
The goal of the internship is to investigate the resolution of sparse PCA models in which the standard l1 norm is replaced by such norm ratios. The work will include a convergence analysis of the proposed optimization algorithm, its implementation, and its validation on public benchmarks.
[CCDP20] Afef Cherni, Emilie Chouzenoux, Laurent Duval, and Jean-Christophe Pesquet. SPOQ ℓp-over-ℓq regularization for sparse signal recovery applied to mass spectrometry. IEEE Trans. Signal Process., 68:6070–6084, 2020.
[CR24] Fan Chen and Karl Rohe. A new basis for sparse principal component analysis. J. Comp. Graph. Stat., 33(2):421–434, 2024.
[HR09] N. Hurley and S. Rickard. Comparing measures of sparsity. IEEE Trans. Inform. Theory, 55(10):4723–4741, Oct. 2009.
[Jol02] I. T. Jolliffe. Principal component analysis. Springer Series in Statistics, 2nd edition, 2002.
[MBPS10] Julien Mairal, Francis Bach, Jean Ponce, and Guillermo Sapiro. Online learning for matrix factorization and sparse coding. J. Mach. Learn. Res., 11:19–60, 2010.
[RPD+15] A. Repetti, M. Q. Pham, L. Duval, E. Chouzenoux, and J.-C. Pesquet. Euclid in a taxicab: Sparse blind deconvolution with smoothed ℓ1/ℓ2 regularization. IEEE Signal Process. Lett., 22(5):539–543, May 2015.
[ZCD23] Paul Zheng, Emilie Chouzenoux, and Laurent Duval. PENDANTSS: PEnalized Norm-ratios Disentangling Additive Noise, Trend and Sparse Spikes. IEEE Signal Process. Lett., 30:215–219, 2023.
[ZX18] Hui Zou and Lingzhou Xue. A selective overview of sparse principal component analysis. Proc. IEEE, 106(8):1311–1320, August 2018.
Assigned mission
Missions: The goal of this internship is to:
• investigate potential derivations using SOOT/SPOQ penalties,
• implement the algorithmic workflow in a scientific toolkit (e.g., scikit-learn),
• benchmark it against competing methods.
Environment: The intern will be supervised by Emilie Chouzenoux (Head of OPIS team, Inria Saclay) and Laurent Duval (Research Engineer, IFPEN, Rueil Malmaison). The intern will join the Inria Saclay team OPIS (https://opis-inria.eu/). He/she will be located in the Centre de la Vision Numérique, on the CentraleSupélec campus, Saclay, France. He/she will enjoy an international and creative environment where research seminars and reading groups take place very often. Computer equipment expenses will be covered within the limits of the scale in force.
Organization: The offer is intended for Master 1, Master 2, or Engineering students. The start and end dates are flexible, with a minimum duration of 4 months.
Main activities
Bibliographical study
Programming in Python environment
Benchmark on public datasets
Scientific meetings
Writing of scientific reports
Skills
Languages: The candidate must be fluent in English and/or French.
Benefits
- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking and flexible organization of working hours
- Professional equipment available (videoconferencing, loan of computer equipment, etc.)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage
Remuneration
Internship stipend
General information
- Theme/Domain:
Optimization, machine learning and statistical methods
Statistics (Big data) (BAP E)
- City: Gif sur Yvette
- Inria centre: Centre Inria de Saclay
- Desired start date: 2025-04-01
- Contract duration: 5 months
- Application deadline: 2025-03-31
Warning: Applications must be submitted online via the Inria website. The processing of applications sent through other channels is not guaranteed.
Instructions for applying
Defence security:
This position is likely to be assigned to a restricted-access area (zone à régime restrictif, ZRR), as defined in Decree No. 2011-1425 on the protection of the nation's scientific and technical potential (PPST). Authorisation to access such an area is granted by the head of the establishment, following a favourable ministerial opinion, as defined in the order of 3 July 2012 relating to the PPST. An unfavourable ministerial opinion for a position assigned to a ZRR would result in the cancellation of the recruitment.
Recruitment policy:
As part of its diversity policy, all Inria positions are accessible to people with disabilities.
Contacts
- Inria team: OPIS
- Recruiter:
Chouzenoux Emilie / emilie.chouzenoux@inria.fr
Essentials for success
We are looking for a talented candidate in Master 1, Master 2, or Engineering studies, with a solid background in optimization and signal processing, and a strong motivation for research and innovation. Experience in Python is required.
Candidates are requested to submit a CV and a cover letter to apply for this position.
About Inria
Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 215 agile project teams, generally run jointly with academic partners, involve more than 3,900 scientists in meeting the challenges of digital technology, often at the interface with other disciplines. The institute draws on a wide range of talent in more than forty different professions. 900 research and innovation support staff help scientific and entrepreneurial projects with worldwide impact to emerge and grow. Inria works with many companies and has supported the creation of more than 200 start-ups. The institute thus strives to meet the challenges of the digital transformation of science, society and the economy.