
Job offers (2017)

PhD thesis


Semiparametric Mixture Models


INRIA team Mistis, Grenoble, headed by Florence Forbes (florence [dot] forbes [at] inria [dot] fr). Supervisor: Gildas Mazo (gildas [dot] mazo [at] inria [dot] fr). Co-supervisors: Florence Forbes and Stéphane Girard. Starting date: Fall 2017. Salary: typically 1350--1650 euros per month net (depending on available funding); it can be increased through teaching.


A mixture model is a weighted sum of probability density functions. It can represent a population made up of different subgroups, or be used to cluster data points. Applications are numerous and include high-dimensional clustering of vibrational spectroscopy data in chemometrics and detection of tumors in brain images. Many mixture models exist in the literature, but very few of them are robust across a wide range of applications; that is, very few perform well on many data sets. For instance, some data sets contain extreme values while others do not, so a model that performs well on one may perform poorly on another. Some models, such as semiparametric models, can adapt automatically to the targeted application. However, semiparametric models need a conditional independence assumption, which is too restrictive in practice and prevents them from being used in most applications.

The goal of the PhD is to build semiparametric mixture models that do not rely on this conditional independence assumption. One possible approach is to use copulas, a tool that makes it possible to incorporate a given dependence structure into a model. After a first step devoted to constructing the models, a second step would address inference, probably with EM-like algorithms, whose properties would have to be investigated. A final step would involve implementation and application to real data in order to demonstrate the usefulness of the methods.
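As an illustration of the two basic notions mentioned above (a mixture density as a weighted sum of component densities, and an EM-type fitting procedure), here is a minimal Python sketch. It assumes univariate Gaussian components, which is a purely parametric simplification and not the semiparametric, copula-based models targeted by the thesis. All function names and the synthetic data are hypothetical and chosen only for this example.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, stds):
    """Mixture density: weighted sum of the component densities."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds))


def em_step(data, weights, means, stds):
    """One EM iteration for a univariate Gaussian mixture (illustrative only)."""
    k = len(weights)
    # E-step: posterior probability that each point belongs to each component.
    resp = []
    for x in data:
        joint = [weights[j] * normal_pdf(x, means[j], stds[j]) for j in range(k)]
        total = sum(joint)
        resp.append([p / total for p in joint])
    # M-step: re-estimate weights, means and standard deviations.
    new_w, new_m, new_s = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean_j = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var_j = sum(r[j] * (x - mean_j) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mean_j)
        # Guard against a degenerate component with zero variance.
        new_s.append(math.sqrt(max(var_j, 1e-12)))
    return new_w, new_m, new_s


def fit_mixture(data, weights, means, stds, n_iter=100):
    """Run a fixed number of EM iterations from the given starting values."""
    for _ in range(n_iter):
        weights, means, stds = em_step(data, weights, means, stds)
    return weights, means, stds


# Synthetic data (an assumption for this example): two subgroups of equal size.
rng = random.Random(0)
data = [rng.gauss(-2.0, 1.0) for _ in range(300)]
data += [rng.gauss(3.0, 1.0) for _ in range(300)]

weights, means, stds = fit_mixture(data, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
print("weights:", weights)
print("means:", means)
print("stds:", stds)
```

In this sketch each point is treated as one-dimensional, so no dependence structure between variables has to be modelled. In a multivariate setting, the component densities would also have to describe how the variables depend on each other within a cluster, which is where the conditional independence assumption or a copula would enter.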


We are looking for candidates who are strongly motivated by challenging research topics. Applicants must have an excellent background in statistics and good programming skills. Knowledge of the following methods will be helpful: nonparametric density estimation, EM algorithms, copulas, mixture models. Good skills in at least one of R or Python are required.


- McLachlan, G. and Peel, D. Finite mixture models. Wiley, 2004.
- Kosmidis, I. and Karlis, D. Model-based clustering using copulas with applications. Statistics and Computing, 1--21, 2015.
- Mazo, G. A semiparametric and location-shift copula-based mixture model. Journal of Classification, accepted for publication.
- Banfield, J. D. and Raftery, A. E. Model-based Gaussian and non-Gaussian clustering. Biometrics, 803--821, 1993.
- Forbes, F. and Wraith, D. A new family of multivariate heavy-tailed distributions with variable marginal amounts of tailweight: application to robust clustering. Statistics and Computing, 24(6):971--984, 2014.
- Lee, S. and McLachlan, G. J. Finite mixtures of multivariate skew t-distributions: some recent and new results. Statistics and Computing, 24(2):181--202, 2014.
- Jacques, J., Bouveyron, C., Girard, S., Devos, O., Duponchel, L. and Ruckebusch, C. Gaussian mixture models for the classification of high-dimensional vibrational spectroscopy data. Journal of Chemometrics, 24(11-12):719--727, 2010.
- Marbac, M., Biernacki, C. and Vandewalle, V. Model-based clustering of Gaussian copulas for mixed data. arXiv preprint arXiv:1405.1299, 2014.


Keywords: semi-parametric; non-parametric; mixture model; copula; EM algorithm; clustering

Post-doc and research engineer proposals

- Python software developer position in functional neuroimaging. [proposal in English (pdf)] [in French (pdf)].

- 3 positions in statistical learning: post-doc and engineer. [proposal in English (pdf)] [in French (pdf)].