
Welcome to the SIMERG(2)E webpage

Statistical Inference for the Management of Extreme Risks, Genetics and Global Epidemiology

SIMERGE (Statistical Inference for the Management of Extreme Risks and Global Epidemiology)

SIMERGE is a LIRIMA project-team that started in January 2015. It includes researchers from Mistis (Inria Grenoble - Rhône-Alpes, France), LERSTAD (Laboratoire d'Etudes et de Recherches en Statistiques et Développement, Université Gaston Berger, Sénégal), IRD (Institut de Recherche pour le Développement, Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes, Dakar, Sénégal) and LEM lab (Lille Economie et Management, Université Lille 1, 2, 3, Modal, Inria Lille Nord-Europe, France).

SIMERG2E (Statistical Inference for the Management of Extreme Risks, Genetics and Global Epidemiology)

In January 2018, SIMERGE was extended to SIMERG2E, and the Institut Pasteur de Dakar joined the team. This Associate team keeps the same two research themes as SIMERGE, with some adaptations to new applications:

1. Spatial extremes, application to management of extreme risks

Weather variability, in both space and time, is of prime importance in many hydrological, agricultural and energy contexts. Spatio-temporal modelling of environmental data is therefore well studied in the literature. The basic objectives are: (i) to infer the nature of the spatial variation of extreme precipitation and temperatures from meteorological observations and (ii) to model the pattern of variability of these data components. Different characterizations of multivariate extreme dependence structures have been proposed in the literature (see, for instance, Coles et al. (2000), Ledford and Tawn (1996)). These works were the basis of recent studies characterizing the dependence between extremes of a spatial process; see, for instance, Huser et al. (2017) or Wadsworth et al. (2017).

Once the modelling step is achieved, the inference of the associated risk can be tackled. One of the most popular risk measures is the Value-at-Risk (VaR), introduced in the 1990s. In statistical terms, the VaR at level alpha in (0, 1) corresponds to the upper alpha-quantile of the loss distribution. Even though the VaR was introduced to deal with financial risks, it is also of interest in meteorological applications, where it is interpreted as a return level. The Value-at-Risk however suffers from two main weaknesses. First, it provides only pointwise information: VaR(alpha) does not take into consideration what the loss will be beyond this quantile. Second, a light-tailed and a heavy-tailed loss distribution may have the same Value-at-Risk (Embrechts et al., 1999). Consequently, the definition of new risk measures, the study of their properties in the case of extreme events (i.e. when alpha tends to zero), and their estimation from data are three major statistical challenges (Bellini and Di Bernardino (2017)).
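The two weaknesses of the VaR described above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the team's methodology: the function names `empirical_var` and `mean_loss_beyond_var`, the choice of an exponential (light-tailed) and a Pareto (heavy-tailed) loss model, the level alpha = 0.05 and the sample size are all hypothetical choices. The mean loss beyond the VaR (often called expected shortfall or conditional tail expectation) is used here only as one example of a tail-sensitive summary.

```python
import math
import random


def empirical_var(losses, alpha):
    """Empirical upper alpha-quantile: the order statistic at level 1 - alpha."""
    ordered = sorted(losses)
    n = len(ordered)
    # Zero-based index of the order statistic, clamped to the valid range.
    k = min(n - 1, max(0, math.ceil(n * (1 - alpha)) - 1))
    return ordered[k]


def mean_loss_beyond_var(losses, alpha):
    """Average of the losses strictly above the empirical VaR."""
    var = empirical_var(losses, alpha)
    tail = [x for x in losses if x > var]
    # Fall back to the VaR itself if no observation exceeds it.
    return sum(tail) / len(tail) if tail else var


alpha = 0.05          # illustrative level (assumption)
n = 100_000           # illustrative sample size (assumption)
rng = random.Random(2018)

# Light-tailed losses: standard exponential, whose upper alpha-quantile is -log(alpha).
light = [rng.expovariate(1.0) for _ in range(n)]

# Heavy-tailed losses: Pareto with shape 2, rescaled so that its theoretical
# upper alpha-quantile matches the exponential one.
target = -math.log(alpha)
scale = target / alpha ** (-1 / 2)
heavy = [scale * rng.paretovariate(2.0) for _ in range(n)]

var_light = empirical_var(light, alpha)
var_heavy = empirical_var(heavy, alpha)
tail_light = mean_loss_beyond_var(light, alpha)
tail_heavy = mean_loss_beyond_var(heavy, alpha)

print(f"VaR  light-tailed: {var_light:.3f}   heavy-tailed: {var_heavy:.3f}")
print(f"Mean loss beyond VaR  light-tailed: {tail_light:.3f}   heavy-tailed: {tail_heavy:.3f}")
```

By construction the two samples have nearly the same VaR at level alpha, yet the average loss beyond that quantile is noticeably larger for the heavy-tailed sample, which is exactly the information the VaR alone does not convey.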

2. Classification, application to genetics and global epidemiology

We address the challenge of building statistical models to test the association between diseases and human host genetics in a context of genome-wide screening. Adequate models should handle the complexity of genomic data (e.g. linkage disequilibrium or correlation between genetic markers, high dimensionality) as well as the additional statistical issues present in data collected from a family-based longitudinal survey (e.g. non-independence between individuals due to familial relationships (kinship) and non-independence within individuals due to repeated measurements on the same person over time). Our genomic data consist of genotypes on 719,656 SNPs (Single Nucleotide Polymorphisms) typed on 481 individuals in a rural area of Senegal where malaria and arboviral diseases are endemic. These SNP data can be considered as categorical variables in high dimension (p = 719,656 and n = 481). New unsupervised classification methods and co-clustering approaches will be proposed to classify individuals according to their disease status. Indeed, the situation p >> n is an obstacle to most statistical methods and, moreover, individuals may not be independent due to their parental links. This phenomenon further reduces the number of independent observations.



- Mistis, Inria Grenoble Rhône-Alpes

- Modal, Inria Lille Nord-Europe

Contact the members

- Head: Stéphane Girard

- Co-head: Abdou Kâ Diongue