Supplementary Web site

Gene clustering via integrated Markov models combining individual and pairwise features

Submitted to IEEE Transactions on Computational Biology and Bioinformatics

Matthieu Vignes and Florence Forbes.

This paper proposes a gene clustering algorithm that aims to overcome a very common limitation: the assumption that genes are independent. It is based on a Hidden Markov Random Field model that includes individual features as well as interaction data.
The paper also reports experiments on both simulated and real data (expression data from Chu et al., 1998, combined with KEGG).

A draft version of the paper is available here (.pdf), along with the BibTeX reference (.bib).

And some supplementary material for the paper...

We also provide the program, written in C++ and developed within the team, that was used for the analysis: 
SpaCEM3
(or see the mistis page, "realisation" section, to be sure to download the latest version). The previous version of the algorithm (used for some of the experiments) can be found here (~4.5 MB). The installation and running procedures are very similar; only a few function names have changed. 

Here are some files related to the data:

  • raw data: original data file (text file, 2.5 MB); known sporulation temporal classes (.zip file); input data for the software (.zip file containing the .dat, .nei and .str files, the full list of genes considered, and a mapping file between ORF names and numerical identifiers); and clustering results: Simulated Field and standard EM (text files, Eisen et al.'s .kgg format). Note that the expression data were log2-transformed and normalized by column, and that the .dat, .nei and .str files must share the same name for the algorithm to run.
  • GO terms results: here (.zip file, 1.4 MB). This archive gathers all analyses of the Simulated Field and standard EM clusters. P-values are computed taking into account multiple testing and dependencies between categories. The number of GO terms in the total set of genes is also provided (1935 unique GO terms are considered, and 1016 when restricted to biological processes at a level of at least 3 in the hierarchy).
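
The two preparation constraints mentioned for the input data (a log2 transform with column normalization, and a shared name for the .dat, .nei and .str files) could be checked with a short script along the following lines. This is only an illustrative sketch and not part of SpaCEM3: the function names `missing_inputs` and `log2_column_normalize` are hypothetical, the base name `sporulation` is made up, and "normalized by column" is assumed here to mean centering each column to zero mean and scaling it to unit variance, which may differ from what was actually done for the paper.

```python
import math
from pathlib import Path

# The three input files are expected to share one base name.
REQUIRED_SUFFIXES = (".dat", ".nei", ".str")


def missing_inputs(directory, basename):
    """Return the suffixes for which '<basename><suffix>' is absent from directory."""
    folder = Path(directory)
    return [s for s in REQUIRED_SUFFIXES
            if not (folder / f"{basename}{s}").is_file()]


def log2_column_normalize(rows):
    """log2-transform strictly positive values, then standardize each column.

    Assumption: column normalization = zero mean, unit (population) variance.
    Constant columns are mapped to 0.0 to avoid dividing by zero.
    """
    logged = [[math.log2(v) for v in row] for row in rows]
    columns = list(zip(*logged))
    means = [sum(c) / len(c) for c in columns]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
           for c, m in zip(columns, means)]
    return [[(v - m) / sd if sd > 0 else 0.0
             for v, m, sd in zip(row, means, sds)]
            for row in logged]


if __name__ == "__main__":
    # Hypothetical base name; replace with the one used in the archive.
    print(missing_inputs(".", "sporulation"))
    print(log2_column_normalize([[1.0, 2.0], [4.0, 8.0]]))
```

An empty list returned by `missing_inputs` means that all three files are present under the same base name.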

The authors would like to thank the anonymous reviewers for their useful comments and suggestions. We are also grateful to Frédéric Boyer and Juliette Blanchet for their help with the data and experiments and to Alain Viari and Éric Coissac (Helix, Inria Rhône-Alpes) for fruitful discussions.

Any feedback on this work is welcome.

Useful links

Contact the authors: