Statistical analysis of microarray data - Links

Software

Program URL Platforms Distribution Category Description
R http://www.r-project.org/ Unix, Macintosh, Windows Open source Statistics One of the most widely used statistical packages. It offers many libraries, including some dedicated to the analysis of microarray data.
Statistics for Microarray Analysis http://www.stat.berkeley.edu/users/terry/zarray/Software/smacode.html R package Open source Microarray analysis
YASMA http://people.cryst.bbk.ac.uk/wernisch/yasma.html
Bioconductor http://www.bioconductor.org/ R package Open source Biostatistics An open source and open development software project for the analysis and comprehension of genomic data.
TIGR Microarray suite TM4 http://www.tigr.org/software/tm4/ Java or C++ for Windows Open source Microarrays The TM4 suite of tools consists of four major applications: Microarray Data Manager (MADAM), TIGR_Spotfinder, Microarray Data Analysis System (MIDAS), and Multiexperiment Viewer (MeV).
TIGR MIDAS http://www.tigr.org/software/tm4/midas.html Java Open source Normalization TIGR Microarray Data Analysis System (MIDAS) is a microarray data quality filtering and normalization tool that allows raw experimental data to be processed through various data normalizations, filters, and transformations via a user-designed analysis pipeline. Currently implemented normalization and data analysis algorithms include total-intensity normalization, Lowess (Locfit) normalization, flip-dye consistency checking, replicates analysis, intensity-dependent z-score filtering (slice analysis), etc.
TIGR MeV http://www.tigr.org/software/tm4/mev.html Java Open source Clustering, Visualization TIGR MultiExperiment Viewer (MEV) is a Java application designed to allow the analysis of microarray data to identify patterns of gene expression and differentially expressed genes. Numerous normalization, clustering and distance algorithms have been implemented, along with a variety of graphical displays to best present the results.
ExpressConverter http://www.tigr.org/software/tm4/utilities.html Windows Freeware Converter utility ExpressConverter is a file format conversion tool that reads a GenePix file as input and writes a TIGR ArrayView file (.tav) or a TIGR MultiExperiment Viewer file (.mev), so that the microarray data can be uploaded to databases with MADAM and analyzed with MIDAS and MEV.
BRB ArrayTools http://linus.nci.nih.gov/BRB-ArrayTools.html Windows + Excel Freeware BRB ArrayTools is an integrated package for the visualization and statistical analysis of DNA microarray gene expression data. It was developed by professional statisticians experienced in the analysis of microarray data and involved in the development of improved methods for the design and analysis of microarray based experiments. The array tools package utilizes an Excel front end. Scientists are familiar with Excel and utilizing Excel as the front end makes the system portable and not tied to any database. The input data is assumed to be in the form of Excel spreadsheets describing the expression values and a spreadsheet providing user-specified phenotypes for the samples arrayed. The analytic and visualization tools are integrated into Excel as an add-in. The analytic and visualization tools themselves are developed in the powerful R statistical system, in C and Fortran programs and in Java applications. Visual Basic for Applications is the glue that integrates the components and hides the complexity of the analytic methods from the user. The system incorporates a variety of powerful analytic and visualization tools developed specifically for microarray data analysis.
ViDaExpert http://www.ihes.fr/~zinovyev/vida/vidaexpert.htm Windows Freeware Visualization Software tool for the visualization of multidimensional datasets. It produces readable color illustrations of a dataset for exploring its intrinsic patterns and regularities. The main technique implemented in ViDaExpert is the Method of Elastic Maps, an advanced analogue of the Method of Self-Organizing Maps. It also provides many other data analysis methods, including Principal Components Analysis, several clustering methods, Linear Discriminant Analysis and linear regression.
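
The TIGR MIDAS entry above lists total-intensity normalization among its algorithms. As a rough illustration of the general idea only (this is not MIDAS code, and the function names, the Cy3/Cy5 channel labels and the choice to rescale Cy5 onto Cy3 are assumptions made for this sketch), the principle is to rescale one channel so that both channels have the same summed intensity before computing log ratios:

```python
import math


def total_intensity_normalize(cy3, cy5):
    """Rescale Cy5 so that its summed intensity matches the Cy3 sum.

    Hypothetical helper: the direction of scaling (Cy5 onto Cy3) is an
    assumption of this sketch, not something taken from MIDAS.
    """
    total_cy3 = sum(cy3)
    total_cy5 = sum(cy5)
    if total_cy5 <= 0:
        raise ValueError("Cy5 total intensity must be positive")
    factor = total_cy3 / total_cy5
    return [value * factor for value in cy5], factor


def log2_ratios(cy3, cy5):
    """Return log2(Cy5 / Cy3) for each spot (all intensities must be > 0)."""
    return [math.log2(r / g) for g, r in zip(cy3, cy5)]


# Toy intensities for four spots (made-up numbers, for illustration only).
cy3 = [100.0, 200.0, 400.0, 800.0]
cy5 = [220.0, 380.0, 800.0, 1600.0]

scaled, factor = total_intensity_normalize(cy3, cy5)
ratios = log2_ratios(cy3, scaled)
print(factor)
print(ratios)
```

A single global factor like this corrects only an overall imbalance between the two channels; the intensity-dependent methods named in the same entry (Lowess) address biases that vary with signal intensity.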

Web software

Program URL Description
BASE http://base.thep.lu.se
SNOMAD - Standardization and NOrmalization of MicroArray Data http://pevsnerlab.kennedykrieger.org/snomadinput.html SNOMAD consists of a collection of algorithms directed at the normalization and standardization of DNA microarray data. The majority of the transformations within SNOMAD are directed at the refinement of paired microarray data ...
INCLUSive: A web portal and service registry for microarray and regulatory sequence analysis http://www.esat.kuleuven.ac.be/inclusive INCLUSive is a suite of algorithms and tools for the analysis of gene expression data and the discovery of cis-regulatory sequence elements. The tools allow normalization, filtering and clustering of microarray data, functional scoring of gene clusters, sequence retrieval, and detection of known and unknown regulatory elements using probabilistic sequence models and Gibbs sampling.
GEPAS: A web-based resource for microarray gene expression data analysis. http://www.gepas.org/ GEPAS is composed of interconnected modules that include tools for data pre-processing, two-condition comparison, unsupervised and supervised clustering (covering some of the most popular methods as well as in-house algorithms), and several tests for differential gene expression among classes, continuous variables or survival data. A multipurpose data mining tool based on Gene Ontology is also linked to these tools, which provides a very convenient way of analysing clustering results.
GEDA http://bioinformatics.upmc.edu/GE2/GEDA.html GEDA provides a large number of options for transformation and normalization as well as a diversity of tests for finding differentially expressed genes. All of the tests can be performed in combination with a variety of clustering algorithms for class prediction. Extensive data quality metrics and graphical outputs are included in the output, including M-A plots, mean vs. variance plots, mean group correlation plots, box and whisker plots, and a gene expression pattern grid. Computational validation options include cross-fold validation, leave-one-out validation, and bootstrapping. A variety of published cancer microarray data sets are available for analysis on-tap.
GenePublisher: Automated analysis of DNA microarray data. http://www.cbs.dtu.dk/services/GenePublisher The server performs normalization, statistical analysis and visualization of the data. The results are run against databases of signal transduction pathways, metabolic pathways and promoter sequences in order to extract more information.
ExpressYourself: A modular platform for processing and visualizing microarray data. http://bioinfo.mbb.yale.edu/expressyourself In a completely automated fashion, it corrects the background array signal, normalizes the Cy5 and Cy3 signals, scores levels of differential hybridization, combines the results of replicate experiments, filters problematic regions of the array and assesses the quality of individual and replicate experiments. ExpressYourself is designed with a highly modular architecture so that various types of microarray analysis algorithms can readily be incorporated as they are developed; for example, the system currently implements several normalization methods, including those that simultaneously consider signal intensity and slide location.
REDUCE: An online tool for inferring cis-regulatory elements and transcriptional module activities from microarray data. http://bussemaker.bio.columbia.edu/reduce/ REDUCE is a motif-based regression method for microarray analysis. The only required inputs are (i) a single genome-wide set of absolute or relative mRNA abundances and (ii) the DNA sequence of the regulatory region associated with each gene that is probed. Currently supported organisms are yeast, worm and fly; it is an open question whether in its current incarnation our approach can be used for mouse or human. REDUCE uses unbiased statistics to identify oligonucleotide motifs whose occurrence in the regulatory region of a gene correlates with the level of mRNA expression. Regression analysis is used to infer the activity of the transcriptional module associated with each motif.
ChipInfo: Software for extracting gene annotation and gene ontology information for microarray analysis. http://biosun1.harvard.edu/complab/chipinfo/ To date, assembling comprehensive annotation information for all probe sets of any Affymetrix microarrays remains a time-consuming, error-prone and challenging task. ChipInfo is designed for retrieving annotation information from online databases such as NetAffx and Gene Ontology and organizing such information into easily interpretable tabular format outputs. As companion software to dChip and GoSurfer, ChipInfo enables users to independently update the information resource files of these software packages. It also has functions for computing related summary statistics of probe sets and Gene Ontology terms.
Regulatory sequence analysis tools. http://rsat.ulb.ac.be/rsat/ A collection of software tools dedicated to the prediction of regulatory sites in non-coding DNA sequences. These tools include sequence retrieval, pattern discovery, pattern matching, genome-scale pattern matching, feature-map drawing, random sequence generation and other utilities. Alternative formats are supported for the representation of regulatory motifs (strings or position-specific scoring matrices) and several algorithms are proposed for pattern discovery.
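
Several of the tools above report M-A plots (see the GEDA entry). For readers unfamiliar with the term, the sketch below computes the two plotted quantities for a two-channel array using the usual definitions, M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2. It is a minimal illustration and not code from any of the listed tools; the function name and the toy intensities are made up for the example.

```python
import math


def ma_values(red, green):
    """Return a list of (M, A) pairs, one per spot.

    M is the log2 ratio of the two channels, A is their average log2
    intensity. Spots with a non-positive intensity are skipped because
    their logarithm is undefined (a choice made for this sketch).
    """
    pairs = []
    for r, g in zip(red, green):
        if r <= 0 or g <= 0:
            continue
        log_r = math.log2(r)
        log_g = math.log2(g)
        pairs.append((log_r - log_g, 0.5 * (log_r + log_g)))
    return pairs


# Toy intensities for three spots (illustrative values only).
red = [400.0, 100.0, 64.0]
green = [100.0, 100.0, 256.0]

pairs = ma_values(red, green)
for m, a in pairs:
    print(round(m, 3), round(a, 3))
```

Plotting M against A makes intensity-dependent dye bias visible as a curved cloud of points, which is what the intensity-dependent normalization methods mentioned in this list aim to flatten.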

Databases

Datasets

Name URL Description
Stanford Microarray Database http://genome-www5.stanford.edu/MicroArray/SMD/ SMD stores raw and normalized data from microarray experiments, as well as their corresponding image files. In addition, SMD provides interfaces for data retrieval, analysis and visualization.
Stanford complete cell cycle dataset http://genome-www.stanford.edu/cellcycle/data/rawdata/
Princeton University Gene Expression Project http://microarray.princeton.edu/oncology/ Cancer versus normal tissues.
TIGR Human Cancer Microarray Research http://cancer.tigr.org/c_data.shtml
Golub dataset http://www-genome.wi.mit.edu/mpr/data_set_ALL_AML.html This dataset is automatically installed with Bioconductor.
Normalization Dataset used in Yang et al. (2002). Nucleic Acids Res 30(4), e15.

Additional information

Name URL Description
Groupe de travail Biopuces http://www.lirmm.fr/~brehelin/DonneesBiopuces/ Site of the biochips working group (in French). It provides references to public datasets and the machine learning problems associated with them, the data chosen as reference sets by the members of the working group, the calendar of past and upcoming meetings, and a discussion forum created for the group.
Microarray Software Comparison - R packages for microarray analysis http://ihome.cuhk.edu.hk/~b400559/arraysoft_rpackages.html
Outils pour l'analyse des données de Biopuces (tools for the analysis of biochip data, in French) http://www.genopole-lille.fr/fr/logiciel/microarray/norm_tools.html

Jacques van Helden (jvanheld@bigre.ulb.ac.be)