Assessing gene significance from cDNA microarray expression data via mixed models

J Comput Biol. 2001;8(6):625-37. doi: 10.1089/106652701753307520.

Abstract

The determination of a list of differentially expressed genes is a basic objective in many cDNA microarray experiments. We present a statistical approach that allows direct control over the percentage of false positives in such a list and, under certain reasonable assumptions, improves on existing methods with respect to the percentage of false negatives. The method accommodates a wide variety of experimental designs and can simultaneously assess significant differences between multiple types of biological samples. Two interconnected mixed linear models are central to the method and provide a flexible means to properly account for variability both across and within genes. The mixed-model framework is also convenient for evaluating the statistical power of any particular experimental design and thus enables a researcher to select an appropriate number of replicates a priori. We also suggest some basic graphics for visualizing lists of significant genes. Analyses of published experiments studying human cancer and yeast cells illustrate the results.
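
As a rough illustration of the replicate-selection idea mentioned in the abstract, the sketch below estimates the power of a two-group comparison for a single gene as a function of the number of replicates. It is not the authors' mixed-model calculation: it swaps in a plain two-sided z-test with a known residual standard deviation, and the function names (`approx_power`, `min_replicates`) and all parameter values are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def approx_power(delta, sigma, n, alpha):
    """Approximate power of a two-sided z-test for one gene.

    Assumptions (not from the paper): two groups with n replicates each,
    a known residual standard deviation `sigma` on the log scale, and a
    true difference `delta` between group means.
    """
    z = NormalDist()
    se = sigma * (2.0 / n) ** 0.5          # std. error of the mean difference
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    shift = abs(delta) / se                # standardized effect size
    # Probability that the test statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def min_replicates(delta, sigma, alpha, target=0.8, n_max=1000):
    """Smallest n per group reaching the target power, or None if not found."""
    for n in range(2, n_max + 1):
        if approx_power(delta, sigma, n, alpha) >= target:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical inputs: a 2-fold change on the log2 scale (delta = 1)
    # with residual standard deviation 0.5.
    for alpha in (0.05, 0.05 / 5000):  # unadjusted vs. Bonferroni over 5000 genes
        print(alpha, min_replicates(1.0, 0.5, alpha))
```

Tightening `alpha` to control false positives across many genes raises the number of replicates needed, which is the trade-off a power analysis of this kind is meant to expose before the experiment is run.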

MeSH terms

  • Computational Biology
  • Gene Expression Profiling / statistics & numerical data*
  • Genes, Fungal
  • Humans
  • Lymphoma, B-Cell / genetics
  • Models, Genetic
  • Models, Statistical*
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data*
  • Saccharomyces cerevisiae / genetics