Scientists from the Broad Institute have released GSEA 2.0, an enhanced version of the gene expression analysis tool that uses a multi-gene approach to extract accurate and meaningful information from DNA microarray data. In contrast to analytical methods based on single genes, the GSEA software detects changes in gene activity across the genome by relying on a public database of biologically defined "gene sets": groups of genes that are connected based on their function, chromosomal location or molecular control. The updated software now includes more than 1,000 additional gene sets as well as a gene set browser, which can pinpoint specific gene sets based on a particular gene or other parameter of interest. These new features, along with other enhancements, improve and extend the capabilities of the GSEA software, a freely available tool for biomedical researchers.
"The strength and flexibility of GSEA stem from its use of gene sets, the majority of which are drawn from data contributed by the scientific community," said Aravind Subramanian, a scientist in the Cancer Program at the ӳý and a member of the GSEA development team. "By nearly doubling the number of gene sets in the software’s database, we have expanded the breadth and depth of its analytical capabilities."
Through its gene set database, GSEA (short for "Gene Set Enrichment Analysis") can evaluate the rank-ordered lists of genes that are produced by microarray analyses, determining if the genes of a particular set are dispersed randomly or, as in the case of more noteworthy findings, are distributed near the top or bottom. For GSEA 2.0, the scientists expanded the gene set database to include more than 3,000 gene sets, the majority of them collected from published papers. The researchers also developed a search engine for identifying specific gene sets based on various criteria, such as the genes contained in them or the organism from which they are derived.
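The ranked-list test described above can be illustrated with a minimal Python sketch. It is not the GSEA software's implementation: it assumes a simplified, unweighted running-sum score, and the function name `enrichment_score` and the gene labels are hypothetical. The walk steps up at each gene that belongs to the set and down at each gene that does not, and the score is the largest deviation from zero.

```python
def enrichment_score(ranked_genes, gene_set):
    """Simplified, unweighted running-sum score (illustrative assumption only)."""
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("need at least one gene inside and one outside the set")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up for a set member, down for any other gene.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        # Keep the largest deviation from zero, with its sign.
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list, ordered from most up-regulated to most down-regulated.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]
top_score = enrichment_score(ranked, {"g1", "g2", "g3"})
bottom_score = enrichment_score(ranked, {"g8", "g9", "g10"})
dispersed_score = enrichment_score(ranked, {"g1", "g5", "g10"})
```

In this toy example, a set clustered at the top scores close to +1, a set clustered at the bottom scores close to -1, and a dispersed set stays nearer zero. Judging whether a score is statistically meaningful requires further steps that this sketch leaves out.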
Another key addition to the desktop version of the GSEA software is the ability to graphically analyze the so-called “leading edge” of high-scoring gene sets. Within a highly ranked gene set, not every gene necessarily earns a high score. The individual genes that do rank highly are largely responsible for the gene set’s high score and constitute the leading edge.
"An analysis of the ‘leading edge’ can reveal key subsets of genes within a larger gene set," said Pablo Tamayo, a computational biologist at the ӳý and a member of the GSEA development team. "Moreover, it can uncover genes that behave similarly yet are grouped within distinct gene sets, thereby suggesting a new and potentially important gene set."
"The power of the GSEA method lies in its ability to detect even subtle changes in gene expression — the molecular nuances that can underlie human disease," said Jill Mesirov, chief informatics officer and director of Computational Biology and Bioinformatics at the ӳý. "These improvements demonstrate our commitment to maintaining the software in a way that is both concurrent with the pace of biomedical research and adaptable to the needs of the scientific community."
GSEA currently has more than 2,500 registered users from more than 300 research institutions worldwide.
The updated software and a comprehensive list of new features and fixes can be found on the GSEA website at http://www.broadinstitute.org/gsea/.
About the Broad Institute of MIT and Harvard
The Broad Institute of MIT and Harvard was founded in 2003 to bring the power of genomics to biomedicine. It pursues this mission by empowering creative scientists to construct new and robust tools for genomic medicine, to make them accessible to the global scientific community, and to apply them to the understanding and treatment of disease.
The Institute is a research collaboration that involves faculty, professional staff and students from throughout the MIT and Harvard academic and medical communities. It is jointly governed by the two universities.
Organized around Scientific Programs and Scientific Platforms, the unique structure of the Broad Institute enables scientists to collaborate on transformative projects across many scientific and medical disciplines.