Strength in numbers: New mathematical model shows promise for cancer genomics
By Nicole Davis, Communications
Some researchers work “by the numbers” to identify the genetic hallmarks of cancer. Now, a powerful mathematical tool introduced by Broad Institute scientists will facilitate this task, providing a clearer picture of how DNA runs amok in tumors.
The genetic distortions that lurk within the cells of a tumor form the driving force behind malignancy. These changes, which involve either gains or losses of DNA, perturb the usual number of gene copies in a cell and can involve either of the paired chromosomes. Scientists are trying to trace the chromosomal origins of such modifications to pinpoint informative genes, forming the basis for new therapeutic targets and possible genetic predictors for cancer diagnosis.
Researchers led by Matthew Meyerson, an associate member of the Broad Institute and associate professor at Harvard Medical School/Dana-Farber Cancer Institute, developed an algorithm that interprets the data from single nucleotide polymorphism (SNP) arrays, which are collections of short oligonucleotides (“oligos”) used to tally SNP patterns in human DNA. The algorithm, probe-level allele-specific quantitation (PLASQ), described in the November issue of PLoS Computational Biology, allows scientists to approximate DNA copy number at sites throughout the genome and to assign the proportion furnished by each parental chromosome.
“In cancer, genome modifications often affect only one of the two paired chromosomes, the one inherited from the father or the one contributed by the mother,” said Meyerson. “PLASQ allows us to localize these changes to the culprit chromosome, which will help guide us to the most significant genes and gene mutations in the disease.”
In SNP arrays, oligos are grouped into probe sets, and each set, composed of 40 probes, is tailored to detect a single SNP in the human genome. These probe sets consist of one probe that perfectly matches both the target SNP and its surrounding DNA, as well as several related probes, each harboring single-letter mismatches. To compute DNA copy number, PLASQ exploits the mathematical link between a probe’s intensity (the readout from SNP arrays) and the known position of a mismatch relative to the target SNP. PLASQ also reveals the number of copies contributed by each parental chromosome. The researchers validated the procedure by applying it to DNA samples that had been independently analyzed by several institutions within the International HapMap Project consortium and found the results given by PLASQ to be in agreement in more than 99% of cases.
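The link between probe intensity and allele-specific copy number can be illustrated with a short Python sketch. This is a deliberately simplified toy model, not the PLASQ model itself: it assumes each probe's intensity is a background level plus a contribution from each allele, weighted by a per-probe sensitivity that is already known. The function name, the sensitivity values, and the background level are all hypothetical and chosen only for illustration.

```python
def estimate_allele_copies(intensities, sens_a, sens_b, background):
    """Least-squares estimate of the copy numbers of alleles A and B.

    Assumed toy model (not taken from the PLASQ paper):
        intensity[i] = background + sens_a[i] * copies_a + sens_b[i] * copies_b
    where sens_a[i] and sens_b[i] describe how strongly probe i responds
    to one copy of allele A or allele B.
    """
    # Remove the assumed constant background from every probe reading.
    y = [value - background for value in intensities]

    # Build the 2x2 normal equations for ordinary least squares.
    saa = sum(a * a for a in sens_a)
    sbb = sum(b * b for b in sens_b)
    sab = sum(a * b for a, b in zip(sens_a, sens_b))
    say = sum(a * v for a, v in zip(sens_a, y))
    sby = sum(b * v for b, v in zip(sens_b, y))

    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("probe sensitivities cannot separate the two alleles")

    # Solve the 2x2 system with Cramer's rule.
    copies_a = (say * sbb - sby * sab) / det
    copies_b = (sby * saa - say * sab) / det
    return copies_a, copies_b


# Hypothetical probe set: two probes favor allele A, two favor allele B.
sens_a = [1.0, 0.2, 0.9, 0.1]
sens_b = [0.2, 1.0, 0.1, 0.9]
# Intensities simulated for 3 copies of allele A and 1 copy of allele B.
intensities = [53.2, 51.6, 52.8, 51.2]

copies_a, copies_b = estimate_allele_copies(intensities, sens_a, sens_b, 50.0)
```

In this made-up example the fit recovers roughly three copies of one allele and one of the other, the kind of uneven split that would point to a gain on a single parental chromosome. A real analysis would also have to estimate the sensitivities and background from data and account for noise.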
Meyerson and his colleagues then used the algorithm to look for changes in DNA copy number in more than 100 lung cancer samples and discovered a multitude of gene deletions and amplifications. They noted that most of the amplifications appear to be monoallelic, which means they derive from only one parental chromosome. Although this finding emerged from the use of PLASQ, it also agrees with the chromosomal acrobatics that are believed to underlie gene amplification.
The researchers turned their attention to an amplification (“amplicon”) that covers the EGFR gene, which, in addition to being amplified, is also frequently mutated and rendered abnormally active in some lung cancer samples. Combing through the surplus copies, they noted that those in excess came exclusively from the mutated allele, while the normal EGFR allele was present in the typical number. Therefore, with the help of PLASQ, Broad Institute scientists unearthed the preferential amplification of a mutant allele over its wildtype sibling.
“Chromosomal segments may be targeted for amplification in tumors because they contain a heritable or germline change that confers a distinct growth advantage,” said Tom LaFramboise, a computational biologist in the Broad Institute’s Cancer Program and the study’s lead author. “Since PLASQ provides a snapshot of the relevant SNP patterns contained in tumor amplicons, we may now be able to find these genetic variants using linkage analysis.”
PLASQ may also alleviate a frequent problem that plagues the analysis of tumor cells. When tumors are isolated, normal cells frequently accompany their cancerous counterparts, which complicates readings from a tumor’s genome. Researchers may be able to adapt the mathematical terms of PLASQ to solve this predicament.
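To see why normal cells muddy the picture, consider a small worked example in Python. It rests on a common simplifying assumption that is not drawn from the PLASQ paper: the measured copy number is a weighted average of the tumor cells' copy number and the two copies carried by normal cells. The function name and the numbers are hypothetical.

```python
def tumor_copy_number(observed, tumor_fraction, normal_copies=2.0):
    """Back out the tumor's copy number from a mixed-sample measurement.

    Assumed mixture model (an illustration, not the PLASQ method):
        observed = tumor_fraction * tumor
                   + (1 - tumor_fraction) * normal_copies
    """
    if not 0.0 < tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be in the interval (0, 1]")
    normal_part = (1.0 - tumor_fraction) * normal_copies
    return (observed - normal_part) / tumor_fraction


# Hypothetical sample: 60% tumor cells, measured copy number of 3.2.
corrected = tumor_copy_number(observed=3.2, tumor_fraction=0.6)
```

Under these assumptions, a measured value of 3.2 in a sample that is 60% tumor corresponds to about four copies in the tumor cells themselves, so the contaminating normal cells make the amplification look smaller than it is.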