A Bayesian Approach for Integrated Cancer Genome Profiling
Apart from the sex chromosomes, the human genome contains two copies of each chromosome. Although most of the genome is identical across individuals, there are about 10 million so-called single nucleotide polymorphisms (SNPs) that distinguish us. A SNP is a single nucleotide at a fixed position that can take two different values (called alleles). Since there are two copies of each SNP (one inherited from the mother and one from the father), the two alleles can have either the same value (homozygous) or different values (heterozygous). Aberrations of the genome are responsible for diseases such as cancer and Alzheimer's disease. Examples of events that can modify a genomic region are changes in the DNA copy number and so-called loss of heterozygosity (LOH). The former is due to the amplification or deletion of a sequence of nucleotides, while the latter occurs only at heterozygous SNPs and consists of the loss of one allele (leading to homozygosity). Microarray technology can measure these events, but the data are very noisy and complex. From a molecular biology point of view, understanding the aberrations in the genome is important for finding disease-related genes that can be used as drug targets or for patient diagnosis. The principal aim of our project is to develop Bayesian statistical methods for DNA copy number estimation and for LOH estimation. In particular, we want to develop methods that precisely determine the positions of the endpoints of the aberrant regions. Moreover, we are going to develop a Bayesian method that integrates LOH and DNA copy number estimation. Using both types of information, we can achieve a more precise explanation of which type of event altered the genome.
Finally, we would like to combine these results with gene-expression data (obtained by measuring the mRNA copy number) to see for which genes the altered copy number leads to a quantitatively different production of the corresponding protein compared with normal samples.
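
To illustrate why combining LOH and copy number can explain an event more precisely than either signal alone, the following is a minimal sketch. It is not the Bayesian method of this project: it assumes noise-free genotype calls and an already estimated integer copy number at a single SNP, and the function name `classify_event`, its arguments, and the event labels are hypothetical names chosen only for this example.

```python
def classify_event(normal_genotype, tumor_genotype, copy_number):
    """Label the event at one SNP from idealized, noise-free inputs.

    normal_genotype: two alleles in the normal sample, e.g. "AB"
    tumor_genotype:  alleles present in the tumor sample, e.g. "A", "AA", "AAB"
    copy_number:     estimated number of DNA copies in the tumor (2 is normal)
    All names and labels are assumptions made for this illustration.
    """
    # LOH can only be observed at SNPs that are heterozygous in the normal sample.
    was_heterozygous = normal_genotype[0] != normal_genotype[1]
    # The tumor is homozygous if exactly one distinct allele remains.
    is_homozygous = len(set(tumor_genotype)) == 1
    loh = was_heterozygous and is_homozygous

    if copy_number < 2:
        return "deletion with LOH" if loh else "deletion"
    if copy_number > 2:
        return "gain with LOH" if loh else "gain"
    # Two copies but only one allele left: LOH without a copy number change.
    return "copy-neutral LOH" if loh else "normal"


if __name__ == "__main__":
    print(classify_event("AB", "A", 1))
    print(classify_event("AB", "AA", 2))
    print(classify_event("AB", "AAB", 3))
```

In this toy setting, a copy number of 2 alone would look normal, and LOH alone would not distinguish a deletion from a copy-neutral event; only the pair of observations separates the cases. Real microarray data are noisy, which is why the project treats both quantities statistically rather than with hard rules like these.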