Matrix Factorization Methods for Integrative Cancer Genomics

With the rapid development of high-throughput sequencing technologies, many groups now generate multi-platform genomic profiles (e.g., DNA methylation and gene expression) for the same biological samples. The result is a large number of so-called “multidimensional genomic datasets,” which offer unique opportunities, and pose new challenges, for studying coordination among different regulatory levels and discovering the underlying combinatorial patterns of cellular systems. We summarize a matrix factorization framework for integrating multiple genomic datasets, as well as a semi-supervised variant of the method that can incorporate prior knowledge. The basic idea is to project the different kinds of genomic data onto a common coordinate system, in which genetic variables that are strongly correlated in a subset of samples form a multidimensional module. In the context of cancer biology, such modules reveal perturbed pathways and clinically distinct patient subgroups that would be overlooked with only a single type of data. Overall, the matrix factorization framework can uncover associations between distinct layers of cellular activity and help explain their biological implications in multidimensional data.
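The projection onto a common coordinate system can be illustrated with a minimal sketch. It assumes a joint non-negative matrix factorization in which every data matrix shares one sample-side basis matrix `W` and has its own coefficient matrix `H_i`, fitted with standard multiplicative updates; the abstract does not state the exact formulation, so this is one plausible instance rather than the authors' implementation. The function names `joint_nmf` and `reconstruction_error`, and all parameter names, are hypothetical.

```python
import numpy as np

def joint_nmf(matrices, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor each non-negative X_i (samples x features_i) as W @ H_i.

    Assumption: all matrices have the same rows (samples), so a single
    shared W acts as the common coordinate system across data types.
    """
    rng = np.random.default_rng(seed)
    n_samples = matrices[0].shape[0]
    W = rng.random((n_samples, rank))
    Hs = [rng.random((rank, X.shape[1])) for X in matrices]
    for _ in range(n_iter):
        # Shared basis: multiplicative update pooling all data types.
        numer = sum(X @ H.T for X, H in zip(matrices, Hs))
        denom = W @ sum(H @ H.T for H in Hs) + eps
        W *= numer / denom
        # Per-data-type coefficients, each updated against the shared W.
        for i, X in enumerate(matrices):
            Hs[i] *= (W.T @ X) / (W.T @ W @ Hs[i] + eps)
    return W, Hs

def reconstruction_error(matrices, W, Hs):
    """Sum of squared Frobenius errors over all data types."""
    return sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(matrices, Hs))

# Toy usage with random stand-ins for, e.g., methylation and expression data.
rng = np.random.default_rng(42)
methylation = rng.random((20, 30))   # 20 samples x 30 hypothetical probes
expression = rng.random((20, 15))    # same 20 samples x 15 hypothetical genes
W, (H_meth, H_expr) = joint_nmf([methylation, expression], rank=3)
```

In this sketch, features from different data types that carry large weights in the same row of their respective `H_i` would be candidates for one multidimensional module, and the corresponding column of `W` indicates which samples drive it. How weights are thresholded into modules is left open here.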
Source: Springer protocols feed by Cancer Research - Category: Cancer & Oncology Source Type: news