Integrative Analyses of Cancer Data: A Review from a Statistical Perspective

It has become increasingly common for large-scale public data repositories and clinical settings to have multiple types of data, including high-dimensional genomics, epigenomics, and proteomics data as well as survival data, measured simultaneously for the same group of biological samples. This provides unprecedented opportunities to understand cancer mechanisms more comprehensively and to develop new cancer therapies. Nevertheless, translating this wealth of data into biologically and clinically meaningful information remains very challenging. In this paper, I review recent developments in statistical methods for integrative analyses of cancer data. Topics include meta-analysis of a single homogeneous data type across multiple studies, integration of multiple heterogeneous genomic data types, survival analysis with high- or ultrahigh-dimensional genomic profiles, and cross-data-type prediction in which both predictors and responses are high- or ultrahigh-dimensional vectors. I compare existing statistical methods and comment on potential future research problems.
Source: Cancer Informatics - Category: Cancer & Oncology - Source Type: research