Project Description

Significant Random Signatures Have Information

Determining cancer signatures, that is, cancer genes and the pathways they deregulate, is a challenging task in human cancer research. Over the last few years, various algorithms have been developed to predict signatures related to human cancer by computing their association with disease outcome. In 2013, David Venet et al. reported that gene signatures unrelated to cancer are significantly associated with breast cancer outcome. They compared 47 published breast cancer signatures with random signatures of identical size and found that 60% of the published signatures were not significantly better outcome predictors than random signatures.
In this research, we show that significant random signatures carry information. Based on these informative random signatures, a score is assigned to each gene. A protein-protein interaction (PPI) network is then obtained from the STRING database. STRING combined scores are used as interaction weights between proteins, and each gene score is assigned to its corresponding protein node. The gene scores are diffused over this network using the diffusion kernel approach proposed by Kondor and Lafferty. A permutation procedure is used to determine the significance of the diffusion scores. To define significant signatures, the top 10% of genes ranked by diffusion score are tested for pathway enrichment using ConsensusPathDB. The significant pathways are selected, and the enriched genes within these pathways are considered significant signatures. For evaluation, we use the ACES database defined by Staiger et al. in 2012, a cohort of 1606 breast cancer samples collected from 12 studies in NCBI's Gene Expression Omnibus. The results show that the predicted signatures are significantly associated with outcome.
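
The diffusion and permutation steps described above can be illustrated with a minimal sketch. It is not the project's implementation: it assumes a symmetric weight matrix W (for example, STRING combined scores rescaled to [0, 1]), the graph Laplacian L = D - W, and the kernel K = exp(-beta L) in the spirit of Kondor and Lafferty. The function names, the value of beta, the number of permutations, and the choice to permute gene scores across nodes are all assumptions made for illustration, since the description does not specify them.

```python
import numpy as np


def diffusion_kernel(weights, beta=0.5):
    """Return K = exp(-beta * L) for the Laplacian L = D - W of a symmetric W."""
    weights = np.asarray(weights, dtype=float)
    laplacian = np.diag(weights.sum(axis=1)) - weights
    # L is symmetric, so the matrix exponential follows from its eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T


def diffuse_with_pvalues(weights, gene_scores, beta=0.5, n_perm=1000, seed=0):
    """Diffuse gene scores over the network and estimate empirical p-values.

    Assumed null model: gene scores are shuffled across nodes, and each
    shuffled vector is diffused with the same kernel.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(gene_scores, dtype=float)
    kernel = diffusion_kernel(weights, beta)
    observed = kernel @ scores
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        permuted = kernel @ rng.permutation(scores)
        exceed += permuted >= observed
    # The +1 correction keeps empirical p-values strictly above zero.
    p_values = (exceed + 1) / (n_perm + 1)
    return observed, p_values


# Hypothetical toy network of four proteins (weights are made up).
toy_weights = [
    [0.0, 0.9, 0.4, 0.0],
    [0.9, 0.0, 0.7, 0.0],
    [0.4, 0.7, 0.0, 0.2],
    [0.0, 0.0, 0.2, 0.0],
]
toy_scores = [1.0, 0.0, 0.0, 0.0]

diffused, p_values = diffuse_with_pvalues(toy_weights, toy_scores, n_perm=200)
# Indices of the top 10% of genes by diffusion score (at least one gene).
n_top = max(1, int(0.1 * len(diffused)))
top_genes = np.argsort(diffused)[::-1][:n_top]
```

In this sketch, larger beta values spread each gene's score further across the network, and with beta equal to zero the kernel reduces to the identity, so the scores are left unchanged. The selected top genes would then be passed to a pathway enrichment tool such as ConsensusPathDB, which is outside the scope of this example.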