Title |
Application of Data Mining for Biomedical Data Processing |
Authors |
손호선(Shon, Ho-Sun) ; 김경옥(Kim, Kyoung-Ok) ; 차은종(Cha, Eun-Jong) ; 김경아(Kim, Kyung-Ah) |
DOI |
https://doi.org/10.5370/KIEE.2016.65.7.1236 |
Keywords |
Bladder cancer ; TCGA (The Cancer Genome Atlas) ; CpG island ; ROC ; Cox's regression |
Abstract |
Cancer is the most frequent disease in Korea, and its pathogenesis and progression are known to occur through various causes and stages. Recently, research on chromosomal and genetic disorders, and on prognostic factors that predict their occurrence, recurrence, and progression, has been actively pursued. In this paper, we analyzed DNA methylation data downloaded from the open database TCGA (The Cancer Genome Atlas) to study bladder cancer, the most frequent cancer of the urinary system. Using Level 3 methylation data, the most heavily preprocessed level, we extracted 59 candidate CpG islands from 480,000 CpG islands and then analyzed them with data mining techniques. As a result, the CpG island cg12840719 was found to be significant, and Cox's regression showed that it has a high relative risk compared with the other CpG islands. The classification analysis showed that the CpG islands highly correlated with bladder cancer are cg03146993, cg07323648, cg12840719, and cg14676825, with a classification accuracy of about 76%. We also found that the positive predictive value, the probability that a sample predicted as cancer is actually cancer, was 72.4%. After verification of the candidate CpG islands from these results, this method could be used for diagnosing and treating cancer. |
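The abstract reports a classification accuracy and a positive predictive value. As a minimal sketch of how such figures are derived from a binary confusion matrix, the Python below computes both, plus sensitivity for comparison. The function name `classification_metrics` and the counts in the usage example are hypothetical illustrations; they are not taken from the paper and do not reproduce its 76% or 72.4% figures.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, PPV, and sensitivity from confusion-matrix counts.

    tp: cancer samples predicted as cancer
    fp: non-cancer samples predicted as cancer
    tn: non-cancer samples predicted as non-cancer
    fn: cancer samples predicted as non-cancer
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Accuracy: fraction of all samples classified correctly.
    accuracy = (tp + tn) / total

    # Positive predictive value: of the samples predicted as cancer,
    # the fraction that actually are cancer.
    ppv = tp / (tp + fp) if (tp + fp) > 0 else math.nan

    # Sensitivity: of the actual cancer samples, the fraction detected.
    sensitivity = tp / (tp + fn) if (tp + fn) > 0 else math.nan

    return {"accuracy": accuracy, "ppv": ppv, "sensitivity": sensitivity}


# Hypothetical counts chosen only for illustration (not data from the paper).
metrics = classification_metrics(tp=40, fp=10, tn=35, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Positive predictive value and sensitivity share the same numerator but condition on different groups (predicted positives versus actual positives), so reporting them side by side helps avoid confusing the two.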