A mixed-model approach for powerful testing of genetic associations with cancer risk incorporating tumor characteristics

Posted: 2020-01-10

Guanghua Forum: Forum of Public Figures and Entrepreneurs, No. 5685

Topic: A mixed-model approach for powerful testing of genetic associations with cancer risk incorporating tumor characteristics

Speaker: Dr. Haoyu Zhang, Harvard University

Host: Yaowu Liu, School of Statistics

Time: Monday, January 13, 2020, 15:00-16:00

Venue: Conference Room 1007, Guanghua Building, Guanghua Campus

Organizers: Joint Laboratory of Data Science and Business Intelligence; School of Statistics; Office of Scientific Research

About the speaker:

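Dr. Haoyu Zhang is currently a postdoctoral fellow in the Department of Biostatistics at the Harvard School of Public Health, where his advisor is Academician Xihong Lin. After completing his undergraduate studies in the Department of Mathematics at Zhejiang University, he received his PhD from the Department of Biostatistics at Johns Hopkins University, where his advisor was Professor Nilanjan Chatterjee. His main research interest is statistical genetics.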

Abstract:

Cancers are routinely classified into subtypes according to various features, including histopathological characteristics and molecular markers. Previous genome-wide association studies (GWAS) have reported heterogeneous associations between loci and cancer subtypes. However, the optimal modeling strategy for handling correlated tumor features, missing data, and increased degrees of freedom in the underlying tests of association is not evident. We propose score tests for genetic associations using a mixed-effect two-stage polytomous model (MTOP). In the first stage, a standard polytomous model is used to specify all possible subtypes defined by the cross-classification of the tumor characteristics of interest. In the second stage, the subtype-specific case-control odds ratios are specified using a more parsimonious model based on the case-control odds ratio for a baseline subtype and the case-case parameters associated with tumor markers. Further, to reduce the degrees of freedom, we specify case-case parameters for additional exploratory markers using a random-effect model. We use the Expectation-Maximization (EM) algorithm to account for missing data on tumor markers. Through simulations across a range of realistic scenarios and data from the Polish Breast Cancer Study (PBCS), we show that MTOP outperforms alternative methods for identifying heterogeneous associations between risk loci and tumor subtypes. We also identified 32 novel breast cancer susceptibility loci using both standard methods and MTOP from a GWAS analysis including 133,384 breast cancer cases and 113,789 controls, plus 18,908 BRCA1 mutation carriers (9,414 with breast cancer) of European ancestry.
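The two-stage idea in the abstract can be illustrated with a small sketch. It is not the speaker's implementation. It assumes three binary tumor markers and an additive (main-effects only) second stage, neither of which the abstract specifies. The function name `second_stage_design` and the values in `theta` are hypothetical. The random-effect terms and the EM step for missing markers are not shown.

```python
import itertools
import numpy as np

def second_stage_design(n_markers: int) -> np.ndarray:
    """Build an assumed additive second-stage design matrix.

    Rows are the 2**n_markers subtypes obtained by cross-classifying
    binary markers. Columns are an intercept (the baseline subtype)
    followed by one indicator per marker.
    """
    levels = np.array(list(itertools.product([0, 1], repeat=n_markers)))
    intercept = np.ones((levels.shape[0], 1), dtype=int)
    return np.hstack([intercept, levels])

# Hypothetical second-stage parameters for three binary markers:
# theta[0] is the case-control log odds ratio for the baseline subtype,
# theta[1:] are case-case log odds ratios, one per marker.
theta = np.array([0.10, 0.05, 0.00, -0.03])

Z = second_stage_design(3)       # 8 subtypes by 4 second-stage parameters
subtype_log_or = Z @ theta       # subtype-specific case-control log odds ratios

for row, value in zip(Z[:, 1:], subtype_log_or):
    print(tuple(row), round(float(value), 3))
```

Under these assumptions, eight subtype-specific parameters are described by four second-stage parameters, which is the kind of reduction in degrees of freedom the abstract refers to.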

