Date: 16 January 2025, Thursday
Location: S16-06-118, Seminar Room
Time: 3pm, Singapore
Genome-wide genetic models for outcomes of interest, such as a disease state, have been widely used and have led to major advances in understanding the genomic architecture of complex traits and in predicting these traits. However, the statistical models are fundamentally ill-posed because there are “too many predictors” and the predictors are correlated. Different approaches to regularisation that address these problems can lead to important differences in results, and there has been confusion and controversy over model choices. I will present a framework for resolving these problems that leads to important gains in the power of statistical tests and the accuracy of prediction. Our framework uses shrinkage regression with a distinct shrinkage parameter for each genetic predictor (SNP), determined by fitting a low-dimensional “heritability model” whose independent variables can include genome annotation features, between-SNP correlations and minor allele fraction. Applying our modelling framework to the UK Biobank and other datasets, I will discuss advances in understanding how causal effects are distributed across the genome, as well as inferences about the effects of negative (purifying) selection for different traits and in different genome regions.
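
As a rough illustration of the idea of per-SNP shrinkage described in the abstract, here is a minimal sketch, not the speaker's implementation. It assumes a Gaussian prior on each SNP effect with variance equal to that SNP's expected heritability, and a linear heritability model. The function names (heritability_model, per_snp_ridge) and the parameters (tau, noise_var) are hypothetical and chosen only for this example.

```python
import numpy as np


def heritability_model(features, tau):
    """Hypothetical linear heritability model (an assumption for this sketch).

    features: (n_snps, n_features) array, e.g. annotation indicators.
    tau:      (n_features,) low-dimensional parameter vector.
    Returns the expected heritability contributed by each SNP.
    """
    h2 = np.asarray(features, dtype=float) @ np.asarray(tau, dtype=float)
    # Keep values strictly positive so that the penalties below are finite.
    return np.clip(h2, 1e-8, None)


def per_snp_ridge(X, y, expected_h2, noise_var=1.0):
    """Generalised ridge regression with one shrinkage parameter per SNP.

    Assumes effect j has prior variance expected_h2[j], so SNP j receives
    the penalty noise_var / expected_h2[j]. SNPs with low expected
    heritability are therefore shrunk more strongly towards zero.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    penalties = noise_var / np.asarray(expected_h2, dtype=float)
    # Solve (X'X + diag(penalties)) beta = X'y.
    A = X.T @ X + np.diag(penalties)
    return np.linalg.solve(A, X.T @ y)
```

When every SNP is given the same expected heritability, the sketch reduces to ordinary ridge regression with a single penalty. The per-SNP version differs only in letting the heritability model set each penalty separately.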