Date: 15 April 2024, Monday
Location: S16-0304
Time: 3pm, Singapore
A statistical model fitted to observed data represents and explains the mechanism underlying the data-generating process. Such models are built on one or more unknown parameters that we aim to estimate from the available data. Conventional parametric methods assume that the parameter vector is finite dimensional. In contrast, nonparametric methods do not restrict the parameter space, providing more flexibility at the cost of efficiency relative to parametric methods. In most instances, however, our primary interest lies only in a finite-dimensional component of the infinite-dimensional parameter. In such cases, we may partition the parameters into finite- and infinite-dimensional components and fit a so-called semiparametric model, which preserves some of the important properties of parametric modelling. Semiparametric models have gained much popularity in recent years owing to their flexibility and broad applicability. In this thesis, we propose novel semiparametric methods for two key applications of semiparametric modelling.
Semiparametric modelling is widely adopted for handling missing data owing to its ability to capture complex structures in the observed data caused by various forms of missingness. When the missingness is ignorable, i.e. it does not depend on the variable with missing values, popular statistical methods such as maximum likelihood, imputation and inverse propensity weighting can be readily implemented. However, when the missingness is nonignorable, i.e. it depends on the missing values themselves, statistical modelling is challenging. More recently, semiparametric methods have emerged that attempt to address the problem of nonignorable missingness. As the main goal of this thesis, we develop a novel semiparametric doubly robust and locally efficient estimator for the mean of an outcome subject to nonignorable missingness, utilizing an instrumental variable that affects nonresponse but not the outcome. We additionally propose a computationally simpler estimator that preserves double robustness but not local efficiency. We evaluate the performance of the proposed estimator via a simulation study and apply it to adjust for missing data induced by HIV testing refusal in the evaluation of HIV seroprevalence in Mochudi, Botswana, using interviewer experience as an instrumental variable. We also extend the proposed framework to handle non-monotone nonignorable missingness in repeated outcome measures in longitudinal studies.
In the second part of the thesis, we propose a specification test to detect unmeasured mediator-outcome confounding in mediation analysis, a problem that remains a challenge in both observational and experimental studies. Fulcher et al. (2019) have previously established semiparametric identification of mediation effects in the presence of potential unmeasured mediator-outcome confounding and proposed a consistent method-of-moments estimator. Building upon this work, we develop a specification test and assess its performance through extensive simulation studies.