Grants and Contributions:
Grant or Award spanning more than one fiscal year. (2017-2018 to 2022-2023)
Response-Dependent Two-Phase Designs for Dynamic Processes
I will develop innovative ways of selecting individuals for biomarker sub-studies in large cohorts, with the aim of examining the relationship between genetic and serum biomarkers and longitudinal responses, which may involve repeated measures and time-to-event data. Phase I samples involve the collection of longitudinal or life history data along with specimens stored in biobanks. Phase II samples consist of the individuals whose stored specimens are assayed and whose biomarker data can then be used in analyses. Algorithms for the optimal selection of Phase II samples will be developed based on large-sample theory, with different selection models arising depending on the framework for the longitudinal analysis (i.e., marginal analyses, hierarchical mixed-effects models, transition models). Methods for the secondary analysis of data from case-cohort designs will also be developed, using likelihood and weighted estimating functions to correct for biased sampling. The case-cohort and nested case-control designs will be generalized to deal with more complex multi-state disease processes.
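As a minimal illustration of why response-dependent Phase II sampling calls for weighting, the Python sketch below simulates a cohort, selects a Phase II sample with probabilities that depend on a binary Phase I response, and compares an unweighted biomarker mean with an inverse-probability-weighted one. It is not the methodology proposed here: the function names, the 10% case prevalence, the biomarker distributions, and the selection probabilities (1.0 for cases, 0.2 for controls) are all assumptions chosen for illustration.

```python
import random


def simulate_cohort(n=20000, seed=1):
    """Phase I (hypothetical): a binary response and a biomarker per person.

    In a real study the biomarker would be unknown until the stored
    specimen is assayed; it is simulated for everyone here only so the
    full-cohort mean is available as a benchmark.
    """
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        is_case = rng.random() < 0.10  # assumed prevalence
        biomarker = rng.gauss(1.0 if is_case else 0.0, 1.0)  # assumed model
        cohort.append((is_case, biomarker))
    return cohort


def select_phase_two(cohort, p_case=1.0, p_control=0.2, seed=2):
    """Phase II: select individuals with response-dependent probabilities.

    Returns (biomarker, weight) pairs, where weight = 1 / selection
    probability. The probabilities are assumed known by design.
    """
    rng = random.Random(seed)
    sample = []
    for is_case, biomarker in cohort:
        p = p_case if is_case else p_control
        if rng.random() < p:
            sample.append((biomarker, 1.0 / p))
    return sample


def naive_mean(sample):
    """Unweighted mean; biased when selection depends on the response."""
    return sum(x for x, _ in sample) / len(sample)


def ipw_mean(sample):
    """Inverse-probability-weighted (Hajek-type) mean."""
    total_weight = sum(w for _, w in sample)
    return sum(w * x for x, w in sample) / total_weight


if __name__ == "__main__":
    cohort = simulate_cohort()
    phase_two = select_phase_two(cohort)
    full = sum(x for _, x in cohort) / len(cohort)
    print("full-cohort mean:", round(full, 3))
    print("naive Phase II mean:", round(naive_mean(phase_two), 3))
    print("weighted Phase II mean:", round(ipw_mean(phase_two), 3))
```

Because cases are oversampled in this sketch, the unweighted mean is pulled toward the case distribution, while the weighted mean targets the full-cohort value. A weighted estimating function for a regression parameter applies the same weights to each selected individual's score contribution.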
Efficient Design of Tracing Studies for Cohorts with Attrition
I will develop frameworks for the efficient design of tracing studies geared toward the collection of auxiliary data on the disease and withdrawal processes in cohort studies. These data make it possible to correct for biases that arise when data are missing not at random (MNAR), and they enhance efficiency. Frameworks for analysis include illness-death models, where interest lies in the development of co-morbidities or disease complications, progressive models for degenerative conditions, and survival models. Use of auxiliary covariate data and inverse probability weights will ensure valid inference. If individuals do not consent to be traced, then additional weights for non-response will be incorporated. Moreover, traced individuals may provide only partial or surrogate responses, in which case methods accommodating incomplete or mismeasured responses will need to be developed.
Design and Analysis of Family Studies with Auxiliary Data
Genetic, phenotypic (e.g., age of onset) and auxiliary data can be exploited to develop efficient selection models for choosing probands from disease registries for inclusion in family studies. I will develop sample selection models which incorporate such information, as well as any surrogate summary data on family members reported by the proband. The biased sampling scheme employed in family studies means there is little information on disease onset times. I will explore the utility of auxiliary data in the development of score tests of genetic associations. While work has been carried out on the analysis of family data for binary and time-to-event outcomes, little has been done on modeling within-family dependencies in more complex multistage disease processes; I will focus on this.