Grants and Contributions:

Title:

Statistical Methods for High-Dimensional Administrative Data

Agreement Number:

RGPIN

Agreement Value:

$150,000.00

Agreement Date:

May 10, 2017 -

Organization:

Natural Sciences and Engineering Research Council of Canada

Location:

Quebec, CA

Reference Number:

GC-2017-Q1-01931

Agreement Type:

Grant

Report Type:

Grants and Contributions

Additional Information:

Grant or Award spanning more than one fiscal year. (2017-2018 to 2022-2023)

Recipient's Legal Name:

Platt, Robert (McGill University)

Program:

Discovery Grants Program - Individual

Program Purpose:

My research has two main foci. First, I develop and evaluate statistical models and methods with potential applications in the analysis of administrative databases. Second, I develop methods for meta-analysis.

The critical challenge in estimating causal effects from observational data is confounding. Confounding is a particular challenge when using administrative data, because the researcher does not have control over variables included in the dataset; there is no guarantee that all necessary confounders have been measured.

Targeted learning (TL) is a framework developed by van der Laan to estimate causal effects; it focuses inference on a target parameter and tailors the estimation process for efficient estimation of that parameter. TL is computationally intensive, and its properties in high-dimensional data are not well understood.

Cumulative meta-analysis aggregates information from studies in chronological order, summarizing knowledge after each study. We have developed frequentist and Bayesian stopping rules which use information from the existing studies to determine the expected value of information from a new study.
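As an illustration of the aggregation step only, the following is a minimal sketch of a cumulative fixed-effect meta-analysis using inverse-variance weights. It does not implement the frequentist or Bayesian stopping rules described above. The function name `cumulative_fixed_effect` and the example effect estimates and variances are hypothetical and chosen purely for illustration.

```python
import math


def cumulative_fixed_effect(estimates, variances):
    """Pool studies in the order given (assumed chronological).

    Returns a list with one (pooled_estimate, standard_error) pair
    per study, summarizing the evidence available after that study.
    Uses the standard inverse-variance fixed-effect estimator.
    """
    results = []
    weight_sum = 0.0      # running sum of weights 1 / variance
    weighted_sum = 0.0    # running sum of weight * estimate
    for est, var in zip(estimates, variances):
        weight = 1.0 / var
        weight_sum += weight
        weighted_sum += weight * est
        pooled = weighted_sum / weight_sum
        se = math.sqrt(1.0 / weight_sum)
        results.append((pooled, se))
    return results


if __name__ == "__main__":
    # Hypothetical log odds ratios and their variances, oldest study first.
    estimates = [0.40, 0.10, 0.25]
    variances = [0.04, 0.01, 0.02]
    for k, (pooled, se) in enumerate(
        cumulative_fixed_effect(estimates, variances), start=1
    ):
        print(f"after study {k}: pooled = {pooled:.3f}, SE = {se:.3f}")
```

A stopping rule would inspect the pooled estimate and its standard error after each study to judge whether a further study is likely to add enough information; that decision step is deliberately left out of this sketch.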

Accelerated failure time models for survival data have appealing properties for causal inference; however, much work remains to determine their properties.
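For reference, a standard textbook form of the AFT model is sketched below. The notation is an assumption introduced here and is not taken from the proposal: $T_i$ is the survival time, $A_i$ the exposure, $X_i$ a vector of covariates, and $\varepsilon_i$ an error term with a specified distribution.

```latex
\log T_i = \beta_0 + \beta_1 A_i + \gamma^{\top} X_i + \sigma \varepsilon_i
```

Under this form, $\exp(\beta_1)$ is a time ratio: the factor by which a one-unit change in exposure multiplies survival time, holding covariates fixed. This direct interpretation on the time scale is one reason AFT models are considered appealing for causal questions.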

The objectives of my NSERC-funded research over the next five years are:
1. To develop and extend methods for targeted learning for use with large administrative datasets.
2. To extend stopping-rule methods for cumulative meta-analysis.
3. To extend accelerated failure time (AFT) models for causal inference in survival analysis.

My focus with TL will be on developing computationally efficient algorithms that extend the targeted learning framework. Administrative datasets often include many covariates with weak associations with exposure and outcome, so summaries of these covariates may be useful. I will evaluate these methods primarily via theoretical development and simulation.

I will extend our meta-analytic methods. Existing methods use the fixed-effects likelihood. It is relatively straightforward to extend these methods to account for observational studies (which involves addressing confounding control across studies) and for multiple studies. The resulting likelihood can be modified using existing methods for random effects. I will evaluate these methods through theoretical development, example datasets, and simulation.

Finally, AFT models are relatively under-utilized but highly interpretable methods for survival outcomes. I am working on extensions of AFT models to dynamic treatment regimes and to flexible functional forms.

These results will have a significant impact on research. Development and careful, rigorous evaluation of statistical tools, such as those proposed here, are essential to the conduct of applied research.