Grants and Contributions:
Grant or Award spanning more than one fiscal year. (2017-2018 to 2022-2023)
Large-scale and complex data collected in modern science and technology pose tremendous challenges to traditional statistical methods and theory. Functional data analysis (FDA) has emerged as a promising field that treats random functions as the units of analysis and is designed to model data distributed over continua such as time, space and wavelength. Such data may be viewed as realizations of random processes and are commonly found in many fields, e.g., longitudinal studies, microarray experiments, medical imaging, internet commerce and financial markets. A fundamental issue in modelling functional data arises from the “curse of dimensionality”, as functional objects are conceptually framed as infinite-dimensional processes. Greater challenges still arise for functional data that are collected on a large scale and with complex structures, which relate naturally to recently transformed fields such as high-dimensional data analysis and compressed sensing. The long-term objective of this research program is therefore to
(L1) develop a wide range of flexible yet practical methods to accommodate large-scale and complex functional data, including massive sampling schemes, an ultrahigh number of functions and manifold structures;
(L2) establish foundational frameworks, suitable regularization and inferential methods tailored for such functional models, combining the strengths of high-dimensional techniques with functional approaches.
Specifically, the proposed research focuses on the following two themes:
(1) Gaussian Sequences to Functional Data: Equivalence, Recovery and Inference
(2) Functional Regression: Estimation and Inference on Large Scale and Manifolds
In the first theme, I aim to establish a rigorous connection between multiple Gaussian sequences and functional data through Le Cam's equivalence theory. This approach provides a simplified but foundational framework, much as a single Gaussian sequence was related to nonparametric regression when wavelet shrinkage was introduced. It is expected to lead to a series of novel developments, e.g., optimal recovery and adaptive inference on functional objects, with rigorous theoretical guarantees and substantial computational gains, particularly for massive sampling schemes.
The second theme concerns regression involving functional objects. I will expand the scope of functional regression to embrace large-scale scenarios with an ultrahigh number of functional predictors, and develop effective regularized estimation and rigorous functional inference. A further investigation will address adaptive modelling of functional regression with random manifold embeddings. This theme is expected to lay the groundwork for a variety of large-scale and complex functional regression models, and to have a broad impact in both FDA and complex/high-dimensional fields.