Grants and Contributions:

Title:
Probabilistic methods in computer science and machine learning
Agreement Number:
RGPIN
Agreement Value:
$355,000.00
Agreement Date:
May 10, 2017 -
Organization:
Natural Sciences and Engineering Research Council of Canada
Location:
Quebec, CA
Reference Number:
GC-2017-Q1-01489
Agreement Type:
Grant
Report Type:
Grants and Contributions
Additional Information:

Grant or Award spanning more than one fiscal year. (2017-2018 to 2022-2023)

Recipient's Legal Name:
Devroye, Luc (McGill University)
Program:
Discovery Grants Program - Individual
Program Purpose:

We seek to further the understanding of probabilistic phenomena that occur in the design and behavior of algorithms, data structures, networks, graphs and pattern recognition (machine learning) methods. The long-term plan has four axes of research:

(1) The in-depth analysis of random trees and random structures that are practically relevant. We will in particular focus on families of randomized trees that act on arbitrary non-random high-dimensional data, on random tries (in the context of the bit model of complexity), and on hyperplane search trees.

(2) Networks with geometric, connectivity, directional, or other restrictions appear in many contexts. We propose to study broad classes of them and pay particular attention to connectivity thresholds, emergence of giant components, diameter, and structural properties in general. Included are Kademlia and peer-to-peer networks, random Erdos-Renyi graphs with external high-dimensional parameters, and various brands of random geometric graphs.

(3) The continued effort to design universally consistent but also computationally efficient classifiers, with particular attention given to tree classifiers. Random forest classifiers are of special interest in view of their simplicity. As part of this effort, we will try to understand, theoretically, why deep learning networks are often successful, and investigate the relationship between the accuracy of these learning networks and their complexity beyond classical measures such as the Vapnik- Chervonenkis dimension.

(4) The development of new paradigms for random variate generation and the exact simulation of objects or processes that hitherto could either not be generated in an exact manner, or could at best be generated inefficiently. In addition, we will develop Shannon style information-theoretic lower bounds, and hopefully matching upper bounds, for the expected number of random fair bits needed to generate random variables with a desired accuracy.