Grants and Contributions:
Grant or Award spanning more than one fiscal year. (2017-2018 to 2022-2023)
BACKGROUND
From the same DNA sequence, the body is able to generate a wide range of cell types, such as brain or muscle cells. The differences between cell types are determined by which genes are active in each of them. Cells contain hundreds of thousands of regulatory regions scattered across DNA, and each gene is controlled by one or more of these regions. A gene's activity is regulated by proteins, known as transcription factors, that recognize the gene's regulatory regions and bind to specific DNA sequences at these locations. Recent methodological and technological advances in experimental procedures allow for the identification of the exact locations in DNA where transcription factors bind. However, it is still not possible to perform this type of assay for all of the more than 1,500 transcription factors.
RESEARCH PROJECT AND ANTICIPATED OUTCOME
Delineating the locations where transcription factors bind to DNA is fundamental to understanding which genes they regulate. In the past decade, many computational methods have been devised to tackle this problem. However, predicting the binding locations of transcription factors in DNA remains a challenge. This is reflected by a recent competition that brought together scientists from around the globe to identify the best-performing method for this purpose. In this project, I propose to improve the prediction of transcription factor binding locations. This will be accomplished by:
1) developing innovative models that explain how transcription factors bind to DNA based on multiple features, such as the shape of DNA, how accessible a segment of DNA is, and how well conserved the DNA sequence is across species; and
2) updating our current software and databases for the prediction of transcription factor binding locations with these new models.
The developed software and tools will be freely available online for use by the research community.
BENEFITS
Expected benefits to Canada and to the worldwide research community include:
1) Because they work through similar mechanisms across organisms, the regulatory regions that control the activity of genes and the transcription factors that recognize and bind to them are studied in species ranging from bacteria to primates. The tools and data that will be developed in this project will therefore benefit researchers working with any kind of organism;
2) With the cost of determining the sequence of DNA dropping rapidly, data related to the regulation of genes will be generated at unprecedented rates. To analyze these data, researchers will require improved algorithms and methods, such as the software and tools that will be developed in this project; and
3) Through this project, I will train highly qualified personnel in genomics and bioinformatics. The lab has an exemplary record of trainees whose successful careers have a positive impact on Canada.