Grants and Contributions:

Title:
Efficient and Effective Search over Graph-like Databases
Agreement Number:
RGPIN
Agreement Value:
$100,000.00
Agreement Date:
May 10, 2017 -
Organization:
Natural Sciences and Engineering Research Council of Canada
Location:
Ontario, CA
Reference Number:
GC-2017-Q1-02365
Agreement Type:
Grant
Report Type:
Grants and Contributions
Additional Information:

Grant or Award spanning more than one fiscal year. (2017-2018 to 2022-2023)

Recipient's Legal Name:
Kargar, Mehdi (University of Windsor)
Program:
Discovery Grants Program - Individual
Program Purpose:

Much of the world’s high-quality enterprise and social data are stored as semi-structured and structured data. This includes enterprises’ RDBMSs, knowledge graphs, and social networks. All these data collections either are defined already as graphs or can be re-modeled as graphs. Over the past decade, we have witnessed advances in storing graph-like databases, but we have not seen much progress in search over them. As Surajit Chaudhuri (a distinguished scientist at Microsoft Research) addressed in his keynote talk at ICDE in 2015, search over graph-like databases has fallen behind search over unstructured data. While scientists and business users look for exciting, actionable discoveries from their heterogeneous datasets, the need to provide effective search is profound.
In this proposed research, we focus on designing effective and efficient methods to explore graph databases. We address important problems, challenges and opportunities for improving knowledge exploration over graph-like databases. These issues arise due to the complexity, scale and massive heterogeneity of such data.
First, we tackle the problem of finding relevant answers to search over heterogeneous graphs using the keyword search paradigm. Real graphs (e.g., social networks) are heterogeneous and model various types of entities and relationships. In these graphs, each node is associated with an importance value corresponding to its semantics. Previous work ranks answers using a combination of structural and content-based metrics, and ignore the type and importance of nodes. By incorporating the importance of nodes into account, we propose efficient algorithms to find relevant answers for the given query. Second, we design new algorithms to answer distance queries (i.e., finding shortest distance between any pair of nodes) over weighted graphs based on a graph indexing method called 2-hop cover. We investigate how graph partitioning can be applied to build the index and how to efficiently update the index over a stream of graph data. Third, we investigate the problem of identifying a user’s intention when searching over knowledge graphs. Most of the current work in this area focuses only on finding answers quickly rather than finding more meaningful answers. We investigate the problem of finding a keyword’s role to improve search quality.
The results of this proposed research will be useful for Canadian and international businesses and government institutions. The proposed frameworks can be used by financial (e.g., TD Bank and stock market), healthcare, governmental institutions (e.g., Statistics Canada), and technological companies (e.g., IBM and Microsoft). Our program will train students in the databases and data mining area to place them in a strong position when applying for academic and industrial jobs. I expect up to twelve students (including undergraduate students) to be trained in this program.