Grants and Contributions:

Title:
Optimizing compute task scheduling at Shopify
Agreement Number:
EGP
Agreement Value:
$25,000.00
Agreement Date:
Aug 23, 2017 -
Organization:
Natural Sciences and Engineering Research Council of Canada
Location:
British Columbia, CA
Reference Number:
GC-2017-Q2-00339
Agreement Type:
Grant
Report Type:
Grants and Contributions
Additional Information:

Grant or Award spanning more than one fiscal year (2017-2018 to 2018-2019).

Recipient's Legal Name:
Beschastnikh, Ivan (The University of British Columbia)
Program:
Engage Grants for Universities
Program Purpose:

Big Data processing frameworks allow enterprises to drive their business based on key insights extracted from
near real-time data. Apache Spark is a framework that Shopify leverages to collect and analyze customer
information and expose data insights to the shop owners who are users of the Shopify platform. For example,
an enterprise that built its online store using the Shopify platform can send pre-set queries to the platform to
learn about the impact of its recent marketing campaign (e.g., with improved Google ad keywords) by
geographical region or by age group, and see if the marketing campaign resulted in actual product sales.
Shopify uses Spark to analyze data by passing the data through a processing pipeline. The pipeline consists of
multiple jobs, each applying some computation to the data. A set of connected jobs forms a flow, which might
correspond to the query exposed to the shop owner. In this project we will explore algorithmic solutions to
maintain a useful flow execution frequency while absorbing the variance caused by job failures and ensuring
co-execution of pre-scheduled and ad-hoc jobs without resource starvation. We intend to apply approaches
based on heuristics and constraint satisfaction, study their tradeoffs, and develop a solution that works best in
practice.