Quantitative Queries over Streaming Data


Konstantinos Mamouras

Event Details
Tuesday, April 4, 2017
Talk:
4-5 p.m., Avery 115

Reception:
3:30 p.m., Avery 348

Konstantinos Mamouras, Ph.D.

Postdoctoral Researcher, University of Pennsylvania

Abstract

Real-time decision making in emerging IoT applications typically relies on computing quantitative summaries of large data streams in an efficient and incremental manner. In this talk, I will present the StreamQRE language, which simplifies the task of programming the desired logic in such applications. StreamQRE provides natural and high-level operations for processing streaming data, and has a novel integration of linguistic constructs from two distinct programming paradigms: streaming extensions of relational query languages and quantitative extensions of regular expressions. The former allows the programmer to employ relational constructs to partition the input data by keys and to integrate data streams from different sources, while the latter can be used to exploit the logical hierarchy in the input stream for modular specifications. First, I will present the core StreamQRE language, which consists of a small set of combinators that can express a number of common streaming transformations, such as filtering, mapping, and windowing. A compilation algorithm translates the high-level query into a streaming algorithm with precise complexity bounds on per-item processing time and total memory footprint. It is also possible to integrate approximation algorithms in the StreamQRE framework. Finally, I will discuss an implementation of StreamQRE in Java, and its evaluation with respect to existing high-performance engines for processing streaming data. The experimental evaluation shows that: (1) StreamQRE allows natural and succinct specification of queries, (2) the throughput of the StreamQRE engine is higher than that of comparable systems, and (3) the supported approximation algorithms can lead to substantial memory savings.

Speaker Bio

Konstantinos Mamouras is currently a postdoctoral researcher at the University of Pennsylvania. He obtained his Ph.D. in Computer Science from Cornell University in 2015. His research focuses on language design aspects of data stream processing, and he has also worked in the areas of program semantics and equational theories of programs.