Exploratory data analytics is often an interactive and iterative process: a data scientist forms an initial hypothesis (e.g., by visualizing the data), consults the data, adjusts her hypothesis accordingly, and repeats this process until she finds a satisfactory answer. However, inefficiencies across the software and hardware stack (network shuffles, serialization overheads, graphics rendering times, etc.) make subsecond latencies rarely achievable, regardless of whether the data fits in the memory of a single laptop or is stored across a massive cluster. Slow and costly interactions with data can severely inhibit a data scientist’s productivity, engagement, and even creativity.
Barzan Mozafari offers an overview of Verdict, an open source middleware that uses statistical techniques to ensure interactive response times for a wide class of visualization tasks and SQL analytics. The two salient features of Verdict are its universal compatibility and its ease of use. Verdict can be used with virtually any BI tool or SQL engine, from Impala, Hive, and Spark SQL to Tableau and Apache Hue. Likewise, Verdict requires no statistical background from end users or platform engineers. Barzan shares a few use cases to illustrate the power and simplicity of Verdict.
Barzan Mozafari is an assistant professor of computer science and engineering at the University of Michigan, Ann Arbor, where he leads a research group designing the next generation of scalable databases using advanced statistical models. Previously, Barzan was a postdoctoral associate at MIT. His research has led to several successful open source projects, including CliffGuard (the first robust framework for database tuning), DBSeer (the first automated database diagnosis tool), and BlinkDB (the first massively parallel approximate query engine). Barzan has won the National Science Foundation CAREER award as well as several best paper awards at ACM SIGMOD and EuroSys. He is also a cofounder of DBSeer and a strategic advisor to SnappyData, a company that commercializes the ideas introduced by BlinkDB. Barzan holds a PhD in computer science from UCLA.