Structured Streaming is a new effort in Apache Spark to make stream processing simple without the need to learn a new programming paradigm or system. Ram Sriharsha offers an overview of Structured Streaming, discussing its support for event-time, out-of-order/delayed data, sessionization, and integration with the batch data stack to show how it simplifies building powerful continuous applications.
Ram Sriharsha is the product manager for Apache Spark at Databricks and an Apache Spark committer and PMC member. Previously, Ram was architect of Spark and data science at Hortonworks and principal research scientist at Yahoo Labs, where he worked on scalable machine learning and data science. He holds a PhD in theoretical physics from the University of Maryland and a BTech in electronics from the Indian Institute of Technology, Madras.
Comments on this page are now closed.
©2016, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • firstname.lastname@example.org
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.