Nowadays, all kinds of businesses need to deal with real-time information in order to successfully deliver their core services. From social networks to security management, from real users to virtual processes, from high-level dashboards to IT monitoring… more and more sectors, consumers, and business processes require quick answers updated in real time.
Some initiatives have tried to solve this problem, but until now most of them were complex or obsolete, while others were not open source. For that reason, Stratio created SPARKTA: an open source, full-featured platform for real-time analytics, based on Apache Spark.
Without writing any code, you can deploy several user-defined aggregation workflows simultaneously and decide which rollups and dimensions are applied to the event stream in real time. Each workflow has its own aggregation policy, in which you select the input (Kafka, Flume, Twitter, etc.), the output (MongoDB, Cassandra, etc.), the event parser functions (decoding, enrichment, normalization), and the aggregation functions (time-based, geo-range, hierarchical counting, sum, max, min, count, sumsquares, etc.) that SPARKTA will execute.
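To make the idea of rollups and dimensions concrete, here is a minimal sketch in plain Python. It is not SPARKTA's policy format or API: the `POLICY` dictionary, the `aggregate` function, and the event fields (`ts`, `country`, `device`, `value`) are all hypothetical names chosen for illustration. It assumes each event carries a timestamp in seconds and a numeric value, and it keeps count, sum, min, max, and sumsquares per rollup and time bucket.

```python
from collections import defaultdict

# Hypothetical policy (not SPARKTA's real format): which dimension
# combinations (rollups) to maintain and the time bucket size in seconds.
POLICY = {
    "rollups": [("country",), ("country", "device")],
    "granularity_s": 60,
}


def new_stats():
    """Empty accumulator for one aggregation cell."""
    return {
        "count": 0,
        "sum": 0.0,
        "min": float("inf"),
        "max": float("-inf"),
        "sumsquares": 0.0,
    }


def aggregate(events, policy):
    """Fold events into one stats record per (rollup, bucket, dimension values)."""
    table = defaultdict(new_stats)
    step = policy["granularity_s"]
    for event in events:
        # Start of the time bucket this event falls into.
        bucket = event["ts"] - event["ts"] % step
        for rollup in policy["rollups"]:
            key = (rollup, bucket, tuple(event[d] for d in rollup))
            stats = table[key]
            value = event["value"]
            stats["count"] += 1
            stats["sum"] += value
            stats["min"] = min(stats["min"], value)
            stats["max"] = max(stats["max"], value)
            stats["sumsquares"] += value * value
    return dict(table)


# Illustrative events; the field names are assumptions.
events = [
    {"ts": 5, "country": "ES", "device": "mobile", "value": 2.0},
    {"ts": 42, "country": "ES", "device": "desktop", "value": 3.0},
    {"ts": 61, "country": "ES", "device": "mobile", "value": 4.0},
]
result = aggregate(events, POLICY)
```

In this sketch every event updates one cell per configured rollup, so adding a rollup to the policy changes what is stored without changing the aggregation code. A real deployment would perform the same fold over a distributed stream rather than an in-memory list.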
Moreover, the query services layer allows you to access the data easily, e.g., through time-range queries with automatic selection of the best rollup, or ad-hoc aggregation over a subset of the data.
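One plausible reading of "automatic selection of the best rollup" is sketched below in plain Python. This is an assumption for illustration, not SPARKTA's documented behaviour: the function `best_granularity` is hypothetical, and it assumes rollups are stored at fixed bucket sizes (in seconds) aligned to zero. It picks the coarsest stored granularity whose buckets exactly tile the requested range, so the query reads the fewest precomputed cells.

```python
def best_granularity(start, end, granularities):
    """Return the coarsest granularity whose buckets exactly tile [start, end).

    start, end: range bounds in seconds (end exclusive).
    granularities: bucket sizes in seconds for which rollups are stored.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    # A granularity is usable only if both bounds fall on bucket boundaries.
    candidates = [g for g in granularities if start % g == 0 and end % g == 0]
    if not candidates:
        raise ValueError("no stored rollup aligns with this range")
    return max(candidates)


# Hypothetical stored rollups: per minute, per hour, per day.
STORED = [60, 3600, 86400]
two_hours = best_granularity(0, 7200, STORED)
one_minute = best_granularity(3600, 3660, STORED)
two_days = best_granularity(0, 2 * 86400, STORED)
```

Here a two-hour range resolves to the hourly rollup, a one-minute range to the per-minute rollup, and a two-day range to the daily rollup. A range that does not align with any stored bucket size raises an error in this sketch; a fuller version could instead combine coarse buckets in the middle with finer ones at the edges.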
SPARKTA was also designed to be highly configurable and extensible, and since it is pure Spark, it also benefits from the entire Spark ecosystem.
Thanks to this technology, real-time analysis is readily available for every use case: SPARKTA is easy to deploy, but also fast, scalable, and fault-tolerant.
Oscar Méndez is co-founder and CEO of Paradigma Tecnológico and Stratio. Paradigma is a software solutions company with clients in Spain, mostly enterprises and large Internet companies. Stratio uses best-of-breed big data technologies to cater for clients worldwide.
Working as a big data architect at Stratio, David Morales has been involved in the inception and evolution of some modules included in the Stratio platform, especially those related to data visualization, real-time, streaming, and complex event processing.
©2015, O’Reilly UK Ltd • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.