Over the last few years, the traditional data center has been in a state of constant evolution. The arrival of cloud services and XaaS has introduced a new computing paradigm and changed how organizations maintain visibility and control over this space, which is now an extension of the business network. In this new world, security is of the utmost importance. Existing threat tools can help, but analyzing data at such a large scale to get actionable insights is very expensive. Cybersecurity demands scale, and big data analytics and machine learning are currently the top choices for success.
A community-based approach to information security is needed. Cesar Berho and Alan Ross offer an overview of the open source project Apache Spot (incubating), a next-generation cybersecurity analytics architecture that uses unsupervised machine learning at cloud scale for anomaly detection. Apache Spot is a great place for interested individuals to contribute to and help define an open data model: a standard format for enriched event data that makes it easier to integrate cross-application data, gain complete enterprise visibility, and develop net new analytic functionality. Open data models help organizations quickly share new analytics with one another as new threats are discovered. With Hadoop, organizations can run these analytics against comprehensive historic datasets and identify past threats that have slipped through the cracks. Together, these give security professionals the ability to collaborate like cybercriminals do.
Apache Spot’s approach involves several key processes to facilitate the collection, storage, processing, and presentation of telemetry sources. Current contributions are oriented to network use cases such as network flows (nfcapd), DNS (PCAP), and proxies. Apache Spot’s solutions are founded on a parallel ingest framework using Kafka; open source decoders that load data into Hadoop with Spark Streaming; machine learning that filters billions of events down to a few thousand, using unsupervised learning to find the outliers that can represent the needle in the haystack; and operational analytics. Community contribution is open and has huge potential for the creation of enhanced and additional algorithms that can pick up broader event data types, on the endpoint or based on identity; enhance correlation for incident response; enable predictive research that observes potential near-term threats at large scale; support root cause analysis, which is especially useful in forensics and threat remediation; and widen the scope of analysis beyond the traditional network architecture to SDN, security controllers, and microservices, making visible the things that are a black box today.
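To make the idea of filtering a large event stream down to a handful of outliers more concrete, here is a minimal Python sketch. It is not Apache Spot's algorithm: it swaps in a much simpler technique, ranking events by how rarely their pattern occurs in the data, with no labels required. The `FlowEvent` record, its fields, and the `rarest_events` helper are hypothetical names introduced only for illustration.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class FlowEvent(NamedTuple):
    # Hypothetical, simplified flow record; real flow telemetry has many more fields.
    src_ip: str
    dst_port: int
    byte_bucket: str  # coarse size bin such as "small" or "large"


def rarest_events(events: Iterable[FlowEvent], keep: int) -> List[Tuple[FlowEvent, float]]:
    """Return the `keep` events whose (dst_port, byte_bucket) pattern is least common.

    The score is the empirical probability of the pattern in this batch,
    so lower scores mean more unusual events. No labels are needed.
    """
    batch = list(events)
    if not batch:
        return []
    pattern_counts = Counter((e.dst_port, e.byte_bucket) for e in batch)
    total = len(batch)
    scored = [
        (e, pattern_counts[(e.dst_port, e.byte_bucket)] / total) for e in batch
    ]
    # Lowest empirical probability first: these are the candidate outliers.
    scored.sort(key=lambda pair: pair[1])
    return scored[:keep]


if __name__ == "__main__":
    # 99 routine HTTPS flows plus one flow to an unusual port.
    sample = [FlowEvent("10.0.0.%d" % (i % 20), 443, "small") for i in range(99)]
    sample.append(FlowEvent("10.0.0.7", 6667, "large"))
    for event, score in rarest_events(sample, keep=1):
        print(event, score)
```

A production system would score events with a richer probabilistic model and run the computation on a distributed engine, but the shape is the same: assign each event a likelihood under a model learned from the data itself, then hand only the least likely events to an analyst.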
Cesar Berho is a senior security researcher at Intel and a committer to the Apache Spot project. Cesar has 12 years of experience working within the cybersecurity industry in positions in operations, design, engineering, and research. Recently, he has been focusing on new ways to analyze telemetry sources with analytics and benchmarking security implementations.
Alan Ross is a senior principal engineer and chief cloud security architect at Intel. Alan has more than 20 years of information security experience in various capacities, from policy and awareness and security/risk analysis to engineering and architecture. Previously, Alan worked as a security administrator and engineer for two global companies, focusing on network, host, and application security. He has 21 US patents and many others pending relating to security and manageability of systems and networks. Alan is currently leading activities around Open Network Insight, an open source project for advanced analytics of network telemetry.
©2017, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.