Presented By O'Reilly and Cloudera
Make Data Work
March 13–14, 2017: Training
March 14–16, 2017: Tutorials & Conference
San Jose, CA

Schedule: Platform Security and Cybersecurity sessions

Explore the role of data & algorithms in improving security, and the challenges of the constant race with adversaries who try to game the algorithms.

1:30pm–5:00pm Tuesday, March 14, 2017
Location: LL21 A Level: Intermediate
Mark Donsky (Okera), André Araujo (Cloudera), Michael Yoder (Cloudera), Manish Ahluwalia (NerdWallet)
Average rating: 4.50 (2 ratings)
Mark Donsky, André Araujo, Michael Yoder, and Manish Ahluwalia walk you through securing a Hadoop cluster. You’ll start with an unsecured cluster and then add authentication, authorization, encryption of data at rest, encryption of data in transit, and complete data governance.
11:00am–11:40am Wednesday, March 15, 2017
Location: LL21 B Level: Intermediate
Secondary topics: Cloud
Ram Shankar Siva Kumar (Microsoft (Azure Security Data Science)), Andrew Wicker (Microsoft (Azure Security Data Science))
Average rating: 4.50 (4 ratings)
Ram Shankar Siva Kumar and Andrew Wicker explain how to operationalize security analytics for production in the cloud, covering a framework for assessing the impact of compliance on model design, six strategies and their trade-offs to generate labeled attack data for model evaluation, key metrics for measuring security analytics efficacy, and tips to scale anomaly detection systems in the cloud.
11:50am–12:30pm Wednesday, March 15, 2017
Location: LL21 B Level: Intermediate
Cesar Berho (Intel), Alan Ross (Intel)
Average rating: 2.00 (3 ratings)
Cesar Berho and Alan Ross offer an overview of the open source project Apache Spot (incubating), which provides a next-generation cybersecurity analytics architecture that applies unsupervised machine learning at cloud scale for anomaly detection.
1:50pm–2:30pm Wednesday, March 15, 2017
Location: LL21 B
Ting-Fang Yen (DataVisor)
Average rating: 4.33 (3 ratings)
When it comes to visibility into account takeover, spam, and fake accounts, the cloud is making things hazy. Cloud-hosted attacks skirt IP blacklists and make fraudulent users seem like they are located somewhere they are not. Drawing on data from 500 billion events and 400 million user accounts, Ting-Fang Yen examines cloud-based attack trends across verticals and regions.
2:40pm–3:20pm Wednesday, March 15, 2017
Location: LL21 B Level: Intermediate
Secondary topics: Architecture, Data Platform, Financial services, Streaming
Ajit Gaddam (VISA), Jiphun Satapathy (VISA)
Average rating: 3.83 (6 ratings)
Apache Kafka is used by over 35% of Fortune 500 companies to store and process some of their most sensitive datasets. Ajit Gaddam and Jiphun Satapathy provide a security reference architecture to secure your Kafka cluster while leveraging it to support your organization's cybersecurity requirements.
4:20pm–5:00pm Wednesday, March 15, 2017
Location: LL21 B Level: Intermediate
Yuliya Feldman (Dremio Corporation), Bill ODonnell (MapR)
Average rating: 2.50 (2 ratings)
Security will always be important in the world of big data, but today most options start with Kerberos. Does that mean setting up security is always going to be painful? What if your company standardizes on a different security mechanism? What if you want the freedom to decide which security type to support? Yuliya Feldman and Bill ODonnell discuss your options.
5:10pm–5:50pm Wednesday, March 15, 2017
Location: LL21 B
Parvez Ahammad (BlackThorn Therapeutics)
Average rating: 4.80 (5 ratings)
Research on designing and applying ML algorithms and systems for security has grown quickly in recent years as information and communications have become more ubiquitous and more data has become available. Parvez Ahammad explores generalized system designs, underlying assumptions, and use cases for applying ML in security.
11:00am–11:40am Thursday, March 16, 2017
Location: LL21 B Level: Beginner
Secondary topics: ecommerce, Media
Yinglian Xie (DataVisor)
How many of your users are really fraudsters waiting to strike? These sleeper cells exist in all online communities. Using data from more than 400 million users and 500 billion events from online services across the world, Yinglian Xie explores sleeper cells, explains the sophisticated attack techniques used to evade detection, and shows how Spark's in-memory big data security analytics can help.
2:40pm–3:20pm Thursday, March 16, 2017
Location: LL21 C/D Level: Advanced
Secondary topics: Hardcore Data Science
Alexander Ulanov (Hewlett Packard Labs), Manish Marwah (Hewlett Packard Labs)
Alexander Ulanov and Manish Marwah explain how they implemented a scalable version of loopy belief propagation (BP) for Apache Spark and applied it to large web-crawl data to infer the probability that a website is malicious. Applications of BP include fraud detection, malware detection, computer vision, and customer retention.