Mar 15–18, 2020

Preserving privacy in behavioral analytics using semantic learning techniques in NLP

11:00am–11:40am Wednesday, March 18, 2020
Location: 210 F

Who is this presentation for?

  • Data scientists, ML engineers, and IT and enterprise security personnel

Level

Intermediate

Description

Data breaches are a matter of “when,” not “if.” Actively monitoring network activity to catch breaches early remains paramount, and it is equally important to maintain the privacy of network entities (e.g., users, devices, server names). Meanwhile, homomorphic encryption, the default answer for performing ML computation while preserving privacy, has yet to become mainstream. Ramsundar Janakiraman shares an alternate approach: building behavioral models from anonymized datasets so that personally identifiable information (PII) stays protected. In other words, if enterprise networks were theme parks, the goal would be to model the behavior of a first-time visitor versus repeat visitors without knowing any PII of the guests or the purpose of the rides, and then to catch a first timer using the compromised credentials of a frequent guest, based on how they find their way around the park.

Recently, natural language processing (NLP) has made groundbreaking progress in capturing context and semantics in language, with great impact on handling long sentences and polysemous words. Cross-domain NLP techniques have been successfully applied to enterprise security to build semantic representations of the behavior of network entities to weed out data issues. Using one of the anonymized cybersecurity datasets, you’ll see that, by mapping use cases to a careful selection of the anonymized data sources and using hints in the dataset while preparing the corpus, you can build semantic models of anonymized entities.
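
As a rough illustration of the corpus-preparation idea above, the sketch below treats each anonymized user's time-ordered accesses as a "sentence" and builds simple co-occurrence vectors for the destinations. This is a minimal stand-in, not the speaker's method: the event tuples, the entity names (`User1`, `Server3`), and the helper functions `build_corpus` and `cooccurrence_vectors` are all hypothetical, and a real pipeline would more likely train learned embeddings than count raw co-occurrences.

```python
from collections import Counter, defaultdict

# Hypothetical anonymized events: (timestamp, user, destination).
# No real identities appear; entities are already replaced by opaque labels.
EVENTS = [
    (1, "User1", "Server3"),
    (2, "User1", "Server7"),
    (3, "User2", "Server9"),
    (4, "User1", "Server3"),
    (5, "User2", "Server9"),
    (6, "User2", "Server4"),
]


def build_corpus(events):
    """Group destinations by user in time order: one 'sentence' per user."""
    sentences = defaultdict(list)
    for _, user, dest in sorted(events):
        sentences[user].append(dest)
    return dict(sentences)


def cooccurrence_vectors(corpus, window=1):
    """For each token, count the tokens seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in corpus.values():
        for i, token in enumerate(sentence):
            start = max(0, i - window)
            stop = min(len(sentence), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[token][sentence[j]] += 1
    return dict(vectors)


corpus = build_corpus(EVENTS)
vectors = cooccurrence_vectors(corpus)
```

Here `corpus["User1"]` is the access "sentence" for one anonymized user, and `vectors["Server7"]` records which destinations appear next to `Server7` in those sentences. Destinations used in similar contexts end up with similar vectors, which is the distributional intuition that embedding models build on.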

Lower-level security operations center (SOC) analysts can monitor the network using these models to get deep insights into the network interactions of anonymized entities (e.g., User1 and User10 behave the same). Such a workflow can leave the real identities to higher-level SOC personnel, who determine actions for high-priority alerts, such as locating the compromised device for remediation. Using models from various NLP tasks (e.g., sequence prediction, translation efficacy), you can use the semantic models as building blocks for multidimensional representations of access behavior and for predicting access sequences.
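
The two ideas in this paragraph, comparing anonymized entities and predicting access sequences, can be sketched with very simple stand-ins. The example below assumes hypothetical access sequences and resource names, compares users by the cosine similarity of their access-frequency profiles, and predicts the next access with a bigram count model. It is an illustrative simplification; the session itself refers to NLP-derived semantic models, not these exact techniques.

```python
import math
from collections import Counter, defaultdict

# Hypothetical access sequences keyed by anonymized user ID.
SEQUENCES = {
    "User1": ["Mail", "Wiki", "Build", "Mail", "Wiki"],
    "User10": ["Mail", "Wiki", "Build", "Wiki", "Mail"],
    "User7": ["Payroll", "HRDB", "Payroll", "HRDB"],
}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def bigram_model(sequences):
    """Count which resource follows which across all users' sequences."""
    following = defaultdict(Counter)
    for seq in sequences.values():
        for current, nxt in zip(seq, seq[1:]):
            following[current][nxt] += 1
    return following


def predict_next(model, current):
    """Return the most frequent successor of `current`, or None if unseen."""
    if current not in model:
        return None
    return model[current].most_common(1)[0][0]


# Frequency profiles ignore ordering; they only capture what each user touches.
profiles = {user: Counter(seq) for user, seq in SEQUENCES.items()}
sim_1_10 = cosine(profiles["User1"], profiles["User10"])
sim_1_7 = cosine(profiles["User1"], profiles["User7"])

model = bigram_model(SEQUENCES)
next_after_mail = predict_next(model, "Mail")
```

With this toy data, `User1` and `User10` have identical profiles while `User1` and `User7` share no resources, so an analyst could group the first pair without ever seeing a real identity. The bigram model is a deliberately small substitute for a learned sequence-prediction model.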

Prerequisite knowledge

  • A basic understanding of machine learning and networking terminologies
  • General knowledge of NLP (useful but not required)

What you'll learn

  • Learn how simple human abstraction helps in identifying actionable threats from network anomalies
  • Explore how various NLP techniques can be used to capture semantics in network interactions from anonymized datasets
  • Understand how the techniques used, while capturing semantics, can preserve the privacy of network entities

Ramsundar Janakiraman

Aruba

Ram Janakiraman is a distinguished engineer at the Aruba CTO Office working on machine intelligence for enterprise security. His recent focus has been on simplifying the building of behavior models by leveraging approaches in NLP and representation learning. He hopes to improve end user product engagement through a visual representation of entity interactions without compromising the privacy of the network entities. Ram holds numerous patents in a variety of areas from the course of his career. Previously, he worked at various startups and was a cofounding member of Niara, Inc., working on security analytics with a focus on threat detection and investigation before it was acquired by Aruba, an HPE company. He’s also an avid scuba diver, always eager to explore the next reef or kelp. He’s an FAA-certified drone pilot, capturing the beauty of dive destinations on his trips.

