Presented By
O’Reilly + Intel AI
Put AI to Work
April 15-18, 2019
New York, NY

Adversarial machine learning in digital forensics

Alina Matyukhina (Canadian Institute for Cybersecurity)
2:40pm–3:20pm Thursday, April 18, 2019
Case Studies, Machine Learning
Location: Sutton South
Secondary topics: AI case studies; Computer Vision; Ethics, Privacy, and Security; Models and Methods; Reinforcement Learning

Who is this presentation for?

machine learning engineers, open-source developers, adversarial machine learning researchers, security practitioners



Prerequisite knowledge

Some programming

What you'll learn

1. Learn about adversarial machine learning, digital forensics, and software authorship attribution techniques
2. Find out how an attacker can mimic the coding style of a software developer in open-source projects using machine learning techniques
3. Find out how to protect yourself in open-source software projects


Digital forensics becomes critical when questions arise about the authors of documents: their identity, their characteristics (such as age and gender), and whether they can be associated with unknown documents.

Machine learning approaches to source code authorship identification attempt to determine the most likely author of a piece of code by analyzing stylistic characteristics of the source code. There are many situations in which police or security agencies are concerned with the ownership of software, for example, to identify who wrote a malicious piece of code.
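The feature-based attribution described above can be sketched roughly as follows. This is a hypothetical toy example, not the system presented in the talk: the features (line length, indentation, snake_case vs. camelCase counts), the synthetic two-author corpus, and the choice of a random forest classifier are all illustrative assumptions.

```python
# Toy sketch of source code authorship attribution: extract a few
# layout/lexical style features, then train a classifier.
# All features, data, and model choices here are illustrative.
import re
from sklearn.ensemble import RandomForestClassifier

def style_features(code: str) -> list:
    """Compute simple stylometric features for one code snippet."""
    lines = code.splitlines() or [""]
    avg_len = sum(len(l) for l in lines) / len(lines)          # layout
    tab_ratio = sum(l.startswith("\t") for l in lines) / len(lines)
    snake = len(re.findall(r"\b[a-z]+_[a-z]+\b", code))        # naming
    camel = len(re.findall(r"\b[a-z]+[A-Z][a-zA-Z]*\b", code))
    return [avg_len, tab_ratio, snake, camel]

# Tiny synthetic corpus: two "authors" with distinct conventions.
samples = [
    ("def read_file(file_name):\n    total_count = 0\n", "alice"),
    ("def parse_line(line_text):\n    word_count = 1\n", "alice"),
    ("def readFile(fileName):\n\ttotalCount = 0\n", "bob"),
    ("def parseLine(lineText):\n\twordCount = 1\n", "bob"),
]
X = [style_features(code) for code, _ in samples]
y = [author for _, author in samples]

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# An "unknown" snippet written in alice's snake_case, space-indented style.
unknown = "def open_file(path_name):\n    line_count = 0\n"
print(clf.predict([style_features(unknown)])[0])  # → alice
```

Real attribution systems use far richer feature sets (lexical, layout, and syntactic features, often derived from the abstract syntax tree), but the pipeline shape is the same: featurize, train, predict.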

However, machine learning models are often susceptible to adversarial manipulation of their input at test time, which leads to poorer performance. Recent studies in adversarial machine learning have shown that adversarial examples can easily fool image classification, speech recognition, and reinforcement learning systems.

In this session we will investigate the feasibility of deceiving source code attribution techniques in a real-world environment that contains adversaries and dishonest users.

We will show that even a simple transformation of an author's coding style can successfully degrade the performance of source code authorship attribution systems. An important part of this session covers practical attacks on current attribution systems: author imitation and author hiding.

The author imitation attack targets user identity in open-source projects: it transforms the attacker's source code into a version that mimics the victim's coding style while retaining the functionality of the original code. This is particularly concerning for open-source contributors who are unaware that by contributing to open-source projects they reveal identifiable information that can be used to their disadvantage; by imitating someone's coding style, it is possible to implicate any software developer in wrongdoing. To resist this attack, we discuss multiple approaches to hiding a software author's coding style before contributing to open source.
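The imitation attack described above can be illustrated with a minimal style rewrite. This sketch is an assumption for illustration only, not the attack from the talk: it rewrites a single surface trait (snake_case identifiers into the "victim's" camelCase convention) while leaving behavior unchanged; a real attack would transform many more layout, lexical, and syntactic features.

```python
# Hypothetical sketch of "author imitation": rewrite one stylistic
# trait (identifier naming) toward a victim's convention while
# preserving the code's behavior. Illustrative only.
import re

def to_camel(name: str) -> str:
    """snake_case -> camelCase, e.g. 'read_file' -> 'readFile'."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)

def imitate_naming(code: str) -> str:
    """Convert every snake_case identifier to camelCase."""
    return re.sub(r"\b[a-z]+(?:_[a-z]+)+\b",
                  lambda m: to_camel(m.group(0)), code)

attacker_code = (
    "def read_file(file_name):\n"
    "    total_count = 0\n"
    "    return total_count\n"
)
print(imitate_naming(attacker_code))
# def readFile(fileName):
#     totalCount = 0
#     return totalCount
```

Because transformations like this preserve functionality, an attribution system trained on such surface features can be steered toward blaming the imitated developer, which is exactly the risk the session highlights.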


Alina Matyukhina

Canadian Institute for Cybersecurity

Alina Matyukhina is a cybersecurity researcher and third-year PhD candidate at the Canadian Institute for Cybersecurity (CIC). Her research focuses on applying machine learning, computational intelligence, and data analysis techniques to design innovative security solutions. Before joining CIC, she worked as a research assistant at the Swiss Federal Institute of Technology, where she took part in cryptography and security research projects. She holds both a B.S. and an M.S. in math and IT. Alina is a member of the Association for Computing Machinery and the IEEE Computer Society. She has presented her research at several security and software engineering conferences, including HackFest, IdentityNorth, ISACA Security & Risk, Droidcon SF, and PyCon Canada.
