Presented By O’Reilly and Intel AI
Put AI to work
8-9 Oct 2018: Training
9-11 Oct 2018: Tutorials & Conference
London, UK

How machines learn to code: Machine learning on source code

Thomas Endres (TNG Technology Consulting), Samuel Hopstock (TNG Technology Consulting)
13:45–14:25 Thursday, 11 October 2018
Implementing AI
Location: King's Suite - Sandringham

Who is this presentation for?

  • Developers and data scientists

Prerequisite knowledge

  • A basic understanding of machine learning techniques
  • Familiarity with the basic principles of domains such as NLP and image recognition

What you'll learn

  • Explore models for representing code as feature vectors, ways to encode information about code quality and other metrics as output vectors, machine learning techniques suitable for this problem, and methods for generating input and output data (natural and artificial)
  • Learn how to integrate this new approach into IDEs


Machine learning on source code is a new area of research in artificial intelligence that, unlike classical problems such as image segmentation, does not yet have established standard techniques. For instance, there are standard methods for processing images that make machine learning algorithms take the two-dimensional structure of the input into account. However, there are currently no common techniques for encoding the semantic structure of source code, so you need new ways to represent the code of a project mathematically. This technology offers a variety of possible applications, for example in static code analysis or in the automatic selection of relevant test cases.

Thomas Endres and Samuel Hopstock share methods for transferring classic machine learning approaches to this new field of expertise. Along the way, Thomas and Samuel detail approaches for both automatic and manual training data generation and offer an overview of suitable models and machine learning frameworks for this challenge. They conclude by exploring the possibilities of using such models for the analysis of code.
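As a rough illustration of what representing code as a feature vector can mean, the sketch below counts selected syntax-node types in a Python snippet using the standard library's ast module. It is not the speakers' method: the function name code_to_feature_vector and the NODE_TYPES vocabulary are assumptions made for this example, and a real model would likely need a far richer encoding of the code's semantic structure.

```python
import ast
from collections import Counter
from typing import List

# Hypothetical, fixed vocabulary of syntax-node types chosen for illustration.
NODE_TYPES = ["FunctionDef", "If", "For", "While", "Return", "Call"]


def code_to_feature_vector(source: str) -> List[int]:
    """Count how often each node type in NODE_TYPES occurs in the source."""
    tree = ast.parse(source)
    # ast.walk visits every node in the syntax tree, in no guaranteed order.
    counts = Counter(type(node).__name__ for node in ast.walk(tree))
    return [counts[name] for name in NODE_TYPES]


example = """
def total(values):
    result = 0
    for v in values:
        if v > 0:
            result += v
    return result
"""

vector = code_to_feature_vector(example)
print(dict(zip(NODE_TYPES, vector)))
```

A vector like this discards ordering and nesting, which is exactly the kind of structural information the abstract says still lacks a common encoding.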

Thomas Endres

TNG Technology Consulting

Thomas Endres is an IT consultant at TNG Technology Consulting in Munich. Besides his regular work for TNG’s customers, he creates prototypes with the company’s hardware hacking team, such as a see-through augmented reality device and a telepresence robotics system. In his spare time, he works on gesture control applications, such as those for controlling quadrocopters with bare hands. He’s also involved in open source projects written in Java, C#, and various flavors of JavaScript. In addition, he’s a lecturer at the University of Applied Sciences in Landshut. Thomas is passionate about software development and all other aspects of technology. As an Intel Software Innovator and Black Belt, he promotes new technologies like gesture control, AR/VR, and robotics around the world. He recently received a JavaOne Rockstar award. He studied IT at TU Munich.

Samuel Hopstock

TNG Technology Consulting

Samuel Hopstock is working toward his bachelor’s degree in computer science at the Technical University of Munich. He’s also a working student at TNG Technology Consulting in Unterföhring, where he currently develops machine learning software in Python and Java. He is interested in new technological developments, especially those involving Android.