A presentation and discussion on building compilers and optimizers for data transformations or domain-specific languages. After this presentation, any programmer should be able to write a basic compiler, and people with prior knowledge will be able to write more advanced optimizing compilers.
Introduction
Compilation, loosely described, is the automated transformation of one piece of data into another; and the optimizer, the frontend’s partner in crime, is a key decision-maker in these transformations.
A good compiler-optimizer pair can solve a map-reduce scheduling problem, a timetabling problem, or even lay out the desks in an office; and the compiler-optimizer itself can be written in under 500 lines of code. (Compiling C to binary takes about 5,000.)
Yet we treat these techniques as arcane and shy away from them in everyday code; we write serializers and greedy algorithms by hand, often with poor performance, inconsistent results or badly understood corner cases.
We can fix that!
This is a huge topic, and the presentation is based on a wealth of experience of writing compilers, in different languages, for different platforms, with very different underlying behaviours. There will be considerable opportunity for the audience to guide the talk through a selection of available sub-topics.
Optimization
Most of the really interesting problems are in the optimizer, which is probably why it isn’t often discussed or taught. A good optimizer can ensure the success of a business, can fit 20 more customers onto a piece of hardware, or can make your program run an order of magnitude faster. Making the wrong decision can be worse than making no decision, especially in an automated system.
There are also some interpreter-related party-tricks, like direct-threading, which can be used to scare people, but do make code run faster!
Applications and Front-End
We must be able to integrate our new techniques and tools into our workflow. We will discuss building incremental compilers, techniques for error management, reporting and recovery, and debugging – often touted as twice as hard as development.
During development of a program, the input is more often invalid than not, and a good frontend must handle invalid input elegantly, produce useful and appropriate error messages, and not fail at the first error. Even if we are just building a serializer, we may prefer to use compiler-building tools to construct it. The available front-end tools have reached such a level of maturity that they generally produce the most performant, reliable and correct solutions to many problems.
Also Available
It’s too much for one presentation, but let’s see where we end up!
This session is sponsored by Nebula
Shevek is an expert programmer with a strong interest in parallel and
distributed systems. He has worked on cutting edge research in compilers
and language design, algorithmic optimization, systems and security. He
is capable of maintaining a very straight face under questioning on
topics including “Why is our printer playing ‘happy birthday’?” or “What
is that message doing on the side of that building?” He received a
Doctorate in Computing on the Formalization of Protection Systems from
the University of Bath, England. He also holds a Masters in Pure
Mathematics and an epee.
Comments on this page are now closed.
For information on exhibition and sponsorship opportunities at the conference, contact Sharon Cordesse at (707) 827-7065 or scordesse@oreilly.com.
View a complete list of OSCON contacts
Comments
Shevek,
My printer doesn’t play anything and you can see from my picture, I’m actually smiling (sort of). My wife accuses me of being way too poker-faced.
I suspect you can answer why there’s a message on a building, perhaps from a technical or philosophical standpoint, of course. Non-techies do frequently ponder those things, and sometimes need insight.
Nevertheless, sorry I didn’t get to meet you. Definitely wanted to inquire about your attire.
Maybe next OSCON.
I hope you had a productive conference.
Best regards,
Rob