Advanced Query Manipulation with MySQL Proxy

Architecture and Technology, Replication and Scale-Out
Location: Ballroom D Level: Intermediate
Average rating: **...
(2.40, 10 ratings)

Currently MySQL Proxy only comes with an incomplete tokenizer for a subset of the MySQL dialect.

Many use cases require more knowledge about the query that a stream of tokens can provide and users are force to create their own parsers, most of which are handwritten and simplistic, in Lua.

While this is often sufficient for special cases and specific applications, it cannot serve as an extensible and robust framework.

For the purpose of query formatting in the MySQL Enterprise Monitor I have written a new parser using ANTLR, where I am a committer in the project.
Due to the nature of ANTLR generated recognizers, it is possible to target different implementation languages with little effort and thus has been integrated with MySQL Proxy to allow accurate query recognition and manipulation.

Even though a parser that accurately recognizes MySQL’s dialect is a crucial step in supporting sharding, for example,
it is only the first step. The supporting infrastructure in MySQL Proxy is at least as important.

The session will very briefly introduce ANTLR and its development tools to provide some valuable background.

I will show how to use the built-in symbol table, the produced abstract syntax tree and the built-in visitors to gather information about the query which assist the developer in safely manipulating the query.

As an example I will demonstrate how to implement a sharding solution that is able to correctly retrieve data when given a query that uses subqueries, where one of the queries refers to data that actually resides on a different server.

For example_:
SELECT id, name, email FROM users WHERE id IN (SELECT user_id FROM forum_members WHERE forum_id = ? and user
id > ?)

The table users in this example is sharded whereas forum_members is not. The goal is to transparently retrieve all necessary data in a way that the application does not need to know the specifics about what is sharded and where the actual data resides. This is also a valuable tool for debugging purposes.

Another common problem in large MySQL deployments arises when automatically generated SQL contains an error. Often the standard MySQL error messages are not helpful, particularly when there are incorrect values in IN clauses or parts of
the WHERE clause are missing: The default error message often does not give enough information to discern which query was incorrect.

In these cases it is desirable to override the reported error message with something that explains the actual error and gives more information about the error, such as more context, more locality, or in specific cases like with IN, less locality.

The common error where a NULL value is stringified to an empty string, thus making the query fail to execute, can be handled in several ways:

  • Either by rewriting the query to not containing two consecutive ‘,’ characters in expressions and give a warning,
  • or to give an error that contains better information than:
SELECT * FROM test WHERE a IN (1,,10); […] for the right syntax to use near ‘10)’ at line 1

I will demonstrate both use cases and show strategies to implement custom error reporting.

Legacy applications and/or databases often refer to or contain old schema names, thus requiring either views or modification to work correctly. Sometimes it is not feasible or impossible to convert applications and the transformations on the query might need supporting logic in such a way that a view cannot be used. For such cases I will discuss
solutions and demonstrate sample implementations using on-the-fly query rewriting.

Photo of Kay Roepke

Kay Roepke

Sun Microsystems

Kay Röpke works at Sun Microsystems in the MySQL Enterprise Tools group. He helps maintaining MySQL Proxy, the monitoring agent and other products by this group.

He is also contributes to the ANTLR parser generator project.

Comments on this page are now closed.


Picture of Sheeri K. Cabral
Sheeri K. Cabral
07/30/2009 2:27am PDT

or watch online at youtube (much easier) at

Picture of Sheeri K. Cabral
Sheeri K. Cabral
07/06/2009 9:33am PDT

People have reported that the “play” link loads for 15-20 minutes before actually streaming the content; however, the download link works just fine. I apologize for any inconvenience.

Picture of Sheeri K. Cabral
Sheeri K. Cabral
06/25/2009 12:34am PDT

Play the video directly in your browser at or download it at

Co-presented By:

O'Reilly Media MySQL/Sun Microsystems
  • Kickfire
  • Virident
  • Infobright, Inc
  • JasperSoft
  • Intel
  • Advanced Micro Devices
  • BIRT Exchange by Actuate
  • Calpont
  • Canonical
  • Continuent
  • Dolphin Interconnect Solutions
  • Facebook
  • HiT Software, Inc.
  • IBM
  • iDashboards
  • Oracle
  • Pentaho
  • R1Soft
  • Schooner Information Technology
  • SQLstream
  • Ticketmaster
  • Zmanda, Inc.
  • Linux Journal

Sponsorship Opportunities

For information on exhibition and sponsorship opportunities at the conference, contact Sharon Cordesse at

Download the MySQL Sponsor/Exhibitor Prospectus

Media Partner Opportunities

Download the Media & Promotional Partner Brochure (PDF) for information on trade opportunities with O'Reilly conferences or contact mediapartners@

Press and Media

For media-related inquiries, contact Maureen Jennings at

MySQL Conference Newsletter

To stay abreast of conference news and to receive email notification when registration opens, please sign up for the MySQL Conference newsletter.

Contact Us

View a complete list of MySQL contacts.