Building OpenDNS Stats

Operations
Location: Regency 1
Level: Intermediate
Average rating: 3.68 out of 5 (19 ratings)

The old OpenDNS Stats system was built when we were handling 1 billion queries a day, and it had far outlived its usefulness. Playing hot potato with load on overworked servers that are all struggling to keep up gets old after a while, doesn’t it? This gave me the opportunity to start from a blank slate and build the system we need to serve us at 8 billion queries a day and to scale to 16 or 24 billion. We considered writing another set of PHP shell scripts, working with Hadoop, and moving to non-MySQL data storage. I’ll explain why, in the end, I chose a custom map-reduce-esque implementation in PHP and C++, using MySQL for persistent storage.
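For readers unfamiliar with the map-reduce shape, here is a minimal C++ sketch of the general idea, not the production code. It assumes a made-up log format of one "<network_id> <domain>" pair per line; the names `Counts`, `map_chunk`, and `reduce_into` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Assumed (hypothetical) log format: "<network_id> <domain>" on each line.
// Keys in the tally are "<network_id> <domain>".
using Counts = std::map<std::string, unsigned long>;

// Map step: tally queries within one chunk of log lines.
// Lines that do not contain two whitespace-separated fields are skipped.
Counts map_chunk(const std::vector<std::string>& lines) {
    Counts counts;
    for (const std::string& line : lines) {
        std::istringstream in(line);
        std::string network_id, domain;
        if (in >> network_id >> domain) {
            ++counts[network_id + " " + domain];
        }
    }
    return counts;
}

// Reduce step: merge one partial tally into a running total.
void reduce_into(Counts& total, const Counts& partial) {
    assert(&total != &partial);  // merging a tally into itself would double it
    for (const auto& entry : partial) {
        total[entry.first] += entry.second;
    }
}
```

Each chunk can be mapped independently, possibly on different machines, and the partial tallies merged in any order because addition is commutative. That property is what lets this style of pipeline spread load instead of passing it around.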

While I’m proud that my initial design withstood the test of implementation, there were of course false starts and wrong turns. In the talk I’ll detail three problems and the solutions that got me to production.

  • False start #1: my new-guy lack of understanding of the old system combined with my starry-eyed desire to use some new technology (Thrift in this case). Hoping to reduce disk I/O concerns, I started throwing log files around line-by-line using a Thrift service. The results were massive network congestion and a more difficult path to failure recovery.
  • False start #2: I’ll go through another case of small software design tweaks making a huge difference in MySQL performance when tuning InnoDB isn’t enough.
  • False start #3: When you’re aggregating anything in memory, you’ve got to expect std::bad_alloc to come around eventually. At first, I tried to tally memory usage and proactively free some when necessary but found this to be inaccurate and crash-prone. The production version can gracefully handle these memory exceptions without data loss.
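The third false start can be illustrated with a small C++ sketch. It is an assumption about what "gracefully handle these memory exceptions" might look like, not the production implementation: when an insert throws std::bad_alloc, the partial totals are flushed, the memory is freed, and the insert is retried. `Budget`, `LimitedAllocator`, and `Aggregator` are hypothetical names, and the allocator exists only to make memory exhaustion reproducible in a small example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <new>
#include <string>
#include <utility>

// Shared allocation budget (hypothetical, for demonstration only).
struct Budget {
    static std::size_t& live()  { static std::size_t n = 0;    return n; }
    static std::size_t& limit() { static std::size_t n = 1024; return n; }
};

// Stand-in for real memory pressure: throws std::bad_alloc once the number
// of live allocations reaches Budget::limit().
template <typename T>
struct LimitedAllocator {
    using value_type = T;
    LimitedAllocator() noexcept = default;
    template <typename U>
    LimitedAllocator(const LimitedAllocator<U>&) noexcept {}
    T* allocate(std::size_t n) {
        if (Budget::live() >= Budget::limit()) throw std::bad_alloc();
        T* p = static_cast<T*>(::operator new(n * sizeof(T)));
        ++Budget::live();
        return p;
    }
    void deallocate(T* p, std::size_t) noexcept {
        ::operator delete(p);
        --Budget::live();
    }
};
template <typename T, typename U>
bool operator==(const LimitedAllocator<T>&, const LimitedAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const LimitedAllocator<T>&, const LimitedAllocator<U>&) { return false; }

using Counts = std::map<std::string, unsigned long, std::less<std::string>,
                        LimitedAllocator<std::pair<const std::string, unsigned long>>>;

class Aggregator {
public:
    using Flush = std::function<void(const Counts&)>;
    explicit Aggregator(Flush flush) : flush_(std::move(flush)) { assert(flush_); }

    void add(const std::string& key) {
        try {
            ++counts_[key];
        } catch (const std::bad_alloc&) {
            // The failed insert left the map unchanged, so the totals gathered
            // so far are intact. Hand them off, free the memory, and retry once.
            // If the retry also throws, the exception propagates to the caller.
            flush_(counts_);
            counts_.clear();
            ++counts_[key];
        }
    }

    // Hand off whatever is still in memory at the end of the input.
    void finish() {
        if (!counts_.empty()) {
            flush_(counts_);
            counts_.clear();
        }
    }

private:
    Flush flush_;
    Counts counts_;
};
```

The sketch relies on the standard guarantee that a single-element insert into a std::map has no effect if it throws. A real flush callback would also need to avoid large allocations of its own, since it runs exactly when memory is scarce.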

After the tour of some of the implementation challenges, I’ll walk through the architecture of the entire production system, from DNS servers through the map-reduce pipeline to databases and on to the website for all of our users. I’ll also share some of the (less common) tools I found indispensable.

Richard Crowley

Slack

Richard started his career as part of the Yahoo! Intern Class of 2006 and was subsequently offered a position at Flickr. After building the Flickr Uploadr, today used by millions of Flickr users around the world, Richard left Flickr to join OpenDNS, the world’s largest and fastest-growing DNS provider. At OpenDNS, Richard leads engineering on backend systems, namely the DNS Stats processing system which handles more than 8 billion DNS queries daily. Richard is a 2007 graduate of Washington University in Saint Louis, where he earned Bachelor’s Degrees in Computer Engineering and Computer Science.
