There’s been a huge amount of progress in recent years in developing distributed systems that are resilient to all sorts of faults. However, one critical category of errors has largely been ignored: human error. The scope and potential impact of human error are massive: deploying bugs, accidentally deleting data, accidentally DDoSing important internal services, and so on. Designing for human fault-tolerance leads to important conclusions about the fundamental ways data systems should be architected.
Nathan Marz is the lead engineer on Twitter’s Publisher Analytics team. He was previously the lead engineer at BackType, which was acquired by Twitter in July 2011.
Nathan is the author of numerous open-source projects relied upon by companies all around the world. These include Cascalog, ElephantDB, and Storm.
He has spoken about his work at conferences such as the Hadoop Summit, Strange Loop, Gluecon, Clojure/conj, and POSSCON. He writes a blog at http://nathanmarz.com.