Without question, the future of computing promises more scale, more complexity, and certainly more change, all at greater velocity. However, scale, complexity, and change, especially at ever-increasing velocity, are the natural enemies of stability, performance, availability, and reliability.
Many companies have experienced the fear, pain, and embarrassment of handling a technology failure so significant it shook the core of the business both at the time and into the future. Without a standardized way to organize the people responding to incidents and solving technology problems, the time to restore services gets longer and longer.
This session dives into the nuts and bolts of the Incident Management System, which is in use by a number of site reliability teams. Additionally, we describe “how to not let a good crisis go to waste” by learning from each response in productive after-action reviews (AARs).
The main points include:
Chris Hawley is deputy program manager on the contract that manages the International Counterproliferation Program (ICP) of the Defense Threat Reduction Agency (DTRA), the US Department of Defense’s official combat support agency for countering the entire spectrum of chemical, biological, radiological, nuclear, and high-yield explosive threats globally.
Ron Vidal is a partner at Blackrock 3 Partners, a leading incident management firm. Ron’s technology career spans 30 years as a senior executive in critical infrastructure including fiber optic and wireless telecommunications networks, data centers, electric power networks, and oil and gas facilities for Level 3 Communications, MFS Communications, UUNet Technologies, and Kiewit. Ron led teams on $19 billion of M&A transactions and $14 billion of public market financings. Ron managed Level 3’s executive response in New York City after the 9/11 World Trade Center terrorist attack and previously served on Mayor Dinkins’s NYC Task Force on Network Reliability. Ron is a technical peer reviewer for FEMA’s Assistance to Firefighters Grant program and has been a volunteer firefighter in four states. Ron is a member of two working groups on the California Cybersecurity Task Force.
Rob Schnepp is a 30-year veteran of the fire service and retired as the division chief of special operations for the Alameda County, CA, Fire Department. Rob has vast experience in emergency response and served as incident commander on numerous large-scale emergencies. Rob has written two hazardous materials response textbooks and numerous peer-reviewed fire-service-related articles on incident command. He is an instructor at the National Fire Academy and for the US Defense Threat Reduction Agency, providing hazmat/WMD training to an international audience. Rob is a principal in Blackrock 3 Partners, a firm specializing in consulting, training, and war-gaming in the areas of incident management and command.
©2015, O'Reilly Media, Inc.