In the age of big data analytics, smartly monitoring a corporation's mission-critical systems and predicting their abnormal behavior can save large amounts of time and money. Drawing on a real-world case study from EMC, Amihai Savir examines the winding path from idea to viable solution in a corporate environment and walks you through the challenges encountered and lessons learned.
EMC’s centralized data science team was established to provide data science services to EMC business units (BUs) and extract actionable insights from their data. To that end, EMC assembled a group of machine-learning practitioners, statisticians, and mathematicians to develop complex, advanced big data solutions.
The team went on a road show to identify the most promising data science use cases. One such opportunity was an engagement with EMC’s internal IT, whose systems generate millions of entries per second from a large number of subsystems. For example, the authentication environment alone generates 10,000 events per second from more than five major subsystems. With this volume, velocity, and variety of data, meeting IT’s quality of service (QoS) and service-level agreement (SLA) demands is very challenging. The team was asked to develop a model that uses the services' collective log and performance data to predict when one of them will fail.
Amihai offers an overview of the team’s remarkable journey, discussing the multiple phases and development stages as well as the many questions and doubts that arose along the way. Eventually, the project proved a great success, with an expected ROI of $25M/year, and it is now running in production, monitoring the MS Exchange and Authentication (ITOA) environments. Amihai shares the team’s experience and insights, giving managers, data scientists, analytics professionals, and IT operations staff a solid foundation for building and driving data-driven processes.
Amihai also shares some key questions that guided the team, including:
Amihai Savir is a seasoned data scientist who currently leads a team of data scientists at EMC. Amihai is also a lecturer at Ben-Gurion University, where he has taught a variety of subjects including C programming, advanced Java programming, data structures, algorithms, and complexity. Prior to joining EMC, he held several research and development positions in Israeli high-tech companies and in academia, where he focused on various aspects of data science and software engineering. Amihai holds a master’s degree in computer science from Ben-Gurion University, where he specialized in recommender systems and machine learning.
©2016, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with, and does not endorse or review, the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.