Although normal is an ambiguous term, many operations, such as electricity use, computer operations, and network traffic, depend on an understanding of normal conditions. Recognizing abnormalities can raise a concern before it becomes critical and help identify unusual situations worthy of investigation. However, normal (and abnormal) is often hard to nail down. There is no universal normal that applies under all conditions across all time. For example, there are often high- and low-activity periods along with many possible legitimate exceptions, and over time significant changes can occur.
John Hebeler offers an overview of two deep learning methods for determining normal behavior, which perform better when combined. In the first, a recurrent neural network (RNN) forms a probabilistic prediction envelope for the next activity. Actual data that falls outside the prediction envelope is deemed abnormal. However, if the prediction envelope is too large, abnormal activities are difficult to recognize. The second method, the self-organizing map (SOM), groups activities, providing a level of granularity that reflects the various states of normal. The resulting map model consumes future data to determine its group category. Actual data that falls outside all of the groups is deemed abnormal. When these methods are used together in a deep learning ensemble, the SOM determines groupings while the RNN predicts the likelihood of group association. This approach creates multiple envelopes to predict normal, thus improving its definition of normal operations. Both models can evolve as new data or updated classifications arrive and, as a result, update the definition of normal.
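To make the SOM half of this idea concrete, here is a minimal sketch in plain NumPy. It is not the code from the talk: the function names (train_som, quantization_error, fit_threshold, is_abnormal), the 4x4 grid, the 99th-percentile cutoff, and the synthetic two-state "sensor" data are all assumptions made for illustration. The map learns prototypes for the normal states, and a reading is treated as abnormal when its distance to the closest prototype exceeds what was seen in training. The RNN prediction envelope is not shown here.

```python
import numpy as np

def train_som(data, grid=(4, 4), epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM; returns unit weights of shape (rows * cols, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid position of every unit, used by the neighborhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialize the units from randomly chosen training samples.
    weights = data[rng.choice(len(data), size=rows * cols, replace=True)].astype(float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                 # learning rate decays to zero
            sigma = sigma0 * (1.0 - frac) + 0.1     # neighborhood radius shrinks
            # Best matching unit: the prototype closest to this sample.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            grid_dist2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            # Pull the BMU and its grid neighbors toward the sample.
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def quantization_error(weights, x):
    """Distance from x to its closest SOM unit."""
    return float(np.min(np.linalg.norm(weights - x, axis=1)))

def fit_threshold(weights, data, quantile=0.99):
    """Assumed cutoff: a high quantile of the errors seen on normal training data."""
    errors = [quantization_error(weights, x) for x in data]
    return float(np.quantile(errors, quantile))

def is_abnormal(weights, threshold, x):
    """A reading is abnormal when it falls outside every learned group."""
    return quantization_error(weights, x) > threshold

# Synthetic stand-in for two sensors with a low- and a high-activity state.
rng = np.random.default_rng(42)
low = rng.normal(loc=1.0, scale=0.3, size=(200, 2))
high = rng.normal(loc=5.0, scale=0.3, size=(200, 2))
readings = np.vstack([low, high])

weights = train_som(readings)
threshold = fit_threshold(weights, readings)
print(is_abnormal(weights, threshold, np.array([10.0, -3.0])))
```

Because each normal state gets its own prototypes, the cutoff stays tight around each group instead of stretching into one wide envelope that covers both low and high activity. Retraining on newer readings would shift the prototypes and the cutoff, which is one simple way the definition of normal could evolve.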
The technology stack and corresponding code are open source and available on GitHub. John outlines the design and implementation of the three approaches (the RNN, the SOM, and the combined ensemble) and demonstrates them using a data stream consisting of multiple sensors. Along the way, he covers the pros and cons of each.
John Hebeler is the chief data scientist and principal engineer for the RMS Division of Lockheed Martin, where he just finished a five-year program to analyze large, diverse data streams to form complex policy determinations in a big data event-driven architecture. John holds three patents and is the coauthor of two technical books on networking and data semantics. He presents at technical and business conferences throughout the world. Previously, he served as an adjunct professor at both Loyola University and the University of Maryland. John holds a BS in electrical engineering, an MBA, and a PhD in information systems. In his free time, he’s an avid tennis player and beer brewer.
©2018, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • firstname.lastname@example.org