The availability of large volumes of machine-generated network data from diverse sources, combined with performance advances in data storage and retrieval platforms, is enabling the successful application of statistical learning techniques to explain and predict network phenomena. Open source programming languages, online training, and visualization tools are accelerating the use of statistical learning. Catalogs of the most common classification and prediction algorithms are abundant and have eliminated the time-consuming need to code the algorithms from scratch.
For large service providers, data network traffic behavior changes continually over time. Patterns are not always decipherable from observation or summarization alone, and the volume of data that can now be collected is uninformative unless it is systematically analyzed. Statistical learning models built on data queried from a big data platform are being used to classify characteristics of traffic patterns, using statistical properties of the traffic as model attributes. The output from these models is used to perform forensics on recent historical traffic measurements to detect concurrent anomalies in intraday traffic behavior.
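To make "statistical properties of the traffic as model attributes" concrete, here is a minimal sketch of how one day of traffic samples might be summarized into a feature record. The function name `traffic_features`, the choice of attributes (mean, peak, peak-to-mean ratio, coefficient of variation), and the sample values are all assumptions made for illustration; the session abstract does not say which properties were actually used.

```python
from statistics import mean, pstdev


def traffic_features(samples):
    """Summarize intraday traffic samples (e.g. Mbps per interval) as model attributes.

    The attribute set here is hypothetical, chosen only to illustrate the idea.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    avg = mean(samples)
    peak = max(samples)
    spread = pstdev(samples)  # population standard deviation of the samples
    return {
        "mean": avg,
        "peak": peak,
        # Ratios are scale-free, so links of different sizes can be compared.
        "peak_to_mean": peak / avg if avg else 0.0,
        "coeff_var": spread / avg if avg else 0.0,
    }


# Hypothetical measurements for one link over four intervals.
day = [10.0, 20.0, 30.0, 20.0]
features = traffic_features(day)
print(features)
```

Each record produced this way could become one row of the attribute table that a classification model is trained on.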
Statistical learning techniques applied to network data provide a comprehensive view of traffic behavior that would not be possible using traditional descriptive statistics alone. Amie Elcan shares an application of the random forest classification method to network data queried from a big data platform, demonstrating how to interpret the model output and the value of the insight it provides.
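As a rough illustration of what applying a random forest and interpreting its output can look like, the sketch below uses scikit-learn's `RandomForestClassifier` on synthetic data. Everything specific in it is an assumption: the library choice, the two class labels (0 for typical, 1 for atypical traffic), the feature names, and the generated values are hypothetical and are not taken from the session or from any real network data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
feature_names = ["peak_to_mean", "coeff_var", "noise"]  # hypothetical attributes
n = 200

# Synthetic stand-in for queried traffic statistics: two well-separated classes
# plus one pure-noise column that carries no class information.
typical = np.column_stack([
    rng.normal(1.5, 0.1, n), rng.normal(0.3, 0.05, n), rng.normal(0.0, 1.0, n),
])
atypical = np.column_stack([
    rng.normal(3.0, 0.1, n), rng.normal(0.8, 0.05, n), rng.normal(0.0, 1.0, n),
])
X = np.vstack([typical, atypical])
y = np.array([0] * n + [1] * n)  # 0 = typical, 1 = atypical (assumed labels)

# oob_score=True estimates accuracy from samples each tree did not train on.
model = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
model.fit(X, y)

# Two common ways to read the output: out-of-bag accuracy and feature importances.
importances = dict(zip(feature_names, model.feature_importances_))
print("OOB accuracy:", model.oob_score_)
print("Importances:", importances)
```

In this toy setup the noise column should receive much less importance than the informative attributes, which is the kind of reading that helps identify which traffic properties drive a classification.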
Amie Elcan is a principal architect in CenturyLink’s Data Network Strategies organization, where her current areas of focus are traffic modeling, application traffic analytics, and data science. Amie has worked in the telecommunications industry for over 20 years delivering traffic-based assessments that drive optimal network architecture and engineering design decisions.
©2017, O'Reilly Media, Inc.