September 19–20, 2016: Training
September 20–22, 2016: Tutorials & Conference
New York, NY

I data scienced monitoring data, and so can you

Robert Claire (Pinterest)
1:30pm–2:10pm Thursday, 09/22/2016
Measuring the right things
Cloud, DevOps
Beekman
Audience level: Intermediate
Average rating: 2.86 (7 ratings)

Prerequisite knowledge

  • Familiarity with real-time time series tools such as OpenTSDB, Graphite, InfluxDB, or Ganglia
  • Experience with Python and statistics (useful but not required)

What you'll learn

  • Explore practical examples showing that the learning curve for merging the operational and data science worlds is not as steep as it might first seem

Description

    Monitoring data from tools like OpenTSDB is typically used for dashboards and alerts, but applying techniques from data science, finance, and scientific computing to real-time monitoring data can yield a deeper understanding of infrastructure. Rob Claire introduces the monitoring tools Pinterest uses and offers real-world examples of problem solving with monitoring data.

    Rob walks attendees through practical examples of importing and shaping time series data with Python data tools, including pandas and StatsModels, and graphing libraries like Matplotlib. Using Jupyter Notebook examples, Rob then discusses the types of problems that can be solved with this approach, including automatically detecting site performance and availability issues, building a real-time network graph with the NetworkX library, and predicting the rate of disk utilization for an HBase cluster.

    Robert Claire


    Rob Claire is an engineer on the visibility team at Pinterest, where he focuses on extracting insight from real-time operational data. Rob has more than 17 years of experience in data engineering, DevOps, and performance tuning. His career has included stints at One Kings Lane, Slide, Ning, and eBay.