Presented By O'Reilly and Cloudera
Make Data Work
December 1–3, 2015 • Singapore
 

  • 321-322
    9:00am Hadoop application architectures: Fraud detection Gwen Shapira (Confluent), Ted Malaska (Blizzard Entertainment), Mark Grover (Lyft), Jonathan Seidman (Cloudera)
    1:30pm Apache Hadoop operations for production systems Kathleen Ting (Cloudera), Jonathan Hsieh (Cloudera, Inc), Philip Langdale (Cloudera), Kostas Sakellis (Cloudera)
    324
    9:00am Data science for Telecom Juliet Hougland (Cloudera), Sandy Ryza (Cloudera)
    1:30pm An Introduction to time series with Team Apache Patrick McFadin (DataStax)
    331
    9:00am Machine learning In Python with scikit-learn Andreas Mueller (NYU, scikit-learn)
    1:30pm Developing a modern enterprise data strategy Edd Wilder-James (Silicon Valley Data Science), John Akred (Silicon Valley Data Science)
    334
    1:30pm Deploying models with Azure Machine Learning Danielle Dean (Microsoft), Wee Hyong Tok (Microsoft)
    328-329
    9:00am Spark Camp: Exploring Wikipedia with Spark (Tackling a unified use case) Sameer Farooqui (Databricks), Paco Nathan (O'Reilly Media), Reynold Xin (Databricks)
    Sponsored by Huawei
    335
    9:00am Cloudera essentials for Apache Hadoop Wing Leong Ho (Cloudera)
    12:30pm Lunch 12:30pm - 1:30pm (Nicoll 1-2) | Afternoon Break 3:00pm - 3:30pm (The Link)
    Room: Nicoll 1-2
    7:30am Coffee Break 7:30am - 9:00am | Morning Break 10:30am - 11:00am
    Room: The Link
    5:00pm Plenary
    Room: Summit 1-2
    20x20 Talks, Powered by PechaKucha
    9:00am-12:30pm (3h 30m) Hadoop Platform
    Hadoop application architectures: Fraud detection
    Gwen Shapira (Confluent), Ted Malaska (Blizzard Entertainment), Mark Grover (Lyft), Jonathan Seidman (Cloudera)
    Looking for a deeper understanding of how to architect real-time data processing solutions? This tutorial will provide this understanding using a real-world example of a fraud detection system. We’ll use this example to discuss considerations for building such a system, how you’d integrate various technologies, and why those choices make sense for the use case in question.
    1:30pm-5:00pm (3h 30m) Production-ready Hadoop
    Apache Hadoop operations for production systems
    Kathleen Ting (Cloudera), Jonathan Hsieh (Cloudera, Inc), Philip Langdale (Cloudera), Kostas Sakellis (Cloudera)
    Hadoop is emerging as the standard for big data processing and analytics. However, as usage of Hadoop clusters grows, so do the demands of managing and monitoring these systems. In this tutorial, attendees will get an overview of all phases of successfully managing Hadoop clusters, with an emphasis on production systems.
    9:00am-12:30pm (3h 30m) Data Science and Advanced Analytics
    Data science for Telecom
    Juliet Hougland (Cloudera), Sandy Ryza (Cloudera)
    In this half-day tutorial, attendees will get a taste of how large-scale data science techniques and technologies developed for the consumer internet can be applied in the world of Telecom.
    1:30pm-5:00pm (3h 30m) IoT and Real-time
    An Introduction to time series with Team Apache
    Patrick McFadin (DataStax)
    This tutorial is all about managing large volumes of data coming at your data center fast and continuously. If you don't have a strategy, then allow me to help. Amazing Apache Project software can make this problem a lot easier to deal with. Spend a few hours and learn about how each part works, and how they work together. Your users will thank you.
    9:00am-12:30pm (3h 30m) Data Science and Advanced Analytics
    Machine learning In Python with scikit-learn
    Andreas Mueller (NYU, scikit-learn)
    This talk is a tutorial on scikit-learn, the machine learning library for Python. It starts with a short introduction to what machine learning is, and then dives in depth into how to use scikit-learn in practice. The tutorial will be in the format of an IPython notebook and includes exercises.
    1:30pm-5:00pm (3h 30m) Data-driven Business
    Developing a modern enterprise data strategy
    Edd Wilder-James (Silicon Valley Data Science), John Akred (Silicon Valley Data Science)
    Big data and data science have great potential for accelerating business, but how do you reconcile the opportunity with the sea of possible technologies? Conventional data strategy has little to guide us, focusing more on governance than on creating new value. In this tutorial, we explain how to create a modern data strategy that powers data-driven business.
    9:00am-12:30pm (3h 30m) Data Science and Advanced Analytics
    Interactive data visualization with Lightning: Using d3, Seaborn, and R
    Matthew Conlen (FiveThirtyEight)
    This session teaches the use of modern data analysis and visualization tools for effective interactive data science. Attendees will learn how to use notebook environments to set up sharable and reproducible analysis pipelines, and will leverage tools for large-scale analysis and web-based data visualization to drive further analysis and decision-making.
    1:30pm-5:00pm (3h 30m) Data Science and Advanced Analytics
    Deploying models with Azure Machine Learning
    Danielle Dean (Microsoft), Wee Hyong Tok (Microsoft)
    In this tutorial, you will create end-to-end predictive models based on an extensive library of machine learning algorithms included in Microsoft Azure Machine Learning Studio with its R and Python language extensibility. You will then deploy and consume the model and use it for making predictions over business data.
    9:00am-5:00pm (8h) Spark & Beyond
    Spark Camp: Exploring Wikipedia with Spark (Tackling a unified use case)
    Sameer Farooqui (Databricks), Paco Nathan (O'Reilly Media), Reynold Xin (Databricks)
    The real power and value proposition of Apache Spark is in building a unified use case that combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualizations. In class we will explore various Wikipedia datasets while applying the ideal programming paradigm for each analysis. The class will consist of about 50% lecture and 50% hands-on labs and demos.
    9:00am-5:00pm (8h) Training
    Cloudera essentials for Apache Hadoop
    Wing Leong Ho (Cloudera)
    Cloudera University's one-day essentials course presents an overview of Apache Hadoop and how it can help decision-makers meet business goals, providing a fundamental introduction to the main components of Hadoop and its use cases in various industries. This course is a good starting point for any role or set of objectives and is part of the data analyst learning path.
    12:30pm-1:30pm (1h)
    Break: Lunch 12:30pm - 1:30pm (Nicoll 1-2) | Afternoon Break 3:00pm - 3:30pm (The Link)
    7:30am-9:00am (1h 30m)
    Break: Coffee Break 7:30am - 9:00am | Morning Break 10:30am - 11:00am
    5:00pm-6:30pm (1h 30m) Event
    20x20 Talks, Powered by PechaKucha
    PechaKucha 20x20 is a simple presentation format where you show 20 images, each for 20 seconds. The images advance automatically and you talk along to the images.