数据工程和架构 (Data engineering and architecture), 英文讲话 (Presented in English)
数据平台 (Data Platform), 物流 (Logistics)
使用大数据推动东南亚前行 (Driving Southeast Asia forward with big data)
Feng Cheng (Grab), Edwin Law (Grab)
Grab is sitting at the junction of the digital and physical worlds. Its vision is to drive Southeast Asia forward and transform the way people travel and pay across the region. Feng Cheng and Edwin Law explain Grab's data architecture and offer a history of its data platform migration and stream-processing apps.
Damon Deng provides a short background on deep learning, focusing on relevant application domains, and offers an introduction to using the powerful and scalable deep learning framework MXNet. Join in to learn how MXNet works and how you can spin up AWS GPU clusters to train at record speeds.
使用BigDL在Apache Spark上进行大规模分布式深度学习 (Distributed deep learning at scale on Apache Spark with BigDL)
Zhichao Li (Intel), Shengsheng Huang (Intel), Yiheng Wang (Intel)
Zhichao Li, Shengsheng Huang, and Yiheng Wang explore how data scientists have adopted BigDL for distributed deep learning analysis on large amounts of data, allowing them to use their big data cluster as a unified data analytics platform for data storage, data processing and mining, feature engineering, traditional (non-deep) machine learning, and deep learning workloads.
Apache Hadoop 3.0的特性和开发进展的更新 (Apache Hadoop 3.0 features and development update)
Andrew Wang (Cloudera), Daniel Templeton (Cloudera)
Apache Hadoop 3.0 has made steady progress toward a planned release this year. Andrew Wang and Daniel Templeton offer an overview of new features, including HDFS erasure coding, YARN Timeline Service v2, and MapReduce task-level optimization, and discuss current release management status and community testing efforts dedicated to making Hadoop 3.0 the best Hadoop major release yet.
安全 (Security), 英文讲话 (Presented in English)
在京东利用大数据进行安全分析 (Leveraging big data for security analytics at JD)
Jimmy Zhigang Su (JD.com), Tony Lee (JD.com)
JD.com is one of the largest B2C online retailers in the world. Its mission is to provide a safe and secure marketplace for its 226M active users and 120K third-party vendors. Jimmy Zhigang Su and Tony Lee discuss the transformations big data has enabled at JD, including threat intelligence, account security, and end-point security.
成为Apache Spark明星路上的技巧 (Tricks of the trade to be an Apache Spark rock star)
Ted Malaska (Capital One)
It's one thing to write an Apache Spark application that gets you to an answer. It’s another thing to know you used all the tricks in the book to make it run as fast as possible. Ted Malaska shares some of those tricks.
在Apache Hadoop和Spark上加速大数据加密 (Speed up big data encryption in Apache Hadoop and Spark)
Haifeng Chen (Intel)
Although the processing capability of modern platforms is approaching memory speed, securing big data using encryption still hurts performance. Haifeng Chen shares proven ways to speed up data encryption in Hadoop and Spark, as well as the latest progress in open source, and demystifies using hardware acceleration technology to protect your data.
使用R和Apache Spark处理大规模数据 (Scaling R faster and larger using Apache Spark)
Xiaoyong Zhu (Microsoft)
R is a popular data science tool for data analysis. However, it has many drawbacks, such as its memory utilization and single-thread design, that limit its usage for big data analysis. Xiaoyong Zhu explains how to use R to analyze terabytes of data.
大数据时代银行客户社交关系圈研究与应用 (Research on and the application of a social relation circle of bank customers in the big data era)
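To deepen its insight into bank customers and strengthen its customer acquisition and risk management capabilities, China Guangfa Bank (广发银行) built a social relationship model of its customers on a Hadoop big data platform. The bank processes data with Hive on Spark and graph computing and applies machine learning algorithms such as LFM community detection and boosted decision trees. The model uncovers the social relation circles of bank customers and has been applied in the bank's actual business. These circles comprehensively reflect the financial and social relationships of individual customers, extending the bank's customer insight from single points to whole networks and from individual customers to customer groups, and filling a gap in the research and application of social relationships among individual bank customers.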
生产环境里的堆外内存HBase读路径——阿里巴巴的故事 (Off-heap HBase read path in production: The Alibaba story)
Yu Li (Alibaba), Ramkrishna Vasudevan (Intel)
Yu Li explains how Alibaba met the challenge of tens of millions of requests per second to its Alibaba-Search HBase cluster on Singles' Day 2016. By moving the read path off-heap, Alibaba improved throughput by 30% and achieved predictable latency.
Mathieu Dumoulin and Mateusz Dymczyk walk you step by step through building a scalable, real-time anomaly detection pipeline applied to an industrial robot. You'll learn how to gather data from a wireless movement sensor, process it with H2O on a MapR cluster, and visualize the output for an operator through an AR headset.
英文讲话 (Presented in English), 赞助商赞助 (Sponsored)
HDF 3.0: 轻松使用的开源物联网平台 - Hortonworks赞助议题 (HDF 3.0: An open source IoT platform for everyone, sponsored by Hortonworks)
Yifeng Jiang (Hortonworks)
Yifeng Jiang offers an overview of HDF 3.0, the open source IoT platform that everyone can easily start using right now. HDF supports data collection from the edge, flow management to send data to the data center and the cloud, real-time processing, and visualization and analytics with open source technology and can be used with simple drag-and-drop operations.
使用开源人工智能和机器学习工具训练现实世界的信用模型 (Training a real-world credit model using open source artificial intelligence and machine learning tools)
Michael Li (The Data Incubator)
Michael Li demonstrates how to iteratively train and refine a simple yet robust credit model for loan-default prediction, based on real-world loan performance data using 100% open source machine learning and artificial intelligence tools. The data is based on US$26 billion in loans issued over 10 years.
Jumpy：一个曾经没有的深度学习的JVM接口 (Jumpy: The missing JVM interface for deep learning)
Adam Gibson (Konduit)
Adam Gibson offers a high-level overview of Jumpy, a better Python interface for deep learning applications, and explains why Spark's Py4J interface is impractical for deep learning workloads.
Ben Lorica (O'Reilly), Doug Cutting (Cloudera), Jason (Jinquan) Dai (Intel)
Program chairs Ben Lorica, Jason Dai, and Doug Cutting open the first day of keynotes.
英文讲话 (Presented in English)
驱动金融服务的可能性 (Powering possibilities in financial services)
Mick Hollison (Cloudera), Jien Zhou (UnionPay)
Mick Hollison and Jien Zhou discuss how organizations are applying machine learning and advanced analytics to improve customer service and reduce the threat of fraud and cyberattacks, and explain how China UnionPay is using big data to deliver a better customer experience and manage risk.
成长的烦恼--领英大数据平台500倍扩展中应对的挑战 (Growing pains: When your big data platform grows really big)
发生在腾讯AI实验室里的大数据研究 (Big data research at Tencent AI Lab)
Han Liu (Tencent AI Lab)
电子商务的未来：AI和大数据 (An ecommerce future: AI and big data)
Dennis Weng (JD Group)
Online shopping accounts for over 15% of China's overall shopping market and has been growing by more than 20% every year. Over the past 13 years, JD has become a direct-sales online retail giant. Dennis Weng explains how JD has used rich, high-value customer and business data to become one of the most important data companies in China.
大数据在滴滴出行的应用 (Big data at DiDi Chuxing)
叶杰平 (Ye Jieping) (滴滴出行)
Every day, Didi Chuxing's platform generates over 70 TB worth of data, processes more than 20 billion routing requests, and produces over 14 billion location points. Ye Jieping explains how Didi Chuxing applies AI technologies to analyze such big transportation data and improve the travel experience for people in China.
上午茶歇 (Morning Break)
下午茶歇 (Afternoon Break)
来宾招待会 (Attendee Reception)
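Enjoy a drink, mingle with fellow Strata Data Conference attendees, and learn about the latest technologies and products from leading companies in the data space.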
11:55-13:10 (1h 15m)
周五午餐行业桌会及午餐，由Intel赞助 (Friday Industry Tables and lunch sponsored by Intel)