Presented by O'Reilly and Cloudera
Make Data Work
July 12-13, 2017: Training
July 13-15, 2017: Tutorials & Conference
Beijing, China

基于 Spark 的数据管理、探索、计算平台 (A Spark-based data management, exploration, and computing platform)

此演讲使用中文 (This will be presented in Chinese)

XueMin Zhang (TalkingData)
14:00–14:40 Saturday, 2017-07-15
Spark及更多发展 (Spark & beyond)
Location: 多功能厅8A+8B (Function Room 8A+8B)
观众水平 (Level): Beginner

描述 (Description)

TalkingData adopted Spark at the end of 2013, and since then all data processing in its data center has moved to the Spark platform. As TalkingData’s business has grown rapidly, its data sources and data volume have increased significantly. With the workload for data asset management, data analysis, and data mining growing as well, a data management, exploration, and computing platform has gradually taken shape, built on Spark, Alluxio, Jenkins, and other open source technologies.

XueMin Zhang offers an overview of TalkingData’s platform, sharing the background and evolution of its technical architecture, along with some pitfalls experienced over the course of its use and some follow-up plans.

XueMin Zhang


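XueMin Zhang has more than six years of software development and management experience. XueMin previously served as big data team leader in Sina's platform architecture department, responsible for Weibo's core data storage and big data computing solutions, and worked as a development engineer at Jiuqi (久其) and Rui'an Technology (锐安科技), building up extensive software development and project experience. XueMin currently works at TalkingData DTU.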
