Presented By
O’Reilly + Intel AI
Put AI to Work
April 15-18, 2019
New York, NY

Deep Learning and GPU Acceleration for Data Compression in Time Series Database

Jian Chang (Alibaba Group), Sanjian Chen (Alibaba Group)
1:00pm-1:40pm Wednesday, April 17, 2019
Implementing AI
Location: Rendezvous
Secondary topics: Edge computing and Hardware, Platforms and infrastructure, Reinforcement Learning, Retail and e-commerce, Temporal data and time-series

Who is this presentation for?

Executives, data scientists, and business analysts

Level

Beginner

Prerequisite knowledge

None

What you'll learn

How AI and DL can help with time series data management.

Description

Time series databases are widely used for data management in IoT, finance, and other domains. Alibaba’s TSDB is a time series database that provides effective and economical services to users. So far, we have scaled the service to thousands of physical nodes and delivered peak performance of 80 million operations per second. Our experience in building and operating TSDB has significantly influenced industry best practices for time series data management. TSDB can help companies understand data trends, discover anomalies, reduce production risks, and increase productivity and efficiency. We believe the audience can learn from our story and be better prepared for the zettabyte-scale IoT world in the years to come.

At Alibaba, hundreds of petabytes of time-series data are now generated each day. As the data grows rapidly, querying it in a timely manner becomes a challenge. We designed TSDB so that data compression, decompression, and sorting are highly efficient, and we use GPUs to speed up the procedures that involve intensive computation. In the TSDB architecture, data from the same time series and the same hour is compressed and stored in the same buffer, so different buffers can be processed in parallel on the GPU. Experimental results showed a 30-fold speedup.
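
The buffer layout described above can be illustrated with a minimal Python sketch. It is not TSDB's implementation: the abstract does not name the compression codec, so simple delta encoding is assumed here, and a CPU thread pool stands in for GPU parallelism. All function names (`group_into_buffers`, `compress_buffer`, `decompress_buffer`, `compress_all`) are hypothetical. The point is only that each (series, hour) buffer is self-contained, so buffers can be compressed independently of one another.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_into_buffers(points):
    """Group (series_id, unix_ts, value) points by (series_id, hour)."""
    buffers = defaultdict(list)
    for series_id, ts, value in points:
        buffers[(series_id, ts // 3600)].append((ts, value))
    return buffers


def compress_buffer(samples):
    """Delta-encode one buffer (assumed codec, for illustration only)."""
    samples = sorted(samples)
    first_ts, first_val = samples[0]
    pairs = list(zip(samples, samples[1:]))
    ts_deltas = [b[0] - a[0] for a, b in pairs]
    val_deltas = [b[1] - a[1] for a, b in pairs]
    return first_ts, first_val, ts_deltas, val_deltas


def decompress_buffer(encoded):
    """Invert compress_buffer by accumulating the deltas."""
    first_ts, first_val, ts_deltas, val_deltas = encoded
    out = [(first_ts, first_val)]
    for dt, dv in zip(ts_deltas, val_deltas):
        ts, val = out[-1]
        out.append((ts + dt, val + dv))
    return out


def compress_all(points, workers=4):
    """Compress every buffer independently; the pool stands in for the GPU."""
    buffers = group_into_buffers(points)
    keys = list(buffers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        encoded = list(pool.map(compress_buffer, (buffers[k] for k in keys)))
    return dict(zip(keys, encoded))


if __name__ == "__main__":
    demo = [("cpu", 0, 10), ("cpu", 60, 12), ("cpu", 3600, 15), ("mem", 30, 7)]
    for key, enc in compress_all(demo).items():
        print(key, enc)
```

Because no buffer depends on any other, the same partitioning could in principle map one buffer to one GPU thread block; that mapping is an assumption about how such a design might work, not a detail stated in the abstract.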

Inside Alibaba Group, TSDB is the backbone service for hosting all of this data. It enables high-concurrency storage and low-latency queries while also providing intelligent analysis capabilities using AI and other data science technologies. In this talk, we would also like to share the design of the Intelligence Engine on Alibaba TSDB, which enables fast and complex analytics of large-scale retail data using deep learning technologies. We will also demonstrate our work through a successful case study, in which we deployed this system to support the Fresh Hema Supermarket, a major “New Retail” platform operated by Alibaba Group.

We will highlight our solutions to the major technical challenges in data cleaning, storage, and processing. We believe both technical and business audiences will gain valuable experience and insights from our success story.

Jian Chang

Alibaba Group

Data science expert and software system architect with expertise in machine learning and big data systems. Extensive experience leading innovation projects and R&D activities to promote data science best practices within large organizations. Deep domain knowledge of various vertical use cases (finance, telco, healthcare, etc.). Currently working on cutting-edge applications of AI at the intersection of high-performance databases and IoT, focusing on unleashing the value of spatial-temporal data. I am also a frequent speaker at various technology conferences, including O’Reilly Strata AI Conference, NVIDIA GPU Technology Conference, Hadoop Summit, DataWorks Summit, Amazon re:Invent, Global Big Data Conference, Global AI Conference, World IoT Expo, and Intel Partner Summit, presenting keynote talks and sharing technology leadership thoughts.

Received my Ph.D. from the Department of Computer and Information Science (CIS), University of Pennsylvania, under the supervision of Professor Insup Lee (ACM Fellow, IEEE Fellow). Published and presented research papers and posters at many top-tier conferences and journals, including ACM Computing Surveys, ACSAC, CEAS, EuroSec, FGCS, HiCoNS, HSCC, IEEE Systems Journal, MASHUPS, PST, SSS, TRUST, and WiVeC. Served as a reviewer for many highly reputable international journals and conferences.

Sanjian Chen

Alibaba Group

Data scientist with deep knowledge of large-scale machine learning algorithms. Partnered with several Fortune 500 companies and advised their leadership on making data-driven strategic decisions. Provided software-based data analytics consulting services to 7 global firms across multiple industries, including financial services, automotive, telecommunications, and retail.
