Sep 23–26, 2019

Schedule: Data Management and Storage sessions

9:00am–12:30pm Tuesday, September 24, 2019
Location: 1E 14
Jason Wang (Cloudera), Tony Wu (Cloudera), Vinithra Varadharajan (Cloudera)
Moving to the cloud poses challenges ranging from re-architecting to be cloud-native to keeping data context consistent across workloads that span multiple clusters on-prem and in the cloud. First, we’ll cover cloud architecture and its challenges in depth; second, you’ll use Cloudera Altus to build data warehousing and data engineering clusters and run workloads that share metadata between them using Cloudera SDX.
9:00am–12:30pm Tuesday, September 24, 2019
Location: 1E 12/13
Matt Fuller (Starburst)
Used by Facebook, Netflix, Airbnb, LinkedIn, Twitter, Uber, and others, Presto has become the ubiquitous open source software for SQL on anything. Presto was built from the ground up for fast interactive SQL analytics against disparate data sources ranging in size from GBs to PBs. Join Matt Fuller to learn how to use Presto and explore use cases and best practices you can implement today.
1:30pm–5:00pm Tuesday, September 24, 2019
Location: 1E 12/13
Gowrishankar Balasubramanian (Amazon Web Services), Rajeev Srinivasan (Amazon Web Services)
Enterprises adopt cloud platforms such as AWS for agility, elasticity, and cost savings. Database design and management on AWS require a different mindset than traditional RDBMS design. In this session, you will learn important considerations for choosing the right database based on your use cases and access patterns when migrating an application to the cloud or building a new one there.
11:20am–12:00pm Wednesday, September 25, 2019
Location: 1E 09
Steven Touw (Immuta)
Anti-patterns are behaviors that take bad problems and lead to even worse solutions. In the world of data security and privacy, they’re everywhere. Over the past four years, we’ve seen data security and privacy anti-patterns emerge consistently across hundreds of customers and industry verticals; the trend is obvious. We’ll cover five anti-patterns and, more importantly, the solutions for them.
11:20am–12:00pm Wednesday, September 25, 2019
Location: 1E 07/08
Paige Roberts (Vertica), Deepak Majeti (Vertica)
Analytics experts GoodData needed to auto-recover from node failures and scale rapidly when workloads spike on their MPP database in the cloud. Kubernetes could solve that, but K8s is for stateless microservices, not a stateful MPP database that needs hundreds of containers. Merging the power of an MPP database with the flexibility of Kubernetes meant overcoming a lot of hurdles.
1:15pm–1:55pm Wednesday, September 25, 2019
Location: 1E 10/11
Michael Stonebraker (Tamr, Inc.)
As a steward for your enterprise’s data and digital transformation initiatives, you’re tasked with making the right choices. But before you can make them, it’s important to understand what not to do when planning your organization’s big data initiatives. Dr. Michael Stonebraker, adjunct professor at MIT and cofounder and CTO of Tamr, will discuss his Top 10 Big Data Blunders.
4:35pm–5:15pm Wednesday, September 25, 2019
Location: 1E 14
Elasticsearch allows extremely quick search and drilldowns on large amounts of semistructured data. Elasticsearch, however, does not have relational join capabilities. In this presentation, I'll introduce a plugin for Elasticsearch that adds cluster-distributed joins and demonstrate how it enables an exciting array of use cases dealing with interconnected or "Knowledge Graph" enterprise data.
11:20am–12:00pm Thursday, September 26, 2019
Location: 1E 09
Rick Houlihan (Amazon Web Services)
Data has always been relational, and it always will be. NoSQL databases are gaining in popularity, but that does not change the fact that the data they manage is still relational; it just changes how we have to model the data. This session dives deep into how real Entity Relationship Models can be efficiently modeled in a denormalized manner, using schema examples from real application services.
11:20am–12:00pm Thursday, September 26, 2019
Location: 1A 15/16
Michael Freedman (TimescaleDB)
Leveraging polyglot solutions for your time-series data can lead to a variety of issues, including engineering complexity, operational challenges, and even referential integrity concerns. Re-engineering Postgres to serve as a general data platform streamlines high-volume time-series workloads, resulting in more actionable data and greater ease of use.
1:15pm–1:55pm Thursday, September 26, 2019
Location: 1A 15/16
Alon Gavra (AppsFlyer)
Kafka is often just a piece of the production stack that no one wants to touch, because it just works. At AppsFlyer, Kafka sits at the core of our infrastructure, which processes billions of events daily.
2:05pm–2:45pm Thursday, September 26, 2019
Location: 1A 23/24
Reza Shiftehfar (Uber Technologies)
Building a reliable big data platform is extremely challenging when it has to store and serve hundreds of petabytes of data in real time. This talk reflects on the challenges faced and proposes architectural solutions for scaling a big data platform to ingest, store, and serve 100+ PB of data with minute-level latency while using hardware efficiently and meeting security needs.
2:05pm–2:45pm Thursday, September 26, 2019
Location: 1E 09
Tomer Shiran (Dremio), Jacques Nadeau (Dremio)
Data lakes have become a key ingredient in the data architecture of most companies. In the cloud, object storage systems such as S3 and ADLS make it easier than ever to operate a data lake. In this talk we describe how companies can build best-in-class data lakes in the cloud, leveraging open source technologies and the cloud's elasticity to run and optimize various workloads simultaneously.
3:45pm–4:25pm Thursday, September 26, 2019
Location: 1A 23/24
Vitaliy Baklikov (Development Bank of Singapore), Dipti Borkar (Alluxio)
In this presentation, Vitaliy Baklikov from DBS Bank and Dipti Borkar from Alluxio will share how DBS Bank has built a modern big data analytics stack that leverages an object store even for data-intensive workloads like ATM forecasting, and how it uses Alluxio to orchestrate data locality and data access for Spark workloads.

    Contact us

    confreg@oreilly.com

    For conference registration information and customer service

    partners@oreilly.com

    For more information on community discounts and trade opportunities with O’Reilly conferences

    strataconf@oreilly.com

    For information on exhibiting or sponsoring a conference

    Contact list

    View a complete list of Strata Data Conference contacts