Organizations now run diverse, multidisciplinary big data workloads that span data engineering, data warehousing, and data science applications. Many of these workloads operate on the same underlying data, and the workloads themselves can be transient or long-running.
Colm Moynihan, Jonathan Seidman, and Michael Kohs offer a technical deep dive into cloud architecture and explore the challenges of moving to the cloud. You’ll learn what to keep in mind when moving to the cloud and why it may not be as simple as you thought (e.g., data migration and duplication between on-premises and cloud environments). You’ll also dive into core cloud paradigms not present on-premises that drive architecture decisions (e.g., bursting and different cluster lifecycles and tenancy) as well as security best practices in the cloud (e.g., the basics, common pitfalls, and often-overlooked details that you need to get right). Along the way, you’ll learn how to manage metadata between various workloads across multiple clusters, both on-premises and in the cloud.
In the second part of the talk, you’ll get your hands dirty as you learn how to set up and run a data pipeline in the cloud that integrates with data engineering and data warehousing workflows, using the Cloudera Altus PaaS offering, powered by Cloudera Altus SDX. You’ll discover considerations and best practices for getting data pipelines running. You’ll also see how to share metadata across workloads in a big data architecture.
Colm Moynihan is partner presales manager in EMEA for Cloudera, where he helps system integrators, ISVs, hardware and cloud partners, resellers, and distributors drive digital transformation for joint customers. Previously, Colm was director of presales in EMEA at Informatica, working with resellers, OEMs, and GSIs to integrate, master, and cleanse customers’ enterprise data. Colm has over 25 years’ experience in development, consulting, finance and banking, startups, and large multinational software companies. Colm holds a master’s degree in distributed computing from Trinity College Dublin.
Jonathan Seidman is a software engineer on the cloud team at Cloudera. Previously, he was a lead engineer on the big data team at Orbitz, helping to build out the Hadoop clusters supporting the data storage and analysis needs of one of the most heavily trafficked sites on the internet. Jonathan is a cofounder of the Chicago Hadoop User Group and the Chicago Big Data Meetup and a frequent speaker on Hadoop and big data at industry conferences such as Hadoop World, Strata, and OSCON. Jonathan is the coauthor of Hadoop Application Architectures from O’Reilly.
Michael Kohs is a product manager at Cloudera.
©2019, O’Reilly UK Ltd • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com