One Size Does Not Fit All: Analyzing Data at Scale with AWS

Rahul Pathak (Amazon Web Services)
Hadoop and Beyond
GA Ballroom J
Average rating: 4.00 (3 ratings)

Learn how AWS thinks about big data and how we and our customers have approached managing large datasets using services such as Amazon S3, Amazon Elastic MapReduce, Amazon DynamoDB, and Amazon Redshift. We’ll discuss the key elements of good big data architecture, including elastically growing your resources as you need them, using the right tools for the job, and paying only for what you use. We’ll provide a holistic view of how you can leverage AWS tools to support your structured and unstructured data stack through specific examples from diverse customers.
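To make the "elastic resources, pay only for what you use" principle concrete, here is a minimal sketch that builds an Amazon EMR cluster request in the shape accepted by boto3's `run_job_flow` call. The cluster name, instance types and counts, and the S3 log path are illustrative assumptions, not details from the talk.

```python
# Sketch: an EMR cluster request with an elastically sized core group.
# The dict mirrors the shape of boto3's emr_client.run_job_flow(**request);
# names, instance types/counts, and the S3 bucket are illustrative assumptions.

def build_cluster_request(core_nodes: int) -> dict:
    """Build an EMR run_job_flow request whose core group scales with the workload."""
    return {
        "Name": "analytics-cluster",           # hypothetical cluster name
        "ReleaseLabel": "emr-5.36.0",          # any recent EMR release
        "LogUri": "s3://my-bucket/emr-logs/",  # hypothetical log bucket
        "Instances": {
            "InstanceGroups": [
                {"Name": "master", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                # Size the core group per workload: grow it as data grows,
                # then terminate the cluster when done -- you pay only for
                # the instance-hours actually consumed.
                {"Name": "core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": core_nodes},
            ],
            "KeepJobFlowAliveWhenNoSteps": False,  # terminate after steps finish
        },
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],
    }

# Start small; a larger dataset would simply pass a bigger core count.
request = build_cluster_request(core_nodes=2)
# To actually launch (not run here): boto3.client("emr").run_job_flow(**request)
```

The request is built as plain data so the same template can be parameterized per job, which is the usual pattern for transient, right-sized clusters.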

Rahul Pathak

Amazon EMR, Amazon Web Services

Rahul Pathak runs the Amazon EMR and AWS Data Pipeline businesses for AWS. Amazon EMR is a web service for running frameworks like Hadoop, Spark, and Presto on managed clusters in the cloud. AWS Data Pipeline is a web service for orchestrating data flows between services and data stores. During his time at AWS, Rahul has focused on managed data and analytics services. Prior to EMR and Data Pipeline, he was the Principal Product Manager for Amazon Redshift, a fast, fully managed, petabyte-scale data warehouse service in the cloud. He has also worked on Amazon ElastiCache, Amazon RDS, and Amazon RDS Provisioned IOPS. Rahul has over fifteen years of experience in technology and has co-founded two companies, one focused on digital media analytics and the other on IP geolocation. He holds a degree in Computer Science from MIT and an Executive MBA from the University of Washington.