Hardware platforms in 2015 are very different from those of 2000, when Google designed its MapReduce- and GFS-based data infrastructure. Over the last 15 years, memory-centric data systems have emerged, but they were mainly used for operational data in enterprises. Now, however, the fruits of years of research in nonvolatile memory systems are coming to market.
These exciting new technologies promise lower power consumption and higher density for persistent storage. If persistent memory delivers on its promise, byte-addressability and unprecedented throughput will become a reality. Will these hardware advances revolutionize the data ecosystem as we know it? How will Hadoop-based data infrastructure adapt to these advances?
In a conversation led by Mesosphere’s Derrick Harris, a compelling panel of data-infrastructure thought leaders, including Micron’s Rob Peglar, Ampool’s Milind Bhandarkar, Cloudera’s Todd Lipcon, and SAP’s Anil Goel, looks beyond 2015 to discuss the possibilities of data infrastructure in the future.
Derrick Harris works for datacenter software startup Mesosphere. He was previously a technology journalist, most notably covering cloud computing, big data, and other emerging IT trends for Gigaom since 2009. There’s a strong possibility that Derrick has written the words “cloud” and “Hadoop” more than any other person on the planet. Derrick lives in Las Vegas and has a law degree from the University of Nevada, Las Vegas. Away from the office, Derrick trains in Muay Thai and is active in animal-welfare issues.
Robert Peglar is vice president of advanced storage solutions at Micron Technology. A 38-year industry veteran and published author, Robert leads efforts in advanced storage systems strategy, leads executive-level planning with key customers and partners worldwide for Micron’s Storage Business Unit, and defines future storage portfolio offerings. Prior to joining Micron, Robert was CTO of the Americas at EMC Isilon and a senior fellow and vice president of technology at Xiotech, and he held key technology specialist and engineering leadership positions over a 10-year period at StorageTek and its subsidiary, Network Systems Corporation. Prior to StorageTek, he held engineering development and product management positions for a decade at Control Data and its supercomputer division, ETA Systems.
Robert serves on the board of directors of the SNIA, is the former cochair of the SNIA Analytics and Big Data Committee and the SNIA Tutorials, and is the former director of the SNIA Solid State Storage Initiative. He also serves as an advisor to the Flash Memory Summit. Robert has extensive experience in data management and analysis, high-performance computing, nonvolatile memory, distributed cluster architectures, filesystems, I/O performance optimization, cloud storage, replication and archiving strategy, networking protocols, and storage virtualization. He is a sought-after speaker and panelist at leading storage and cloud-related seminars and conferences worldwide. Robert was named an EMC Elect in 2014 and 2015. He was also one of 25 senior executives worldwide selected for the CRN Storage Superstars Award in 2010. He holds a BS degree in computer science from Washington University in St. Louis and performed graduate work at Washington University’s Sever Institute of Engineering. His research background includes memory optimization, distributed systems, I/O performance analysis, queuing theory, parallel systems architecture and OS design, filesystems and storage networking protocols, clustering algorithms and virtual systems communication.
Milind Bhandarkar was a founding member of the team at Yahoo that took Apache Hadoop from a 20-node prototype to a data center-scale production system, and he has been contributing to and working with Hadoop since version 0.1.0. Milind started the Yahoo Grid solutions team focused on training, consulting, and supporting hundreds of new migrants to Hadoop. Parallel programming languages and paradigms have been his area of focus for over 20 years. He worked at the Center for Development of Advanced Computing (C-DAC), the National Center for Supercomputing Applications (NCSA), the Center for Simulation of Advanced Rockets, Siebel Systems, Pathscale Inc. (acquired by QLogic), Yahoo, and LinkedIn. Previously, Milind served as the chief architect at Greenplum Labs, a division of EMC, and the chief scientist at Pivotal, a new EMC joint venture with VMware. Milind holds a PhD degree in computer science from the University of Illinois at Urbana-Champaign.
Richard Probst is VP of infrastructure technology strategy at SAP and is currently focused on working with SAP partners on innovative cloud architectures for SAP application landscapes to help SAP customers become more agile.
Todd Lipcon is an engineer at Cloudera, where he primarily contributes to open source distributed systems in the Apache Hadoop ecosystem. Previously, he focused on Apache HBase, HDFS, and MapReduce, where he designed and implemented redundant metadata storage for the NameNode (QuorumJournalManager), ZooKeeper-based automatic failover, and numerous performance, durability, and stability improvements. In 2012, Todd founded the Apache Kudu project and has spent the last three years leading this team. Todd is a committer and PMC member on Apache HBase, Hadoop, Thrift, and Kudu, as well as a member of the Apache Software Foundation. Prior to Cloudera, Todd worked on web infrastructure at several startups and researched novel machine learning methods for collaborative filtering. Todd holds a bachelor’s degree with honors from Brown University.
©2016, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.