The data ecosystem has matured to the point where the pluses and minuses of distributed systems are clear, and best practices for building and managing both the human and the technical side of a big data program have emerged. Now is the time to step back and take a fresh look at the options. Whether you’re starting a data science program, reaching the breaking point with your current data technology, or figuring out what the competition is up to, these sessions will give you a bird’s-eye view of data technologies, techniques, and data-driven organizations.
Marie Beaugureau is the lead data editor for O’Reilly Media.
Paco Nathan is known as a “player/coach” with core expertise in data science, natural language processing, machine learning, and cloud computing. He has 35+ years of experience in the tech industry, at companies ranging from Bell Labs to early-stage startups. His recent roles include director of the Learning Group at O’Reilly and director of community evangelism at Databricks and Apache Spark. Paco is the cochair of the Rev conference and an advisor for Amplify Partners, Deep Learning Analytics, Recognai, and Primer. He was named one of the “top 30 people in big data and analytics” in 2015 by Innovation Enterprise.
Tim Berglund is the senior director of developer experience with Confluent, where he serves as a teacher, author, and technology leader. Tim can frequently be found speaking at conferences internationally and in the United States. He’s the copresenter of various O’Reilly training videos on topics ranging from Git to distributed systems and is the author of Gradle Beyond the Basics. He tweets as @tlberglund, blogs very occasionally at Timberglund.com, and is the cohost of the DevRel Radio podcast. He lives in Littleton, Colorado, with the wife of his youth and their youngest child, the other two having mostly grown up.
Edd Wilder-James is a strategist at Google, where he is helping build a strong and vital open source community around TensorFlow. A technology analyst, writer, and entrepreneur based in California, Edd previously helped transform businesses with data as vice president of strategy for Silicon Valley Data Science. Formerly Edd Dumbill, Edd was the founding program chair for the O’Reilly Strata Data Conference and chaired the Open Source Software Conference for six years. He was also the founding editor of the peer-reviewed journal Big Data. A startup veteran, Edd was the founder and creator of the Expectnation conference management system and a cofounder of the Pharmalicensing online intellectual property exchange. An advocate and contributor to open source software, Edd has contributed to various projects such as Debian and GNOME and created the DOAP vocabulary for describing software projects. Edd has written four books, including Learning Rails (O’Reilly).
Matthew Gee is cofounder and principal at the Impact Lab, a data-analytics company focused exclusively on developing scalable data science solutions to social-sector problems. He is also a senior research scientist at the University of Chicago’s Center for Data Science and Public Policy and a research fellow at the Urban Center for Computation and Data. Matt is the cofounder of the Eric and Wendy Schmidt Data Science for Social Good fellowship, which in its first three years has paired 126 fellows with over 40 national, state, and local government organizations and NGOs to build data-driven solutions to social problems.
Matt’s applied work focuses on combining methods and problems from the social sciences with machine-learning methods and new data sources to drive operational efficiency and individual behavior change and to implement adaptive policy interventions, with a focus on energy use, sustainable development, urban systems, and local labor market dynamics. He has led major data science initiatives with large public-sector clients, including the World Bank, national governments and agencies (Mexico, USA), state governments (California, Illinois), and cities (San Francisco, Chicago, Memphis), as well as large nonprofit organizations and for-profit companies. Matt serves as an advisor to Code for America, DataKind, and the Chicago School of Data and is a member of the World Bank’s Partnership for Open Data. He has previously worked at the US Treasury’s Office of Energy and Environment and has founded several companies focused on analytics, energy, and finance.
Yael Garten is director of data science at LinkedIn, where she leads a team that focuses on understanding and increasing growth and engagement of LinkedIn’s 400 million members across mobile and desktop consumer products. Yael is an expert at converting data into actionable product and business insights that impact strategy. Her team partners with product, engineering, design, and marketing to optimize the LinkedIn user experience, creating powerful data-driven products to help LinkedIn’s members be productive and successful. Yael champions data quality at LinkedIn; she has devised organizational best practices for data quality and developed internal data tools to democratize data within the company. Yael also advises companies on informatics methodologies to transform high-throughput data into insights and is a frequent conference speaker. She holds a PhD in biomedical informatics from the Stanford University School of Medicine, where her research focused on information extraction via natural language processing to understand how human genetic variations impact drug response, and an MSc from the Weizmann Institute of Science in Israel.
Katie Kent is the product manager for Galvanize Enterprise, the learning community for technology. In this role, she builds executive and contributor training in software development, data science, and data engineering. Katie was part of the founding team of data science training startup Zipfian Academy, where she was responsible for growth of the business from concept to acquisition. Previously, Katie worked in venture capital with startups building data- and design-driven products. Katie’s academic background is in environmental social science research at the University of Michigan.
©2015, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.