Engaging critically with data is now a required skill for students in all areas, but many traditional data science programs aren’t easily accessible to those without prior computing experience. Gunjan Baid and Vinitra Swamy explore UC Berkeley’s Data Science program, which has no math, computing, or statistics prerequisites and is designed to be accessible to students of all backgrounds. At the introductory level, the program consists of a fundamentals course that introduces students to concepts of computer programming and statistics, along with a diverse set of connector courses that allow students to apply data science to their area of interest, such as geography, immunotherapy, or cognitive science.
Using Jupyter notebooks, students get hands-on experience working with data without the burden of setting up and maintaining a development environment. The program has developed a tool that allows students to obtain notebooks and datasets for an assignment with one click. Autograding, user authentication, and submission are all done through Jupyter notebooks, enabling instructors to focus on real-world issues, such as racial profiling and California water usage, instead of the technical details of the computing infrastructure. The effectiveness of this approach is shown by the numbers: over 2,000 students across 50 majors have taken the fundamentals course and the connector courses in the past four semesters.
Gunjan and Vinitra explain the program in more detail and expand upon the pedagogical challenges faced in scaling Jupyter notebooks for use in large courses. They conclude by discussing how the program’s vision can be applied more generally for teaching data science using Jupyter at other universities and institutions.
Gunjan Baid is a student at the University of California, Berkeley. She completed her bachelor’s degree in computer science and biochemistry and is now pursuing a master’s degree in computer science with a research focus on computational biology. Gunjan is associated with the undergraduate Data Science education program, where, as a student instructor, she worked with Jupyter notebooks in the classroom. She now provides technical support for the program’s JupyterHub infrastructure.
Vinitra Swamy graduated two years early with a bachelor’s degree in computer science from the University of California, Berkeley, and is now working toward a master’s degree in computer science. Her research interests include data science, cloud computing environments, and natural language processing. Vinitra is head student instructor for Berkeley’s new Foundations of Data Science course, helping shape curriculum and educating thousands of students from diverse backgrounds. Her efforts in data science education were recently recognized with a Berkeley EECS award of excellence in teaching and leadership. Vinitra also leads a Jupyter development student research team within the Data Science Education program and assists with the technical deployment and use of JupyterHub infrastructure campus-wide.