“Anyone who does not have the command line at their beck and call is really missing something,” tweeted Tim O’Reilly when Jeroen Janssens’s Data Science at the Command Line was recently made available online for free. As Tim’s tweet suggests, the command line (and its ecosystem of power tools) is not just standing the test of time; it’s more popular than ever. Join Jeroen to learn what you’re missing out on if you’re not applying the command line and many of its power tools to typical data science problems.
The Unix command line isn’t just available on web servers, wireless routers, and supercomputers. It can also be found on macOS, the Raspberry Pi, and, most recently, Windows 10. Although invented decades ago, it turns out to be an amazing environment for efficiently performing tedious but essential data science tasks—and in some situations, it even outperforms new technologies. By combining small, powerful command-line tools like grep, sort, awk, parallel, jq, and csvsql, you can quickly obtain, scrub, explore, and even model your data.
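As a rough illustration of what combining small tools can look like, the sketch below chains tr, grep, sort, uniq, head, and awk to count the most frequent words in a piece of text. The function name top_words and the sample sentence are invented for this example and are not taken from the session or the book; it assumes a POSIX-style shell with the standard utilities installed.

```shell
# Hypothetical helper (not from the session): print the n most frequent
# words read from standard input, one "word<TAB>count" pair per line.
top_words() {
  n="${1:-3}"                        # number of words to show, default 3
  tr -cs '[:alpha:]' '\n' |          # put each word on its own line
    tr '[:upper:]' '[:lower:]' |     # normalize case
    grep -v '^$' |                   # drop empty lines
    sort |                           # group identical words together
    uniq -c |                        # count each group
    sort -k1,1nr -k2 |               # highest count first, ties alphabetical
    head -n "$n" |                   # keep the top n
    awk '{ print $2 "\t" $1 }'       # reorder columns to word, count
}

# Made-up sample input for illustration.
printf 'the cat and the dog and the bird\n' | top_words 2
```

Each stage does one small job and hands its output to the next through a pipe, so a stage can be swapped or removed without touching the rest.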
If you’ve ever wondered what the command line is or what it can do for you, this session is for you. Jeroen walks you through applying the command line to some typical data science problems and covers its core concepts. You’ll learn how to break a data science problem into smaller problems, choose the appropriate command-line tools, and chain them together. You’ll also learn how to integrate the command line with your existing data science workflow, whether it consists of the Jupyter Notebook, R, or Excel. You’ll leave ready to get started with the command line and will probably want to learn more about this exciting piece of technology. And why not? It’s been around for almost 50 years. It’s not like it’s going anywhere soon.
Jeroen Janssens is the founder and CEO of Data Science Workshops, where he is also an instructor. The company provides on-the-job training and coaching in data visualization, machine learning, and programming. Previously, he was an assistant professor at Jheronimus Academy of Data Science and a data scientist at Elsevier in Amsterdam and at the startups YPlan and Outbrain in New York City. He’s the author of Data Science at the Command Line (O’Reilly). Jeroen holds a PhD in machine learning from Tilburg University and an MSc in artificial intelligence from Maastricht University.