Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY

Pomegranate: Flexible probabilistic modeling for Python

Jacob Schreiber (University of Washington)
Secondary topics: PyData
Average rating: 4.00 (6 ratings)

Jacob Schreiber offers an overview of pomegranate, a flexible probabilistic modeling package implemented in Cython for speed. Jacob explores the models it supports, such as Bayesian networks and hidden Markov models, and demonstrates that these models are both faster and more flexible than other implementations in the open source community, such as NumPy, SciPy, scikit-learn, and hmmlearn.

Jacob also explains how to use the underlying modularity of the code to stack these models into more complicated ones, such as mixtures of Bayesian networks or HMMs with complicated mixture emissions. He then shows how easy it is to use the built-in out-of-core and parallel APIs for multithreaded training of complex models on massive amounts of data that can’t fit in memory, all without the user having to think about any implementation details.
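To make the out-of-core idea concrete, here is a minimal sketch of the general technique: fold each batch of data into additive sufficient statistics, discard the batch, and estimate parameters only at the end. This is an illustration in plain Python, not pomegranate's API or implementation; the names GaussianStats, summarize, merge, estimate, and fit_streaming are hypothetical and chosen only for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class GaussianStats:
    """Additive sufficient statistics for a 1-D Gaussian (hypothetical helper)."""
    n: float = 0.0
    total: float = 0.0
    total_sq: float = 0.0

    def summarize(self, batch):
        # Fold one in-memory batch into the running sums.
        # After this call the batch itself is no longer needed.
        for x in batch:
            self.n += 1
            self.total += x
            self.total_sq += x * x

    def merge(self, other):
        # Statistics gathered by independent workers combine by addition,
        # which is what makes parallel training straightforward.
        return GaussianStats(
            self.n + other.n,
            self.total + other.total,
            self.total_sq + other.total_sq,
        )

    def estimate(self):
        # Maximum-likelihood mean and standard deviation from the sums.
        mean = self.total / self.n
        variance = self.total_sq / self.n - mean * mean
        return mean, math.sqrt(max(variance, 0.0))


def fit_streaming(batches):
    """Fit a Gaussian from an iterable of batches, one batch in memory at a time."""
    stats = GaussianStats()
    for batch in batches:
        stats.summarize(batch)
    return stats.estimate()


if __name__ == "__main__":
    # Stand-in for data read from disk in chunks (assumed toy values).
    chunks = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(fit_streaming(chunks))
```

Because the statistics are plain sums, the same estimate comes out whether the data is processed as one array, as a stream of batches, or split across workers and merged afterwards. The naive sum-of-squares variance used here can lose precision on large or poorly scaled data, so a production implementation would likely use a more numerically stable update.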

Jacob Schreiber

University of Washington

Jacob Schreiber is a third-year CSE PhD student and IGERT big data fellow at the University of Washington. Jacob is a core developer for the popular Python machine learning package scikit-learn and the author of pomegranate, a Python package for probabilistic modeling.