As machine learning algorithms and artificial intelligence continue to progress, we must take advantage of the best techniques from various disciplines. Funda Gunes demonstrates how well-proven methods from classical statistics can enhance modern deep learning, improving both predictive performance and interpretability, and offers a summary of recent studies along with examples from various fields.
Recent research shows that the most successful models are built when well-established statistical modeling techniques are used directly during training rather than only for post hoc interpretation of already-trained machine learning models. Models trained this way are more stable under data perturbations, can more safely be used to discover hidden patterns of high-order interactions in the data, and often achieve better predictive performance. Examples of such techniques include using gradient boosting to train generalized additive models (GAMs), using a quantile regression loss function when training a boosted trees model, adding an outer layer of bootstrap sampling to produce a more stable variable importance table for random forest models, and using mixed-model random effects as engineered features.
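As a minimal sketch of one technique named above — a quantile regression loss inside a boosted trees model — the following uses scikit-learn's gradient boosting with `loss="quantile"`. The synthetic dataset and all parameter choices are illustrative assumptions, not taken from the talk:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Illustrative synthetic data: noisy sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=500)

# loss="quantile" with alpha=0.9 fits the conditional 90th percentile
# rather than the conditional mean, giving an upper prediction bound.
upper = GradientBoostingRegressor(loss="quantile", alpha=0.9, n_estimators=200)
upper.fit(X, y)

# alpha=0.1 fits the 10th percentile; the two together form an
# approximate 80% prediction interval.
lower = GradientBoostingRegressor(loss="quantile", alpha=0.1, n_estimators=200)
lower.fit(X, y)

pred_hi = upper.predict(X)
pred_lo = lower.predict(X)
coverage = np.mean((y >= pred_lo) & (y <= pred_hi))
print(f"empirical coverage of the 80% interval: {coverage:.2f}")
```

Swapping the loss this way changes what the boosted model estimates without changing the training algorithm, which is the general pattern the talk describes: a classical statistical idea (quantile regression) plugged into the training stage of a machine learning model.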
Funda Gunes is a senior machine learning developer at SAS, where she researches and implements new data mining and machine learning approaches. Her research interests include regularization in machine learning algorithms, Bayesian statistical modeling, mixed models, stacked ensemble models, and using classical statistical methods to enhance deep learning models. Funda holds a PhD in statistics from North Carolina State University.
©2018, O'Reilly Media, Inc.