Presented By O'Reilly and Cloudera
Make Data Work
Feb 17–20, 2015 • San Jose, CA

Tensor Methods for Large-scale Unsupervised Learning: Applications to Topic and Community Modeling

Anima Anandkumar (UC Irvine)
1:30pm–2:00pm Wednesday, 02/18/2015
Hardcore Data Science
Location: LL20 BC.

Many applications pose the challenging task of learning latent
variable models without supervision. For instance, in
topic modeling, we need to extract hidden topics from a document
corpus, and in community modeling, we need to find hidden communities
in large-scale social networks. I will demonstrate how to exploit
tensor methods for learning. Tensors are higher-order generalizations
of matrices, and are useful for representing rich information
structures. Tensor factorization involves finding a compact
representation of the tensor using simple linear and multilinear
algebra. These methods are embarrassingly parallel, accurate, and
extremely fast to run. We obtain orders-of-magnitude gains in running
time compared to likelihood-based methods on many datasets, such as
bag-of-words data, Facebook, Yelp, and DBLP.
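
To make the idea of tensor factorization concrete, here is a minimal sketch of one common approach: power iteration with deflation on a small symmetric third-order tensor. It is an illustration only, not the speaker's implementation. The function names, the toy weights, and the orthonormal components are all assumptions chosen for the example, and the code uses plain Python lists for self-containment where a practical implementation would use an array library and parallel hardware.

```python
import math
import random


def outer3_sum(weights, components):
    """Build T[i][j][k] = sum_r w_r * a_r[i] * a_r[j] * a_r[k]."""
    d = len(components[0])
    T = [[[0.0] * d for _ in range(d)] for _ in range(d)]
    for w, a in zip(weights, components):
        for i in range(d):
            for j in range(d):
                for k in range(d):
                    T[i][j][k] += w * a[i] * a[j] * a[k]
    return T


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def power_iteration(T, n_iter=50, seed=0):
    """Estimate one (weight, component) pair of a symmetric tensor."""
    d = len(T)
    rng = random.Random(seed)
    v = normalize([rng.gauss(0.0, 1.0) for _ in range(d)])
    for _ in range(n_iter):
        # Multilinear map v -> T(I, v, v), followed by renormalization.
        v = normalize([
            sum(T[i][j][k] * v[j] * v[k] for j in range(d) for k in range(d))
            for i in range(d)
        ])
    # The weight is the tensor evaluated at (v, v, v).
    lam = sum(
        T[i][j][k] * v[i] * v[j] * v[k]
        for i in range(d) for j in range(d) for k in range(d)
    )
    return lam, v


def decompose(T, rank, n_iter=50):
    """Recover `rank` pairs by repeated power iteration and deflation."""
    d = len(T)
    residual = [[row[:] for row in mat] for mat in T]
    pairs = []
    for r in range(rank):
        lam, v = power_iteration(residual, n_iter=n_iter, seed=r)
        pairs.append((lam, v))
        # Deflation: subtract the rank-one term that was just found.
        for i in range(d):
            for j in range(d):
                for k in range(d):
                    residual[i][j][k] -= lam * v[i] * v[j] * v[k]
    return pairs


# Toy example (assumed data): three orthonormal components in R^3.
s = 1.0 / math.sqrt(2.0)
components = [[s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]]
weights = [3.0, 2.0, 1.0]

T = outer3_sum(weights, components)
pairs = decompose(T, rank=3)
for lam, v in pairs:
    print(round(lam, 6), [round(x, 6) for x in v])
```

The sketch assumes the components are orthonormal, which is what lets plain power iteration lock onto a single component. The order in which components are recovered depends on the random starting vector.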

Anima Anandkumar

UC Irvine

Anima Anandkumar has been a faculty member in the EECS Department at
UC Irvine since August 2010. Her research interests are in the area of
large-scale machine learning and high-dimensional statistics. She
received her B.Tech in Electrical Engineering from IIT Madras in 2004
and her PhD from Cornell University in 2009. She was a visiting
faculty member at Microsoft Research New England in 2012 and a
postdoctoral researcher at MIT from 2009 to 2010. She is the recipient
of the Alfred P. Sloan Fellowship, Microsoft Faculty Fellowship, ARO
Young Investigator Award, NSF CAREER Award, IBM Fran Allen PhD
Fellowship, a thesis award from the ACM SIGMETRICS society, and paper
awards from the ACM SIGMETRICS and IEEE Signal Processing societies.