Identifying Outliers at Scale Using Real-time Search Engines

Costin Leau (Elastic)
Hadoop & Beyond
Location: 212
Average rating: 2.89 (9 ratings)

Not too long ago, information retrieval systems (also known as search
engines) were used mainly in academia; however, the explosion of data
has drastically changed how we use IR and what we need from it. From
finding an open restaurant nearby to sifting through our user- and
machine-generated data streams to find signal within the noise, search is now
the foundational element for meeting business objectives and enjoying
our daily existence.

This session will demonstrate how search engines are used for so much
more than just free-text search. Costin will focus on using search as
an analytics platform to spot the exceptional in your data. He’ll
open with an exploration of the architectural challenges of large
datasets, then explore the bridging of batch-oriented and real-time
systems. Using Apache Hadoop as a data platform, Elasticsearch as a
search engine and Apache Spark for data processing, Costin will teach
you how to move beyond free-text search to performing fine-grained
analytics, revealing the most significant – and immediately
actionable – outliers within your data set.

Costin Leau


Costin Leau is an engineer at Elasticsearch, currently working with NoSQL and Big Data technologies on the Elasticsearch for Apache Hadoop project. An open-source veteran, Costin led various Spring projects (Spring OSGi, GemFire, Redis, Hadoop) and authored an OSGi spec. He has spoken at various editions of EclipseCon/OSGi DevCon, JavaOne, Devoxx/Javapolis, JavaZone, SpringOne, and TSSJS on Java, Hadoop, and Spring-related topics.