Real-Time Searching of Big Data with Solr and Hadoop

Practitioner
Location: Mission City B1

Hadoop and HBase make it easy to store terabytes of data, but how do you scale your search mechanism to sift through these mountains of bits and retrieve large result sets in a matter of milliseconds?

The Solr search server, based on Lucene, provides a scalable querying capability that nicely complements HBase. In this session, we’ll use OpenLogic’s production Solr and Hadoop environment as a case study in handling rapid-fire queries against terabytes of data, primarily through a combination of index sharding and fault-tolerant load balancing.
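As a rough illustration of how index sharding and fault-tolerant load balancing can fit together, the sketch below picks one live replica per shard and builds a Solr distributed-search request using Solr's `shards` query parameter. This is not OpenLogic's implementation: the host names, the two-shard topology, the `is_alive` health check, and the example query are all hypothetical and chosen only for illustration.

```python
import random
from urllib.parse import urlencode

# Hypothetical topology: each shard holds one slice of the index and is
# served by several identical replicas (all host names are made up).
SHARD_REPLICAS = [
    ["solr-a1:8983/solr", "solr-a2:8983/solr"],
    ["solr-b1:8983/solr", "solr-b2:8983/solr"],
]


def pick_replicas(shard_replicas, is_alive, rng=random):
    """Pick one live replica per shard, at random, to spread query load."""
    chosen = []
    for replicas in shard_replicas:
        live = [r for r in replicas if is_alive(r)]
        if not live:
            # Every copy of this index slice is down, so results would be partial.
            raise RuntimeError(f"no live replica for shard {replicas}")
        chosen.append(rng.choice(live))
    return chosen


def distributed_query_url(coordinator, query, replicas, rows=10):
    """Build a request URL that asks one node to fan the query out to all shards."""
    params = urlencode({"q": query, "rows": rows, "shards": ",".join(replicas)})
    return f"http://{coordinator}/select?{params}"


if __name__ == "__main__":
    down = {"solr-a1:8983/solr"}  # pretend one replica has failed
    replicas = pick_replicas(SHARD_REPLICAS, lambda r: r not in down)
    # Any chosen replica can act as the coordinator that merges shard results.
    print(distributed_query_url(replicas[0], "hadoop", replicas))
```

In this sketch a failed replica is simply skipped, so a query still reaches every slice of the index as long as each shard has at least one healthy copy. A production setup would more likely put a dedicated load balancer with health checks in front of each shard.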

We’ll also cover how to avoid pitfalls when initially loading your data, the pros and cons of putting Big Data in a public cloud, and tips for making your implementation go more smoothly.

Come to this session to learn how you can put scalable, high-performance search to work on your own Big Data.

Rod Cope

OpenLogic, Inc.

Rod Cope is the CTO and Founder of OpenLogic, a provider of Open Source support and governance solutions for the enterprise. He has over 25 years of software development experience in a wide range of industries and technologies.

Before founding OpenLogic, Rod worked for General Electric, IBM, IBM Global Services, and Anthem, and then started his own consulting company. As a consultant, he has architected solutions for Ericsson, Ford, Manugistics, Integral, Goodyear, and many other companies of all sizes.

Rod has spoken on various technical and business topics at OSCON, JavaOne, AJAXWorld, RailsConf, the Open Source Business Conference, the Next Generation Data Center conference, and other venues around the United States and Europe.

He holds both Bachelor’s and Master’s degrees in Software Engineering from the University of Louisville.

Comments

Ron Marcelle
02/04/2011 7:59am PST

Xinh, sorry for the trouble. I’ve zipped up this PowerPoint and re-uploaded it, so hopefully you can download it now.

Xinh Huynh
02/03/2011 8:55am PST

Nice talk! But, I can’t seem to download the presentation (.pptx)?

Brendan Sterne
02/03/2011 1:52am PST

A very practical explanation of OpenLogic’s architecture involving Hadoop, HBase, Solr and more – with plenty of advice for people setting up their own private cluster. Great talk!
