Dealing With Bad Data

Q McCallum (@qethanm)
Data Science, Mission City B1

The biggest problem in data science is … the data itself.

It’s messy, it’s inconsistent, it arrives from myriad
sources, and it sometimes changes without warning. Such
hurdles distract you from your intended purpose: getting
meaningful insight out of your data.

Q Ethan McCallum, consultant and author of Parallel R
(O’Reilly), will walk through the various forms of bad data
and explore common pitfalls that can derail your research
efforts. Most of all, he’ll explain ways to handle bad data
so you can get back to work.

Q McCallum


Q Ethan McCallum is a consultant, writer, and technology enthusiast, though perhaps not in that order. Most recently, he put the finishing touches on Parallel R (O’Reilly).


