Sep 23–26, 2019

Implementing ML models into production at Statistics Canada

Richard Evans (Statistics Canada)
9:30am–10:00am Tuesday, September 24, 2019
Location: 1A 06

Level

Beginner

Data Science has arrived at Statistics Canada. Richard Evans explains how it happened.

Statistics Canada (Statcan) is a branch of the federal government and is Canada’s national statistical organization. It’s responsible for producing the country’s key economic and social indicators, such as the census of population, the unemployment and inflation rates, and measures of health outcomes. The arrival of big data and AI presents undreamed-of opportunities for Statcan’s statisticians to produce vastly more accurate, detailed, and relevant statistics than ever before. Statcan had many advantages that left it well placed to seize these opportunities: a powerful legal framework, a strong information management culture and practices, an illustrious history of innovation, and deep expertise in data processing. However, its primary focus on processing surveys (i.e., “small data”) posed certain challenges.

Join Richard to see how Statcan transitioned from longstanding, large, predictable survey programs employing data processing methods that were perfected in the ’90s and ’00s to Agile teams deploying ML models that run on vastly larger structured and unstructured datasets and are often embedded into legacy data processing structures. Building the data science capacity needed to process these datasets led to innovations in many areas, including:

  • HR: How to attract, manage, motivate, and retain data science talent, and the role of branding and autonomy
  • Cultural: How to gain acceptance within the organization and develop a more nimble and responsive culture (Lean Startup culture)
  • Policy-related: Who should write mission-critical code? Who should develop models? Who should vet them?
  • Organizational: Where did such a unit belong? (The centralized/decentralized dilemma)
  • IT: Moving to open source and the cloud
  • Leadership: What kind of leader should a data science unit have? How should it be led?
  • Impact: How should results be measured? (Key performance metrics)

Innovations in all the above areas led to the successful creation of a data science team that has gone from 4 to more than 54 use cases in various stages of completion in the span of a year.

Richard Evans

Statistics Canada

Richard Evans is a 28-year veteran at Statistics Canada, Institut national de la statistique et des études économiques (Insee). He’s an expert in high-frequency economic indicators, a transformative leader, and an architect and project executive of the CPI Enhancement Initiative. Richard is passionate about using data science and AI to create user-centric data products from big data sources and is a recruiter of tomorrow’s statistical leaders.
