Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Information architecture for an enterprise data cloud

Mark Samson (Cloudera), Phillip Radley (BT)
17:25–18:05 Wednesday, 1 May 2019
Average rating: *****
(5.00, 2 ratings)

Who is this presentation for?

  • Architects

Level

Intermediate

Prerequisite knowledge

  • A high-level understanding of open source data platforms

What you'll learn

  • Understand information architecture

Description

With the range of open source data storage, analytics, and data science tools now available, it’s possible to build a modern data platform capable of storing, processing, and analyzing a wide variety of data across multiple public and private cloud platforms and on-premises data centers. However, a platform with such broad capability triggers a question: How do you organize the myriad datasets in a way that allows users to explore all the data, discover new datasets, and perform data engineering, analytics, and data science to provide value to the business?

Mark Samson and Phillip Radley answer that question by outlining an information architecture for a modern data platform, informed by working with multiple large organizations that have built such platforms over the last five years. The architecture is composed of a number of layers or zones that are designed to allow an organization to

  • Ingest data in its full fidelity, in as close to its original, raw form as possible;
  • Provide a data discovery and exploration facility for data scientists and analysts;
  • Bring together and link multiple datasets to provide a business-wide data model;
  • Create views of the data that are optimized for the access patterns generated by particular machine learning and analytics use cases.

Mark and Phillip describe the layers required in an information architecture that can provide these functions, with reference to the particular open source and cloud-based technologies that enable them.
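
The four functions listed above can be pictured as zones that datasets move through. The following is only a minimal sketch in Python: the zone names (`raw`, `discovery`, `integrated`, `optimized`), the `/data` path layout, and the strictly linear flow between zones are all assumptions made for illustration. The abstract describes what the layers do, not what they are called or how they are stored.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Zone(Enum):
    """Hypothetical zone names; the abstract names functions, not zones."""
    RAW = "raw"                # ingest in full fidelity, close to original form
    DISCOVERY = "discovery"    # exploration by data scientists and analysts
    INTEGRATED = "integrated"  # linked datasets forming a business-wide model
    OPTIMIZED = "optimized"    # views shaped for specific use cases


# Assumption: each zone is derived from exactly one upstream zone.
UPSTREAM = {
    Zone.DISCOVERY: Zone.RAW,
    Zone.INTEGRATED: Zone.DISCOVERY,
    Zone.OPTIMIZED: Zone.INTEGRATED,
}


@dataclass(frozen=True)
class Dataset:
    name: str
    zone: Zone

    def path(self, root: str = "/data") -> str:
        # Hypothetical storage layout: <root>/<zone>/<dataset name>
        return f"{root}/{self.zone.value}/{self.name}"


def lineage(zone: Zone) -> List[Zone]:
    """Return the chain of zones from RAW down to the given zone."""
    chain = [zone]
    while chain[-1] in UPSTREAM:
        chain.append(UPSTREAM[chain[-1]])
    return list(reversed(chain))
```

For example, `Dataset("customer_events", Zone.RAW).path()` yields `/data/raw/customer_events`, and `lineage(Zone.OPTIMIZED)` lists all four zones in order from raw to optimized. A real platform may well let later zones read from several earlier ones; the single-parent mapping here just keeps the sketch short.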

Mark Samson

Cloudera

Mark Samson is a principal systems engineer at Cloudera, helping customers solve their big data problems using enterprise data hubs based on Hadoop. Mark has 17 years’ experience working with big data and information management software in technical sales, service delivery, and support roles.

Phillip Radley

BT

Phillip Radley is chief data architect on the core enterprise architecture team at BT, where he’s responsible for data architecture across the company. Based at BT’s Adastral Park campus in the UK, Phill leads BT’s MDM and big data initiatives, driving associated strategic architecture and investment road maps for the business. He’s worked in IT and communications for 30 years. Previously, Phill was chief architect for infrastructure performance-management solutions ranging from UK consumer broadband to outsourced Fortune 500 networks and high-performance trading networks. He has broad global experience, including with BT’s Concert global venture in the US and five years as an Asia Pacific BSS/OSS architect based in Sydney. Phill is a physics graduate with an MBA.