Parquet Modular Encryption: Confidentiality and Integrity of Sensitive Column Data
Who is this presentation for?
Systems and security architects, CTOs, software engineers
Apache Parquet is a popular columnar format, used by many analytic frameworks for efficient storage and processing of big data. In many real-life use cases, parts of the data are highly sensitive and must be protected. The Parquet community is working on a column encryption mechanism that protects the confidentiality and integrity of sensitive Parquet data and enables access control for individual table columns. The modular design of the mechanism preserves Parquet's existing projection, predicate pushdown, encoding and compression capabilities, which are required for accelerating analytic workloads.
Today, many leading companies in the big data and cloud domains take part in the community work on this technology. The specification of Parquet modular encryption was recently completed and formally approved by the Apache Parquet PMC (project management committee).
In this talk, I will present the basics of the columnar encryption technology, its usage model and an initial integration with analytic frameworks (e.g., Apache Spark). I will show two use cases: one related to connected cars (location, speed and other sensitive data), the other to healthcare data processing (medical sensor records, managed by the increasingly popular HL7 FHIR standard). I will also describe the performance implications of applying modular encryption in analytic workloads.
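The column-level model described in this abstract can be illustrated with a small, self-contained Python sketch. It is not the Parquet implementation and does not call any Parquet API: the names `seal`, `unseal`, `write_table` and `read_columns` are hypothetical, and the cipher is a toy stand-in built only from the standard library (a SHA-256 counter keystream plus HMAC), where a real system would use a vetted authenticated cipher such as AES-GCM. The sketch only shows the idea: each sensitive column is encrypted with its own key, the column name is bound into the integrity tag, and a reader can still project unencrypted columns without holding any key.

```python
import hashlib
import hmac
import json
import os

NONCE_LEN = 12   # bytes of random nonce stored in front of each sealed column
TAG_LEN = 32     # HMAC-SHA256 tag length


def _derive(key: bytes, label: bytes) -> bytes:
    """Derive a sub-key so encryption and MAC never share the same key."""
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. Illustration only, not a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _tag(mac_key: bytes, aad: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    """Integrity tag over length-prefixed associated data, nonce and ciphertext."""
    message = len(aad).to_bytes(4, "big") + aad + nonce + ciphertext
    return hmac.new(mac_key, message, hashlib.sha256).digest()


def seal(key: bytes, plaintext: bytes, aad: bytes) -> bytes:
    """Encrypt-then-MAC; returns nonce || ciphertext || tag."""
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(_derive(key, b"enc"), nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    return nonce + ciphertext + _tag(_derive(key, b"mac"), aad, nonce, ciphertext)


def unseal(key: bytes, blob: bytes, aad: bytes) -> bytes:
    """Verify the tag first, then decrypt; raises ValueError on tampering."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        raise ValueError("sealed column is too short")
    nonce = blob[:NONCE_LEN]
    ciphertext = blob[NONCE_LEN:-TAG_LEN]
    tag = blob[-TAG_LEN:]
    expected = _tag(_derive(key, b"mac"), aad, nonce, ciphertext)
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    stream = _keystream(_derive(key, b"enc"), nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


def write_table(columns: dict, column_keys: dict) -> dict:
    """Store each column separately; only columns that have a key are sealed."""
    stored = {}
    for name, values in columns.items():
        raw = json.dumps(values).encode()
        key = column_keys.get(name)
        if key is None:
            stored[name] = (False, raw)
        else:
            # The column name is the associated data, so a sealed column
            # cannot be silently swapped for another one.
            stored[name] = (True, seal(key, raw, name.encode()))
    return stored


def read_columns(stored: dict, wanted: list, column_keys: dict) -> dict:
    """Projection: touch only the requested columns, decrypting where needed."""
    result = {}
    for name in wanted:
        encrypted, blob = stored[name]
        if encrypted:
            key = column_keys.get(name)
            if key is None:
                raise PermissionError(f"no key for column {name!r}")
            blob = unseal(key, blob, name.encode())
        result[name] = json.loads(blob.decode())
    return result
```

In a connected-car setting, for instance, a table with `car_id`, `location` and `speed` columns could be written with a key for `location` only. A reader without that key can still project `car_id` and `speed`, while a request for `location` fails, and any modification of the stored `location` bytes is detected when the column is opened.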
Prerequisite knowledge
Basic understanding of big data
What you'll learn
Gidon is a lead architect at the IBM Research – Haifa Laboratory. He works on secure cloud analytics, data-at-rest and data-in-use encryption, and attestation of trusted computing enclaves. Currently, Gidon plays a leading role in the Apache Parquet community work on protecting sensitive data in analytic workloads. Gidon completed a Ph.D. at the Weizmann Institute of Science in Israel and was a postdoctoral fellow at Columbia University, NYC.
View a complete list of Strata Data Conference contacts