Feature engineering with Spark NLP to accelerate clinical trial recruitment
Who is this presentation for?
- Data scientists, machine learning engineers, and engineering leaders
Description
Recruiting patients for clinical trials is a major challenge in drug development. Finding eligible patients requires an in-depth understanding of their medical histories and current health status, but the majority of patient data is unstructured and spread across physician notes and pathology, imaging, genomic, and other reports. As a result, clinical trial recruitment is a slow, manual process.
Saif Addin Ellafi and Scott Hoch dive into a case study of how Deep 6 uses Spark NLP, a natural language processing (NLP) library for Apache Spark, to apply state-of-the-art deep learning to accurately extract relevant clinical facts from unstructured text. These facts then feed downstream data science pipelines that construct patients’ medical histories.
John Snow Labs’s NLP library for Apache Spark is an open source library that provides natural language understanding capabilities with state-of-the-art accuracy, performance, and scale. It provides deep learning-based NLP algorithms for named entity recognition, spell checking, sentiment analysis, assertion status detection, entity resolution, optical character recognition (OCR), and sentence segmentation, and it enables highly efficient training of domain-specific machine learning and deep learning NLP models.
Saif and Scott explain how Deep 6 uses Spark NLP to scale its training and inference pipelines to millions of patients while achieving state-of-the-art accuracy. They explore the technical challenges, the architecture of the full solution, and the lessons Deep 6 learned that you can directly apply to your next natural language understanding project.
Prerequisite knowledge
- Familiarity with NLP, Spark, and machine learning
What you'll learn
- Discover lessons learned and recommendations for achieving state-of-the-art NLP accuracy, performance, and scale in a real-life application
Saif Addin Ellafi
John Snow Labs
Saif Addin Ellafi is a software developer at John Snow Labs, where he’s the main contributor to Spark NLP. A data scientist, forever student, and extreme sports and gaming enthusiast, Saif has extensive experience in problem solving and quality assurance in the banking and finance industry.
Scott Hoch
BlackBox Engineering
Scott Hoch is the founder of Blackbox Engineering.