The National Cancer Institute (NCI) Cloud Resources, formerly the NCI Cancer Genomics Cloud Pilots, were developed to democratize access to NCI-generated cancer genomic data and to facilitate analysis by co-locating petabyte-scale data with cloud computing resources. Built on commercial cloud architectures, the Cloud Resources give users the flexibility and reproducibility of running tools packaged as Docker containers, and tools can be joined into workflows described in Common Workflow Language (CWL) or Workflow Description Language (WDL). In addition, two of the Cloud Resources support interactive analysis with Jupyter notebooks as an integrated platform feature.

The Broad Institute’s FireCloud, built on Google Cloud Platform (GCP), integrates Jupyter Notebooks into workspaces, the shareable computational sandboxes in which researchers organize and store their genetic datasets and run analysis workflows. With the addition of Notebooks, researchers can perform tertiary analysis on data stored in workspaces or in any FireCloud-managed GCP resource without additional authentication. A Python FireCloud client (FISS) can be used to access workspaces and other FireCloud objects.

On the Seven Bridges Cancer Genomics Cloud (CGC), researchers can use the JupyterLab environment for custom scripting in R, Python, and Julia through an interactive analysis feature called Data Cruncher. Data Cruncher is accessible through custom workspaces on the CGC, where researchers can organize files, run complementary analyses on AWS using both Dockerized tools and Data Cruncher, and share data, tools, and notebooks with collaborators. Both Data Cruncher notebooks and Dockerized tools can be used to collaboratively explore and mine data that are publicly available through the CGC, including multi-omic datasets from the TCGA (The Cancer Genome Atlas) and TARGET (Therapeutically Applicable Research To Generate Effective Treatments) initiatives, as well as private data uploaded or generated by researchers.

Through the Cloud Resources and their Jupyter Notebook and JupyterLab features, users can integrate interactive, exploratory analysis with other types of cancer analysis pipelines.
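To make the idea of joining containerized tools into a CWL workflow more concrete, here is a minimal sketch in Python. It builds a two-step CWL v1.0 workflow document as a plain dictionary (CWL documents may be written in JSON as well as YAML) and derives the step order from the data dependencies. The tool files `align.cwl` and `call.cwl`, the step names, and the port names are hypothetical placeholders, and the helper functions are illustrative only; this is not how any of the Cloud Resources builds or schedules workflows.

```python
import json


def make_step(tool_path, inputs, outputs):
    """Describe one workflow step: the tool to run, its input sources, its output ports."""
    return {"run": tool_path, "in": inputs, "out": outputs}


def build_workflow():
    """Return a two-step CWL v1.0 workflow as a dict (all names are hypothetical)."""
    return {
        "cwlVersion": "v1.0",
        "class": "Workflow",
        "inputs": {"reads": "File"},
        "outputs": {
            "variants": {"type": "File", "outputSource": "call/vcf"},
        },
        "steps": {
            # "align" consumes the workflow-level input "reads".
            "align": make_step("align.cwl", {"reads": "reads"}, ["bam"]),
            # "call" consumes the "bam" output of the "align" step.
            "call": make_step("call.cwl", {"bam": "align/bam"}, ["vcf"]),
        },
    }


def step_order(workflow):
    """Order steps so that each one comes after the steps it reads from."""
    steps = workflow["steps"]
    # A source written as "step/port" marks a dependency on that step.
    deps = {
        name: {src.split("/")[0] for src in step["in"].values() if "/" in src}
        for name, step in steps.items()
    }
    ordered = []
    while len(ordered) < len(steps):
        ready = sorted(
            name for name in steps
            if name not in ordered and deps[name] <= set(ordered)
        )
        if not ready:
            raise ValueError("workflow steps contain a dependency cycle")
        ordered.extend(ready)
    return ordered


if __name__ == "__main__":
    workflow = build_workflow()
    print(json.dumps(workflow, indent=2))
    print(step_order(workflow))
```

In a real workflow, each file referenced under `run` would be a CWL `CommandLineTool` description, which is also where a Docker image is typically named so that the step runs in a container.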