PCaMiner - Prototyping the harmonization and the integrated mining of prostate cancer RNA sequencing data sets
Next-generation sequencing (NGS) has revolutionized biomedicine by enabling us to shed light on disease-specific molecular alterations. Because raw NGS data are frequently protected and their analysis is labor-intensive, dedicated web portals have emerged that enable integrated mining of various DNA data sets (e.g., cBioPortal). In contrast, the cross-study comparison of RNA sequencing data is significantly more challenging because of the artifacts introduced by different analysis pipelines, referred to as “batch effects”. We recently overcame these hurdles and generated a harmonized prostate cancer RNA sequencing atlas from different data sets spanning early primary to late-stage metastatic disease. This enabled us to describe the roadmap to tumor progression in an unprecedented, quantitative manner.
The goal of this project is to update the atlas with newly released data sets and establish a user-friendly web tool for researchers and clinicians in academia and health care to accurately investigate gene expression changes related to prostate cancer progression and therapy resistance in a customized and multidimensional manner. Researchers will use the data to identify drug targets to prevent disease progression or develop biomarkers related to prognosis or drug response. Clinicians may use the tool to determine specific therapeutic opportunities for their patients.
As such, the harmonized RNA sequencing atlas and the related web tool will extract relevant information from various protected data sets and make it findable, accessible, interoperable, and reusable (FAIR Principles). More broadly, the deliverables may establish a blueprint for RNA sequencing Open Data Research in other cancer types or, more generally, in other diseases.