Essential regulatory elements of high-level amplicons

Junior Clinician Scientist Project

Genomic amplification is the most common genetic gain-of-function variant in cancer. How the additional oncogene copies stay active in a regulatory context different from that of the original copy remains unclear. We hypothesize that they exploit either tissue-specific or oncogene-specific enhancers and transcription factors (TFs). Discovering the factors that drive oncogenic transcription has the potential to reveal cancer-specific dependencies. Indeed, the TFs that bind enhancers on high-level MYCN amplicons in neuroblastoma are part of its core regulatory circuit and represent lineage-specific growth dependencies. However, which TFs bind the enhancers on oncogene amplicons in other tumor types remains unknown. We will therefore test the following specific hypotheses in breast and lung cancer samples:

(I) Enhancers on high-level amplicons are cancer-type-specific or oncogene-specific. To test this hypothesis, we will compare the enhancers that correlate with oncogene expression between amplicons of the same oncogene in different cancer types and between amplicons of different oncogenes in the same cancer type.

(II) The enhancers contain tissue-specific or oncogene-specific TF binding motifs that enable oncogene expression. We will scan the enhancer peaks for known TF binding motifs and test for motif enrichment by cancer type and by oncogene.

(III) These TFs and their binding to amplicons are necessary for tumor growth. We will test the TFs for lineage-specific dependency in existing CRISPR knockout data, generating hypotheses for subsequent validation.
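
As an illustration of the enrichment test named in hypothesis (II), the following minimal sketch computes a one-sided hypergeometric p-value for a single motif in one cancer type. It is not the project's actual workflow: the function names `hypergeom_enrichment_p` and `motif_enrichment`, the input layout, and the toy peak list are assumptions made for this example, and a real analysis would also need multiple-testing correction across motifs.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided p-value P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: all enhancer peaks, K: peaks containing the motif,
    n: peaks in the group of interest, k: group peaks containing the motif.
    """
    upper = min(n, K)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible combinations drop out of the sum automatically.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def motif_enrichment(peaks, group):
    """peaks: list of (group_label, has_motif) tuples (hypothetical layout)."""
    N = len(peaks)
    K = sum(1 for _, hit in peaks if hit)
    n = sum(1 for label, _ in peaks if label == group)
    k = sum(1 for label, hit in peaks if label == group and hit)
    return hypergeom_enrichment_p(k, n, K, N)


# Toy, made-up data: 5 breast peaks (4 with the motif), 15 lung peaks (4 with it).
toy_peaks = (
    [("breast", True)] * 4
    + [("breast", False)] * 1
    + [("lung", True)] * 4
    + [("lung", False)] * 11
)

if __name__ == "__main__":
    print(motif_enrichment(toy_peaks, "breast"))
```

The same function could be applied per oncogene by using the amplified oncogene as the group label instead of the cancer type.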

We will integrate the largest publicly available datasets of whole-genome sequencing (WGS), RNA-seq, ATAC-seq/ChIP-seq, and standardized genome-scale CRISPR screens. Our findings will be a valuable resource for understanding the regulation of oncogenes in breast and lung cancer, and our workflows can serve as templates to extend this analysis to other cancer types or new datasets. Our approach of focusing on the regulatory units of high-level oncogene amplicons, instead of comparing genome-scale epigenetic profiles, has only become possible with the recent availability of large WGS cohorts. Public data from genome-scale CRISPR screens across cell lines from multiple lineages now also allow us to test whether the TFs we identify in tumor samples show growth phenotypes when knocked out in cell lines. Using this integrated approach, we aim to identify cancer-specific vulnerabilities that arise from the regulatory machinery necessary to drive oncogene expression.