Title: Linking aberrant chromatin features in chronic lymphocytic leukemia to transcription factor networks
Abstract: Article, 22 May 2019. Open Access. Transparent process.

Author Information

Jan-Philipp Mallm1,‡, Murat Iskar2,‡, Naveed Ishaque3,15,‡, Lara C Klett1,4, Sabrina J Kugler5,6, Jose M Muino7,16, Vladimir B Teif8, Alexandra M Poos1,4,9,10, Sebastian Großmann1,17, Fabian Erdel1,11, Daniele Tavernari1,18, Sandra D Koser12, Sabrina Schumacher1, Benedikt Brors12, Rainer König9,10, Daniel Remondini13, Martin Vingron7, Stephan Stilgenbauer6,19, Peter Lichter2,14, Marc Zapatka2, Daniel Mertens *,5,6 and Karsten Rippe *,1

1 Division of Chromatin Networks, German Cancer Research Center (DKFZ) and Bioquant, Heidelberg, Germany
2 Division of Molecular Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
3 Division of Theoretical Bioinformatics and Heidelberg Center for Personalized Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
4 Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
5 Mechanisms of Leukemogenesis, German Cancer Research Center (DKFZ), Heidelberg, Germany
6 Department of Internal Medicine III, University Hospital Ulm, Ulm, Germany
7 Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
8 School of Biological Sciences, University of Essex, Colchester, UK
9 Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Jena, Germany
10 Network Modeling, Leibniz Institute for Natural Product Research and Infection Biology—Hans Knöll Institute Jena, Jena, Germany
11 Centre de Biologie Intégrative (CBI), CNRS, UPS, Toulouse, France
12 Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
13 Department of Physics and Astronomy, Bologna University, Bologna, Italy
14 German Cancer Consortium (DKTK), Heidelberg, Germany
15 Present address: Center for Digital Health and Charité—Universitätsmedizin Berlin, Berlin, Germany
16 Present address: Institute for Biology, Systems Biology of Gene Regulation, Humboldt-Universität zu Berlin, Berlin, Germany
17 Present address: Wellcome Trust Sanger Institute, Cambridge, UK
18 Present address: Department of Computational Biology, University of Lausanne (UNIL), Lausanne, Switzerland
19 Present address: Klinik für Innere Medizin I, Universitätsklinikum des Saarlandes, Homburg, Germany
‡ These authors contributed equally to this work
Daniel Mertens and Karsten Rippe contributed equally to this work as senior authors
* Corresponding author. Tel: +49-731-50045870; E-mail: [email protected]
* Corresponding author. Tel: +49-6221-5451450; E-mail: [email protected]

ORCID: Jose M Muino, orcid.org/0000-0002-6403-7262; Vladimir B Teif, orcid.org/0000-0002-5931-7534; Fabian Erdel, orcid.org/0000-0003-2888-7777; Daniele Tavernari, orcid.org/0000-0001-5981-7363; Daniel Remondini, orcid.org/0000-0003-3185-7456; Marc Zapatka, orcid.org/0000-0001-8287-5967; Daniel Mertens, orcid.org/0000-0003-0227-7188; Karsten Rippe, orcid.org/0000-0001-9951-9395

Molecular Systems Biology (2019) 15: e8339. https://doi.org/10.15252/msb.20188339

Abstract

In chronic lymphocytic leukemia (CLL), a diverse set of genetic mutations is embedded in a deregulated epigenetic landscape that drives cancerogenesis. To elucidate the role of aberrant chromatin features, we mapped DNA methylation, seven histone modifications, nucleosome positions, chromatin accessibility, binding of EBF1 and CTCF, as well as the transcriptome of B cells from CLL patients and healthy donors. A globally increased histone deacetylase activity was detected and half of the genome comprised transcriptionally downregulated partially DNA methylated domains demarcated by CTCF. CLL samples displayed a H3K4me3 redistribution and nucleosome gain at promoters as well as changes of enhancer activity and enhancer linkage to target genes. A DNA binding motif analysis identified transcription factors that gained or lost binding in CLL at sites with aberrant chromatin features. These findings were integrated into a gene regulatory enhancer-containing network enriched for B-cell receptor signaling pathway components.
Our study predicts novel molecular links to targets of CLL therapies and provides a valuable resource for further studies on the epigenetic contribution to the disease.

Synopsis

Transcriptome profiling and genome-scale mapping of DNA methylation, seven histone modifications, nucleosome positions, chromatin accessibility, EBF1 and CTCF binding are performed in B cells from CLL patients and healthy donors.

Altered chromatin features were detected at 80% of the differentially regulated genes in CLL and histone deacetylase activity was globally increased. Half of the CLL genome comprised partially DNA methylated domains that were transcriptionally downregulated, demarcated by CTCF and enriched for H3K9me3 and H3K27me3. H3K4me3 was redistributed at CLL promoters, including loss of bivalent H3K4me3/H3K27me3 states, and substantial changes of enhancer activity were detected. A gene regulatory network including enhancers was constructed around the transcription factors targeting 15 central binding motifs that were associated with aberrant CLL chromatin features. Genes involved in BCR signaling were enriched in the network.

Introduction

Genomic sequence analysis has identified a comprehensive set of leukemogenic candidate genes in chronic lymphocytic leukemia (CLL; Martin-Subero et al, 2013; Landau et al, 2015; Puente et al, 2015). However, how these genetic changes drive the cellular and clinical pathophenotype of the disease is currently an open question (Zenz et al, 2010; Kipps et al, 2017). The complex molecular pathogenesis of CLL involves microenvironmental stimulation via aberrant signaling including the B-cell receptor (BCR), NF-κB, IL-4, and TLR pathways, among others (Abrisqueta et al, 2009; Zenz et al, 2010; Hallek, 2015; Stilgenbauer, 2015; Kipps et al, 2017).
The relevance of BCR signaling in CLL is underlined by the clinical success of BCR signaling inhibitors like ibrutinib (Byrd et al, 2013; Burger et al, 2015) and idelalisib (Furman et al, 2014), and by the prognostic impact of somatic hypermutations and the gene usage of the immunoglobulin itself (Zenz et al, 2010; Duhren-von Minden et al, 2012). Remarkably, apart from a biased usage of the immunoglobulin genes and mutations in the BCR complex in a specific small subset of CLL patients, there are no recurrent genetic mutations within the components of the BCR signaling cascade. Rather, CLL cells display a massive global transcriptional deregulation that affects intracellular pathways and microenvironmental signaling toward cellular survival (Burger & Chiorazzi, 2013). Thus, it appears that a diverse set of genetic lesions conspires with epigenetic aberrations to drive cancerogenesis in a manner that is only partially understood.

The relevance of deregulated epigenetic signaling for CLL is apparent from a number of findings. Epigenetic aberrations in a mouse model of CLL are among the earliest detectable modifications (Chen et al, 2009), and the loss of tumor suppression in 13q14.3 involves transcriptional deregulation by an epimutation (Mertens et al, 2006). Genome-wide DNA hypomethylation was recognized early in CLL cells (Wahlfors et al, 1992; Lyko et al, 2004), and more recently, a strong correlation with transcriptional activity was observed (Kulis et al, 2015). The DNA methylation status is a surrogate marker both for CLL patient subgroups that overexpress the ZAP70 kinase and for the mutational status of the BCR-immunoglobulin genes, two features that allow prognostic dichotomization of CLL into more or less aggressive cases (Cahill et al, 2013; Claus et al, 2014). The epigenetic subtypes of CLL defined by the DNA methylome may become important for patient stratification as they are of prognostic relevance (Queiros et al, 2015).
These epigenetic subtypes are correlated with the two genetically defined subgroups of CLL that express a non-mutated or mutated immunoglobulin heavy-chain variable region gene (IGHV) and reflect the tumor cell of origin in an epigenetic continuum of B-cell development (Kulis et al, 2015; Oakes et al, 2016).

Here, we conducted a comprehensive characterization of the chromatin landscape in primary CLL cells. Our analysis revealed that the massive changes in the CLL-specific transcriptome can be linked to deregulated chromatin features and activity changes of a transcription factor (TF) network downstream of microenvironmental signaling cascades. Our comprehensive data set represents a rich resource for studying gene regulation and epigenomics in CLL. We exploited it to integrate chromatin features and TF binding with gene expression programs in CLL B cells and suggest molecular mechanisms for the aberrant survival of malignant CLL cells.

Results

Aberrant chromatin features identified in CLL

In order to characterize CLL chromatin modifications in correlation with transcriptional activity, we analyzed the chromatin landscape and the transcriptome of CD19+ B cells from peripheral blood from 23 CLL patients and from 17 pools of non-malignant B cells (NBCs) of healthy donors (Figs 1 and EV1, Appendix Fig S1 and Table S1, Datasets EV1 and EV2). While a number of pathophysiological processes such as microenvironmental signaling occur in secondary lymphoid organs of CLL patients (Burger & Gribben, 2014), the comprehensive analysis of different epigenetic layers required the acquisition of sufficient numbers of CD19+ B cells and was therefore conducted from peripheral blood. CLL patients were selected to assess the fundamental changes in the original, untreated, and non-evolved disease including both disease subtypes of IGHV mutated and non-mutated samples. NBC pools were from age-matched healthy donors.
Based on the genome-wide DNA methylation profiles, CLL samples could be assigned to B-cell maturation stages as shown previously (Kulis et al, 2015; Oakes et al, 2016; Fig EV1A). These developmental changes of epigenetic signals were excluded here for the identification of differentially methylated regions (DMRs) between CLL cells and NBCs.

Figure 1. Chromatin feature annotation, open regions, and gene regulation
(A) Chromatin features mapped here displayed differences between CLL patients and NBCs from healthy donors. As an example, the TCF4 locus is shown for CLL1 and NBC donor H7 samples. The TCF4 gene encodes a transcription factor from the E protein family. Based on the increased H3K4me1, H3K27ac, and ATAC signal, two predicted enhancer loci were marked that became active in CLL. Note that the y-axis for RNA-seq is scaled differently for CLL (8,000) and NBCs (100) to visualize that the TCF4 gene was not completely silenced but lowly expressed also in NBCs, as is also evident from the H3K36me3 mark. Light gray depicts active chromatin region and dark gray the confined enhancer locus coinciding with an open chromatin region. The chromatin state annotation is described in panel (B).
(B) Chromatin segmentation of co-occurring histone modifications by ChromHMM yielding a model with 12 chromatin states. The indicated emission parameters for the contributions of individual histone marks and the average amount of each state (Mb) for CLL and NBC samples are given.
(C) Chord diagram representation of genome-wide chromatin state changes between CLL and NBC. The amount of chromatin change is proportional to the size of the segments with each tick representing 4 Mb of chromatin. Color coding of chromatin states as in panel (B).
(D) Distribution of ~ 24,400 annotated differentially accessible regions (ATAC-seq) in CLL compared to NBC samples ("CLL diff.") according to the chromatin state annotation. In total, 7,605 regions gained an ATAC-seq signal in CLL, while it was lost at 16,790 loci.
(E) Part of the computed B-cell gene regulatory network showing TCF4 and its deregulated target genes as well as some of the adjacent nodes. The GRN was used to calculate the activity of regulators like TCF4 based on their target gene expression. Color code: TFs, red; target genes, gray; chromatin modifier, blue.

Figure EV1. Data set overview
(A) Assignment of CLL samples to B-cell developmental stages based on DNA methylation patterns.
(B) Unsupervised hierarchical clustering of the samples from Pearson's correlation coefficient (average linkage) for DNA methylation from WGBS. The analysis was carried out considering the most variable 1 million CpG sites.
(C) Same as panel (B) but computed from the gene expression profiles of 2,000 genes from RNA-seq.
(D) Same as panel (B) but computed for histone modifications at promoters from ChIP-seq. Samples cluster according to the modifications, underlining specificity of the experimental data, and separate inactive (H3K9me3 and H3K27me3) from the other active histone marks.
(E) Top: Unsupervised hierarchical clustering of the samples from Spearman's correlation coefficient (average linkage) for chromatin accessibility from ATAC-seq calculated from ~ 120,000 accessible regions. Bottom: Distribution of fold changes in ATAC-seq signal from DiffBind between CLL and NBC samples. The data were fitted to a sum of three Gaussian functions. Threshold values were determined from the indicated cross-over points as described in Materials and Methods.
(F) Exemplary comparison of ATAC-seq data of all analyzed CLL IGHV mutated (n = 11), CLL IGHV unmutated (n = 8), and NBC (n = 7) samples in replicates (except for H10, H12, and H13) at the EBF1 locus. It contains regions with lost ATAC-seq signal in CLL as compared to NBC controls determined by the DiffBind analysis (red bars in bottom track "Difference"). Representative ChromHMM state annotations of CLL1 and H6 are depicted as color bars above the corresponding group.

The different chromatin features we mapped are depicted at the transcription factor 4 (TCF4) locus as an example for a gene upregulated in CLL (Fig 1A). The readouts include DNA methylation by whole-genome bisulfite sequencing (WGBS), histone chromatin immunoprecipitation (ChIP-seq) of H3K4me1, H3K4me3, H3K9me3, H3K9ac, H3K27me3, H3K27ac, and H3K36me3, nucleosome occupancy from high-coverage MNase digestion followed by H3 ChIP-seq, and open chromatin sites identified by the assay for transposase-accessible chromatin (ATAC-seq). For selected samples, ChIP-seq of EBF1 and CTCF was also performed. In addition, RNA transcription was analyzed by strand-specific RNA-seq of long and short RNAs (Appendix Fig S1A). The added value of this comprehensive analysis is apparent from inspection of the TCF4 gene. The histone modifications predict downstream enhancers and intronic enhancers that become activated in CLL cells as judged from the enrichment of H3K4me1 and H3K27ac (Fig 1A). The predicted enhancer loci in this region were particularly extended (> 10 kb) and are therefore an example for so-called "super-enhancers" (SEs, see below; Whyte et al, 2013).

In order to systematically evaluate histone modification changes, we annotated chromatin with a 12-state ChromHMM Hidden Markov model (Fig 1B). Chromatin states differed substantially between CLL samples and NBCs and showed transitions for repressive chromatin states 4, 5, and 6 (H3K9me3, H3K27me3) and potential enhancer states 1, 8, 9, and 11 (carrying H3K27ac and/or H3K4me1 while lacking H3K4me3; Fig 1B and C, Datasets EV4, EV6-EV7).

To link changes of chromatin features with TF binding, we identified accessible chromatin with ATAC-seq. The method detects TF binding by mapping open and bona fide active chromatin regions that are depleted of nucleosomes.
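The legend of Figure EV1E states that thresholds for differential accessibility were determined from cross-over points of Gaussian functions fitted to the DiffBind fold-change distribution. As a minimal sketch of that idea only, assuming the component parameters have already been fitted, the cross-over between two adjacent weighted Gaussians can be obtained in closed form. The function names and all weights, means, and widths below are hypothetical illustration values and are not taken from the study.

```python
import math


def weighted_pdf(x, w, mu, s):
    """Density of a Gaussian N(mu, s) scaled by the mixture weight w."""
    return w / (s * math.sqrt(2 * math.pi)) * math.exp(-((x - mu) ** 2) / (2 * s**2))


def gaussian_crossover(w1, mu1, s1, w2, mu2, s2):
    """Return an x between mu1 and mu2 where the two weighted densities are equal.

    Hypothetical helper: equating the log densities gives a*x^2 + b*x + c = 0.
    """
    a = 1.0 / (2 * s2**2) - 1.0 / (2 * s1**2)
    b = mu1 / s1**2 - mu2 / s2**2
    c = (mu2**2 / (2 * s2**2) - mu1**2 / (2 * s1**2)
         + math.log((w1 * s2) / (w2 * s1)))
    if abs(a) < 1e-12:
        # Equal widths: the equation is linear.
        if abs(b) < 1e-12:
            raise ValueError("components have identical mean and width")
        roots = [-c / b]
    else:
        disc = b * b - 4 * a * c
        if disc < 0:
            raise ValueError("components do not cross")
        r = math.sqrt(disc)
        roots = [(-b + r) / (2 * a), (-b - r) / (2 * a)]
    lo, hi = sorted((mu1, mu2))
    inside = [x for x in roots if lo <= x <= hi]
    if not inside:
        raise ValueError("no cross-over between the component means")
    return inside[0]


# Hypothetical (weight, mean, sd) of three components on a log2 fold-change axis.
lost = (0.3, -1.5, 0.5)
unchanged = (0.5, 0.0, 0.4)
gained = (0.2, 1.2, 0.5)

lower = gaussian_crossover(*lost, *unchanged)    # threshold for "lost" regions
upper = gaussian_crossover(*unchanged, *gained)  # threshold for "gained" regions
print(f"lost if log2FC < {lower:.3f}, gained if log2FC > {upper:.3f}")
```

Regions with a fold change below `lower` or above `upper` would then be assigned to the lost or gained component; the actual procedure is the one described in the Materials and Methods of the article.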
The differentially accessible regions in CLL patients and NBCs comprised 38,072 loci of which ~ 24,400 loci were located at the transcription start site (TSS), regions of transcription, and active or repressed regions (Figs 1D and EV1E). Loss of ATAC signal in repressed regions points to a more heterochromatic conformation in CLL, while at active chromatin regions, it might indicate a reduced promoter/enhancer activity. IGHV mutated and non-mutated CLL can be distinguished according to the ATAC-seq profiles (Rendeiro et al, 2016). However, only ~ 1% of the differential ATAC-seq peaks identified here between CLL and NBCs were related to the heterogeneity of IGHV mutated and non-mutated CLL samples. This finding is illustrated for the EBF1 TF locus in Fig EV1F.

The changes of the chromatin landscape were linked to the deregulated activity of TFs and chromatin modifiers in CLL according to the workflow depicted in Appendix Fig S1B. A B-cell-specific gene regulatory network (GRN) was constructed with the ARACNE framework (Alvarez et al, 2016). The GRN served as the backbone to integrate TFs and deregulated epigenetic signaling and comprised 2,804 regulators with a median value of 45 target genes. It was also used to compute the activity of TFs and chromatin modifiers from their target gene expression with our RNA-seq data. In total, 1,378 regulators displayed a differential activity between the CLL and NBC states (P < 0.05). As an example, TCF4 and selected deregulated target genes are shown in Fig 1E.

Large repressive partially DNA methylated domains

When comparing DNA methylation in CLL with NBC controls, we found a global hypomethylation in CLL as previously reported (Wahlfors et al, 1992; Lyko et al, 2004; Kulis et al, 2012). It was predominantly due to the formation of large partially methylated domains (PMDs; Figs 2A and EV2A, Appendix Fig S2A–D, Dataset EV3).
Remarkably, the CLL DNA methylome contained a strikingly large genome fraction of ~ 50% PMDs in comparison with NBCs (< 1%; Fig 2A) with a significant overlap to PMDs previously identified for other tissues and cancer entities (Fig EV2B). The inter- and intra-sample variability of DNA methylation in CLL cells compared to NBC controls was high (P = 0.005, Wilcoxon rank-sum test, Appendix Fig S2C) and CLL cells harbored an increased fraction of intermediate DNA methylation within PMDs (P = 1.6E-4, Wilcoxon rank-sum test, Appendix Fig S2C). PMDs were enriched for lowly expressed and downregulated genes (Fig 2B, P = 2E-48, Fisher's exact test, Fig 2C), which can be rationalized by increased levels of repressive H3K9me3 and H3K27me3 histone marks (Fig 2D and Appendix Fig S2E). Regions with reduced transcriptional activity like the "B compartment" determined by Hi-C chromosome conformation capture (Fortin & Hansen, 2015) as well as lamina-associated domains (Guelen et al, 2008) were overrepresented in PMDs (Fig EV2C). In addition, active states (Appendix Fig S2F) and the H3K36me3 active transcription mark (Fig 2D) were depleted in PMDs, which were flanked by open chromatin (Fig EV2D). Our CTCF ChIP-seq data revealed an enrichment of CTCF binding at PMD boundaries, pointing to a potential function of CTCF to demarcate these regions and possibly limiting their further expansion (Fig 2D). Of note, the majority (75%) of somatic mutations in CLLs were located in the PMDs identified here (Fig 2E), consistent with the increased mutation rates in heterochromatin regions (Schuster-Bockler & Lehner, 2012). On the level of local meC changes, we identified 8,671 differentially methylated regions (DMRs) of which 8,669 were hypomethylated in CLL (Fig EV2E). In total, 7,932 DMRs (91%) overlapped in CLL with predicted enhancer chromatin states (1, 8, 9, and 11; Appendix Fig S2G). 
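The overlap statistic quoted above (7,932 of 8,671 DMRs, 91%) is the kind of number obtained by intersecting two sets of genomic intervals. The following is a minimal sketch of such an intersection, assuming half-open (chromosome, start, end) coordinates; the function names and the toy intervals are hypothetical and do not reproduce the study's data or pipeline.

```python
from bisect import bisect_right


def merge_by_chrom(intervals):
    """Group (chrom, start, end) intervals by chromosome and merge overlaps."""
    grouped = {}
    for chrom, start, end in intervals:
        grouped.setdefault(chrom, []).append((start, end))
    merged = {}
    for chrom, ivs in grouped.items():
        ivs.sort()
        out = [list(ivs[0])]
        for start, end in ivs[1:]:
            if start <= out[-1][1]:
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = out
    return merged


def fraction_overlapping(queries, targets):
    """Fraction of query intervals that overlap at least one target interval."""
    merged = merge_by_chrom(targets)
    ends = {chrom: [e for _, e in ivs] for chrom, ivs in merged.items()}
    hits = 0
    for chrom, start, end in queries:
        ivs = merged.get(chrom, [])
        # First merged target whose end lies beyond the query start.
        i = bisect_right(ends.get(chrom, []), start)
        if i < len(ivs) and ivs[i][0] < end:
            hits += 1
    return hits / len(queries) if queries else 0.0


# Toy intervals (hypothetical), standing in for DMRs and enhancer-state segments.
dmrs = [("chr1", 190, 210), ("chr1", 300, 310), ("chr2", 40, 60), ("chr3", 0, 10)]
enhancers = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 0, 50)]
print(fraction_overlapping(dmrs, enhancers))

# The percentage reported in the text follows from the two counts it gives.
print(round(100 * 7932 / 8671))
```

Merging the target intervals first keeps them sorted and non-overlapping, so a single binary search per query is sufficient.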
Open chromatin regions within these DMRs as detected by ATAC-seq were enriched in binding motifs for NFATC1, EGR, and E2A (Fig EV2F).

Figure 2. Large partially methylated domains identified in CLL
Left, example of a large PMD on chromosome 2 derived from a consensus of CLL samples (n = 11). Right, genome-wide quantification of PMDs across CLL samples (n = 11) and NBCs (n = 6). The PMDs mapped with this set of 11 CLL samples were used for further analysis in figure panels (A–E) in combination with the