File 1). For example, 29 of liver enhancer predictions overlap chromatin marks and ChromHMM enhancer predictions for the HepG2 hepatocellular carcinoma cell line, providing additional evidence for the tissue-specificity of the activity of the predicted enhancers (Figures 2C,D; Figure S8 in Additional file 1). Substantial overlap with DHSs is also observed for other classifiers (Figure S9 in Additional file 1). Finally, we found that enhancer predictions are significantly enriched in matching p300 embryonic brain, limb, and heart enhancers (2.5-fold enrichment; P-value <0.001, computed from 1,000 randomized sequences genome-wide [45,52]). Taken together, these observations are consistent with our promoter-based models being able to predict enhancers that drive specific expression of neighboring genes in different tissues.

Experimental assays validate tissue-specific activity of promoter-based enhancer predictions

The most reliable evidence for the accuracy of our promoter-based models in predicting tissue-specific enhancers is the experimental verification of the predicted enhancers' regulatory activity in vivo. Encouraged by the consistent results of the computational analysis, we chose to validate a subset of liver enhancer predictions in the loci of highly expressed liver genes using a mouse liver reporter assay [53,54]. We selected, as described in detail below, 12 of the approximately 400 regions with predicted liver enhancer activity (Table 2) and regions with no predicted activity as controls (Table 3) for functional testing. Importantly, we tried to ensure that the enhancer predictions tested were not significantly different from the whole set of predictions, and chose controls exclusively based on their score. Thus, differences between enhancer predictions and controls observed for other sequence properties simply reflect an association between high scores and the existence of functional constraints, rather than bias in the selection of the sequences. Liver enhancer predictions selected for validation had an average score of 1.79, and were distributed across the complete range of scores (Figure S10A in Additional file 1).

Taher et al. Genome Biology 2013, 14:R117 http://genomebiology.com/2013/14/10/R

Table 2 In vivo assay of 12 liver enhancer predictions in mouse

| ID | Coordinates [hg18] | Score | Location | Activity | Chromatin state (a) |
|---|---|---|---|---|---|
| **E1** | Chr16:30009197-30009301 | 3.07 | Intronic (TBX6) | Yes | No |
| E7 | Chr10:82023332-82023434 | 2.60 | Intergenic (3′ UTR of MAT1A) | No | Yes |
| E2 | Chr17:69962796-69962894 | 2.21 | Intergenic (4.5 kb downstream of GPRC5C) | No | No |
| **E8** | Chr1:31679030-31679149 | 1.94 | Intronic (SERINC2) | Yes | Yes |
| **E12** | Chr3:134934911-134935091 | 1.81 | Intronic (TF) | Yes | Yes |
| **E4** | Chr11:72138832-72139119 | 1.83 | Intronic (ARAP1) | Yes | No |
| **E5** | Chr17:69957921-69958023 | 1.30 | Intergenic (3′ UTR of GPRC5C) | Yes | No |
| **E10** | Chr17:69951076-69951329 | 1.57 | Intronic (GPRC5C) | Yes | Yes |
| E3 | Chr11:72162942-72163179 | 1.47 | Intronic (STARD10) | No | No |
| E9 | Chr11:72168912-72169046 | 1.33 | Intronic (STARD10) | No | Yes |
| **E11** | Chr11:72166225-72166509 | 1.36 | Intronic (STARD10) | Yes | Yes |
| E | Chr17:17439720- | 1. | Intergenic (4 kb upstream of PEMT) | No | No |

(a) Overlaps with ‘strong enhancer’ Chromatin State Segmentation by HMM from Broad Institute, MIT, and MGH in HepG2 cell lines. Enhancer predictions for which we observed in vivo activity in mouse liver are highlighted in bold.
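The text does not say which statistical test was used to check that the validated predictions resemble the whole prediction set. As an illustration only, the following sketch shows one common way such a check could be done: a resampling test that compares the mean score of the selected subset with the means of random subsets of the same size. The function name `empirical_p_value`, the resampling scheme, and all score values are hypothetical and are not taken from the paper.

```python
import random
from statistics import mean


def empirical_p_value(subset, population, n_resamples=1000, seed=0):
    """Two-sided empirical P-value for the mean of `subset`.

    Compares the subset mean with the means of random subsets of equal
    size drawn without replacement from `population`. This is an assumed
    procedure for illustration, not the authors' documented method.
    """
    rng = random.Random(seed)
    pop_mean = mean(population)
    observed = abs(mean(subset) - pop_mean)
    k = len(subset)
    hits = 0
    for _ in range(n_resamples):
        draw = rng.sample(population, k)
        if abs(mean(draw) - pop_mean) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical scores for 400 predictions (placeholder values only).
all_scores = [1.0 + 0.005 * i for i in range(400)]

# A subset spread across the score range, and a subset biased to the top.
spread_subset = all_scores[::33][:12]
top_subset = all_scores[-12:]

p_spread = empirical_p_value(spread_subset, all_scores)
p_top = empirical_p_value(top_subset, all_scores)
print(f"spread subset: P = {p_spread:.3f}")
print(f"top-biased subset: P = {p_top:.3f}")
```

Under these placeholder values, a subset drawn evenly across the score range yields a large P-value (no evidence of a difference from the full set), whereas a subset taken only from the highest scores yields a small one. The same resampling idea could be applied to other properties, such as distance to the nearest TSS.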
Additionally, liver enhancer predictions selected for validation are located at an average distance to the nearest TSS of 7.3 kb, and are not significantly diffe.