Background. Plant homeodomain finger protein 20-like 1 (PHF20L1) is a protein reader involved in epigenetic regulation that binds monomethyl-lysine. An oncogenic function has been attributed to PHF20L1 but its role in breast cancer (BC) is not clear.
Objectives. To explore PHF20L1 promoter methylation and to perform a comprehensive bioinformatic analysis to improve understanding of the role of PHF20L1 in BC.
Materials and methods. Seventy-four BC samples and 16 control samples were converted using sodium bisulfite treatment and analyzed with methylation-specific polymerase chain reaction (PCR). Bioinformatic analysis was performed on the BC dataset from The Cancer Genome Atlas (TCGA), with data visualized and interpreted through the MEXPRESS website. Methylation, gene expression and survival evaluation were performed with R v. 4.0.2 software. Using multiple bioinformatic tools, we conducted a search for genes co-expressed with PHF20L1, analyzed their ontology, and predicted associated miRNAs and miRNA-PHF20L1 networks. The expression and prognostic value of PHF20L1 and co-expressed genes were analyzed.
Results. We found demethylation in PHF20L1 promoter in both BC samples and healthy tissues. Data mining with 241 patients demonstrated changes in methylation of promoter regions in basal-like and luminal A subtypes. Expression of the PHF20L1 gene had a negative correlation with methylation. Twelve genes were co-expressed. PHF20L1 is a target of miR96-5p, miR9-5p and miR182-5p, which are involved in proliferation and metastasis. PHF20L1 gene expression was not associated with overall survival (OS), or relapse-free survival (RFS), but was associated with distant metastasis-free survival (DMFS).
Conclusions. Our findings showed differences in methylation of PHF20L1 promoter region near TSS and upstream in BC subtypes; its overexpression impacted DMFS. We found that PHF20L1 is targeted by miR96-5p, miR9-5p and miR182-5p, which are involved in proliferation and metastasis, and regulates genes engaged in processes such as alternative splicing.
Key words: metastasis, hypomethylation, PHF20L1, miRNA, promoter
DNA mutations are the driving force for cancer initiation, progression and invasion. Nevertheless, accumulating evidence suggests that epigenetic modifications are also involved. In early-stage tumors, it is common to observe hypomethylation of tumor cell DNA and hypermethylation of CpG islands of specific promoters, which has led to the suggestion that epigenetic dysregulation may precede classic mutational events in tumorigenesis.1 Histone acetylation is a key mechanism of epigenetic regulation. Histone acetyltransferases (HATs) are responsible for transferring an acetyl group from acetyl-CoA to the ε-amino group of histone lysine residues.2 They do not work in isolation but as part of a complex whose components determine the lysine specificity.3 In vivo, HATs require coactivators that determine which lysine will be acetylated and that play a key role in a variety of cellular functions thanks to their various domains.4, 5 PHF20L1 is part of the non-specific lethal (NSL) complex involved in histone acetylation and post-translational modification.6 Located in the nucleoplasm and plasma membrane, PHF20L1 has Tudor, MBT, Lys-rich, and zinc finger plant homeodomain (PHD) type domains (UniProtKB A8MW92). It is similar to its homolog PHF20,7 with which it shares 33% sequence homology, rising to 73% identity in the second PHD domain. Its role and regulation are still being elucidated. It helps prevent SOX2 proteolysis8 and regulates the degradation of methylated DNMT1.9 PHF20L1 is considered an oncogene10 and has an important function in breast cancer (BC), which suggests that PHF20L1 may have a role in cancer treatment.5 MicroRNAs (miRNAs) are short noncoding RNAs that regulate the expression of target genes and are associated with tumorigenesis, invasion and metastasis. A single miRNA can regulate multiple genes that participate in the same biological pathway.11
The availability of cancer multi-omics databases allows us to decipher the genomic drivers of cancer, and the emergence of user-friendly tools to analyze and visualize large volumes of data is crucial to achieving the full potential of these datasets. In this study, we examined PHF20L1 promoter methylation through sodium bisulfite treatment and its participation in BC by analyzing expression in public gene datasets.
We investigated PHF20L1 methylation and gene expression with an emphasis on its relationship with co-expressed genes, its contribution to survival (independently and with co-expressed genes), miRNAs that target it, and its involvement in cancer.
Materials and methods
Tissues from 80 confirmed BC cases and 16 healthy adjacent fresh tissue controls were collected at Hospital La Raza (Mexico City, Mexico). The study was approved by the institutional research ethics committees of La Raza Hospital, Mexican Social Security Institute (IMSS), Mexico City, Mexico, and informed consent was obtained from all patients. All clinical data were collected from medical records. Disease stage was obtained from the pathology report.
DNA was extracted from tissues with the QIAamp DNA Micro Kit (Qiagen, Valencia, USA). DNA concentration was measured using a NanoDrop 8000 Spectrophotometer (Thermo Fisher Scientific, Waltham, USA).
Sodium bisulfite treatment and methylation-specific PCR (MSP)
DNA isolated from tissues was bisulfite-modified using the EpiTect Bisulfite Kit (Qiagen, Frederick, USA) according to the manufacturer’s protocol as previously described.12 The CpG island in the promoter region was located using the Eukaryotic Promoter Database tool (https://epd.epfl.ch/index.php). MSP primer pairs were designed using Methprimer software13 to detect bisulfite-induced changes affecting unmethylated (U) and methylated (M) alleles. Primer sequences are as follows: PHF20L1 (MF) 5’-TTAAGAATAATAAATAATGTTTTTCGT-3’; (MR) 5’-GTAACTCACGAAAATTAAACCCG-3’; (UF) 5’-AAGAATAATAAATAATGTTTTTTGT-3’; (UR) 5’-ATAACTCACAAAAATTAAACCCAAA-3’. The polymerase chain reaction (PCR) product sizes for PHF20L1 were 204 bp for the methylated amplicon and 203 bp for the unmethylated amplicon. PCR for bisulfite-converted DNA was performed using the EpiTect MSP Kit (Qiagen). Twenty nanograms of DNA, 10 µM of each primer and 2X Master mix MSP (Qiagen, Valencia, USA) were combined in a final reaction volume of 10 µL. For methylated PHF20L1, cycle conditions were as follows: 95°C for 10 min, 1 cycle; 35 cycles (94°C for 15 s, 52°C for 30 s, 72°C for 30 s); and 72°C for 10 min, 1 cycle. For unmethylated PHF20L1, cycle conditions were as follows: 95°C for 10 min, 1 cycle; 35 cycles (94°C for 15 s, 50°C for 30 s, 72°C for 20 s); and 72°C for 10 min, 1 cycle. Each PCR assay included a methylated control, an unmethylated control and genomic DNA (EpiTect PCR Control DNA Set, Qiagen, USA). The PCR products were analyzed using 3.5% agarose gel electrophoresis.
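The M and U primer pairs differ because bisulfite treatment changes the template: after conversion and PCR, unmethylated cytosines read as thymine, whereas methylated cytosines in a CpG context remain cytosine. This is why the MF primer above ends in CGT and the UF primer in TGT. The following Python sketch illustrates that rule only. The toy sequence is hypothetical (it is not the PHF20L1 promoter), only the top strand is modeled, and all CpG sites are assumed to be either fully methylated or fully unmethylated.

```python
def bisulfite_convert(seq: str, cpg_methylated: bool) -> str:
    """Return the top-strand sequence as read after bisulfite treatment and PCR.

    Simplifying assumption: every CpG cytosine shares the same methylation
    state, and non-CpG cytosines are always unmethylated.
    """
    seq = seq.upper()
    converted = []
    for i, base in enumerate(seq):
        if base == "C":
            in_cpg = i + 1 < len(seq) and seq[i + 1] == "G"
            # Only methylated CpG cytosines are protected from conversion.
            converted.append("C" if (in_cpg and cpg_methylated) else "T")
        else:
            converted.append(base)
    return "".join(converted)


toy = "ACGTCCGATCG"  # hypothetical sequence, for illustration only
methylated_template = bisulfite_convert(toy, cpg_methylated=True)
unmethylated_template = bisulfite_convert(toy, cpg_methylated=False)
print(methylated_template)
print(unmethylated_template)
```

In the methylated template every CG dinucleotide survives, while in the unmethylated template each becomes TG, which is the difference an M or U primer is designed to match at its 3’ end.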
Bioinformatic analysis of data in breast invasive carcinoma
We assessed the gene expression and methylation of PHF20L1 in breast invasive carcinoma dataset using The Cancer Genome Atlas (TCGA) database (http://tcgaportal.org). Data were visualized and interpreted using MEXPRESS (https://www.mexpress.be/).
Methylation and RNA analysis
Methylation data of 561 BC samples were obtained through MEXPRESS for methylation assay.14 MEXPRESS is an online user-friendly tool for the visualization and interpretation of TCGA data to assess expression, DNA methylation, and clinical data, as well as the relationships among them.15 The TCGA database was used for analyses of mRNA expression with R v. 4.0.2 (www.r-project.org).14 Mean and standard deviation of parameters were used as descriptive statistics. Because the data were not normally distributed, a generalized linear model (GLM) with gamma-distributed errors was used (a test analogous to one-way analysis of variance (ANOVA)) in addition to Kruskal–Wallis analysis. Afterward, Tukey and Dunnett tests were performed with the multcomp16 and FSA libraries.17 All graphs were made with the ggplot218 and ggsignif libraries.19 An α value of 0.05 was used.
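To make the rank-based comparison concrete, the sketch below computes the Kruskal–Wallis H statistic with the standard tie correction in plain Python. It is an illustration and not the R code used in the study. The β values are invented toy numbers, the group labels are used only as examples, and the closed-form p-value applies only to the three-group case (2 degrees of freedom), where the chi-square survival function equals exp(−H/2).

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_sizes = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Hypothetical beta values per subtype (toy data, not from TCGA).
beta = {
    "basal-like": [0.10, 0.12, 0.15],
    "luminal A": [0.30, 0.33, 0.35],
    "luminal B": [0.50, 0.52, 0.55],
}
h = kruskal_wallis_h(list(beta.values()))
p_approx = math.exp(-h / 2)  # chi-square approximation, valid for 3 groups only
print(round(h, 3), round(p_approx, 4))
```

The chi-square approximation is unreliable for groups this small; the toy data serve only to show the arithmetic. Post hoc pairwise tests would follow a significant result, as described above.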
Using ONCOMINE (https://www.oncomine.org), we analyzed pair-wise gene expression correlation (correlation ≥0.60). Database for Annotation, Visualization and Integrated Discovery (DAVID) v. 6.8 was used to perform gene ontology (GO) function analysis of co-expressed genes. In the GO analysis, the categories included were cellular component (CC) and molecular function (MF).
miRNAs associated with PHF20L1 were predicted using miRDB (http://mirdb.org/index.html)20 and mirDIP (http://ophid.utoronto.ca/mirDIP/),21 an online tool that provides 152 million human miRNA-target interactions. Our search was limited to high-confidence predictions (integrated score ≥0.90). Furthermore, to obtain a miRNA-PHF20L1 network with co-expressed genes, we used miRNet (https://www.mirnet.ca).22
We used R v. 4.0.2 software to estimate survival probability according to PHF20L1 gene expression levels in TCGA BC samples. Patient samples were divided into 2 cohorts according to an expression cutoff of 3 (the median value) and analyzed using the R package survival.23 The relationship between methylation and survival was evaluated using data only from CpG probes (cg) with significant results in shorter-survival patients. To evaluate the global prognostic value of PHF20L1 co-expressed genes, we used Kaplan–Meier plotter (http://kmplot.com), an online database of microarray datasets that assesses the effect of genes on survival in 5143 breast samples, among other cancers,24 and calculates hazard ratios (HRs) with 95% confidence intervals (95% CIs) and log-rank p-values. Survival analyses play a central role in identifying potential key genes and biomarkers.
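The two steps described above, dichotomizing samples at the median expression value and estimating survival in each cohort, can be sketched in plain Python. This is an illustration rather than the R survival workflow used in the study. All expression values, follow-up times and event indicators are invented, and samples at or below the median are assigned to the low group by assumption, since the source does not state how ties with the cutoff were handled.

```python
from statistics import median


def split_by_median(expression):
    """Return the median cutoff and the sample indices above / at-or-below it."""
    cutoff = median(expression)
    high = [i for i, x in enumerate(expression) if x > cutoff]
    low = [i for i, x in enumerate(expression) if x <= cutoff]
    return cutoff, high, low


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each time with an event.

    events[i] is 1 for an observed event (e.g. death) and 0 for censoring.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored cases both leave the risk set
    return curve


# Toy data (hypothetical): expression level, follow-up in months, event flag.
expression = [2.1, 3.5, 2.8, 4.0, 3.2, 2.5]
times = [10, 4, 12, 3, 6, 9]
events = [0, 1, 1, 1, 0, 1]

cutoff, high, low = split_by_median(expression)
km_high = kaplan_meier([times[i] for i in high], [events[i] for i in high])
km_low = kaplan_meier([times[i] for i in low], [events[i] for i in low])
print(cutoff, km_high, km_low)
```

Comparing the two curves formally requires a log-rank test or a Cox model, which this sketch does not implement.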
The promoter methylation status of the PHF20L1 gene was examined in 74 sporadic BC tumors (6 patients did not have complete data) and 16 non-tumoral adjacent tissues from some of these same patients. The mean age of the patients was 54.1 ± 11 years. The BC stage frequencies were as follows: stage II 39.0%, stage III 45.5% and stage IV 15.6%. The methylation assay using sodium bisulfite revealed no difference in PHF20L1 promoter methylation status between cancer stages or in comparison to healthy tissues; all samples were demethylated (Figure 1A). Through data mining of the PHF20L1 gene in the TCGA database with the MEXPRESS tool, we analyzed DNA methylation (Infinium HumanMethylation450 microarray) in 241 patients with complete data. Eight promoter probes were analyzed. There was no significant difference regarding tumor stage (stage I 18%, stage II 58.5%, stage III 21.2%, and stage IV 2.3%) or histology type (data from 824 patients were used for this analysis). When we looked for differences in methylation across the PAM50 BC molecular subtypes, we found differences with 2 probes: cg5307234 and cg27342122, of which cg27342122 corresponded to the region we analyzed (Figure 1B). At the cg5307234 promoter probe, the luminal A subtype had more methylation and the basal-like subtype had less, while at the cg27342122 promoter probe the basal-like subtype showed hypomethylation compared to the other subtypes (Figure 1C, Figure 1D).
PHF20L1 gene expression
Because only a small amount of tissue was available for our analysis, it was not possible to obtain RNA. PHF20L1 gene expression was therefore examined in TCGA BC samples that had subtype information available. We found that the mRNA expression level of PHF20L1 was higher in the basal-like subtype (Figure 2A). In addition, a negative correlation between DNA methylation and PHF20L1 transcription was observed (r = −0.19, p < 0.001) (Figure 2B).
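For readers who want to see how a methylation-expression correlation of this kind is computed, the following Python sketch evaluates a correlation coefficient on paired values. The text does not state which coefficient was used; Pearson's r is assumed here purely for illustration, and the paired β values and expression levels are invented toy data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired measurements per sample (toy data, not from TCGA).
methylation_beta = [0.10, 0.20, 0.30, 0.40, 0.50]
expression_level = [5.0, 4.6, 4.1, 4.3, 3.5]

r = pearson_r(methylation_beta, expression_level)
print(round(r, 3))  # negative: higher methylation pairs with lower expression
```

Under a linear model, the reported coefficient of −0.19 corresponds to r² ≈ 0.036, so the association is statistically significant but weak.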
Using the ONCOMINE database, we selected co-expression analysis in BC primary sites, using only female mRNA data. We found 12 genes co-expressed with PHF20L1 across the BC dataset with a correlation value of ≥0.6. Co-expressed genes were clustered through gene ontology analysis using DAVID. The enrichment GO terms considered were CC and MF ontologies (Table 1). In the CC ontology, we obtained 3 GO categories involved with nucleus (7 genes), nucleoplasm (5 genes) and coiled coil (5 genes). The other enrichment category MF comprised items related to alternative splicing (10 genes), splice variant (9 genes) and phosphoprotein (9 genes). Only clusters that had at least 5 genes were included.
miRNAs participate in post-transcriptional regulation of gene expression by targeting mRNAs for degradation and/or inhibiting their translation. Furthermore, one miRNA can co-regulate several genes. Using mirDIP (corroborated by miRDB), 190 miRNAs were predicted as regulators of PHF20L1 gene expression, but only miR-96-5p, miR-9-5p and miR-182-5p were obtained in the very high score class for prediction (score ≥0.90). Next, using the 12 genes found to be co-expressed with PHF20L1 in ONCOMINE, we built a network with the miRNet tool in BC tissues, setting the degree cutoff to 1. We found 1 node with 442 miRNAs; this network included the miRNAs predicted by mirDIP, as well as 79 miRNAs involved in breast neoplasms and triple-negative breast carcinoma (Figure 3). The miRNet network showed that miRNAs predicted as PHF20L1 regulators participate in cell differentiation, cell cycle and apoptosis (p = 0.006), that miR-9-5p and miR-182-5p are involved in triple-negative breast carcinoma (p = 0.02), and that miR-96-5p is involved in breast neoplasms (p = 0.02).
PHF20L1 prognosis in breast cancer
The prognostic value of PHF20L1 expression was examined with R software. Neither the expression of PHF20L1 nor the promoter methylation at the cg5307234 and cg27342122 probe regions by subtype was related to overall survival (OS). However, when we analyzed the methylation data from deceased patients alone, i.e., patients with shorter survival, we found more hypomethylation of PHF20L1 in the basal-like subtype than in the luminal A subtype, and the difference in methylation between the luminal A and luminal B subtypes (p < 0.01) was associated with survival (Figure 4A). The Kaplan–Meier plotter platform was used to analyze relapse-free survival (RFS) and distant metastasis-free survival (DMFS), which revealed that expression of PHF20L1 is related to DMFS (p = 0.02) (Figure 4B). To explore the potential roles of genes co-expressed with PHF20L1 in OS, RFS and DMFS, we obtained Kaplan–Meier survival curves from the Kaplan–Meier plotter platform. We found that expression of ZNF407 and PIAS1 was related to OS. Expression of all genes except PHF20L1, STT3B, YLPM1, and MFF was associated with RFS of BC, and STT3B, PRKRA, ATG12, and PHF20L1 were associated with DMFS (Table 2).
Epigenetic readers contain diverse methyl–lysine binding motifs including PHD, chromo, Tudor, MBT, PWWP, Ank, BAH, WD40, ADD, and zn-CW domains.25 PHF20L1 is a reader that interacts with mono- and dimethylated lysine residues in H3K4, H4K20, H3K27, and DNMT1, due to its Tudor and PHD domains, and histone H4K16 acetylation, due to its MYST (MOF) domain.5, 26 For example, PHF20L1 is recruited to E2F-responsive promoters through pRb mono-methylated K810, which suggests that PHF20L1 could participate in cell cycle progression by mediating transcriptional repression.27 PHF20L1 is overexpressed in the aggressive basal-like and luminal B subtypes, which have been strongly associated with shorter survival in patients with BC.10 Thus, this gene could be a critical tethering factor regulating molecular mechanisms through methylation signals on both DNA and histones.10 A growing body of evidence supports the epigenetic reprogramming of cancer cells as a key step in breast carcinogenesis. Teschendorff et al. found that the genomic distribution of methylation is not random and is strongly enriched for binding sites of transcription factors specifying chromatin architecture.28 We found differential methylation in the PHF20L1 promoter in the basal-like and luminal A molecular subtypes in 2 regions: -708 bp to TSS (cg5307234 probe) and -242 bp (cg27342122 probe). Basal-like, luminal A and luminal B subtypes have significant differences in methylation in the promoter region. The methylation pattern differed between the regions: the -708 to TSS region was almost fully methylated, with a mean β value of 0.95 for the luminal A and B subtypes, while in the -242 region both subtypes and basal-like were hypomethylated (β value of 0.023 and 0.020, respectively). Usually, promoters have sites for transcription factor binding.
With the aid of the TF2DNA database (http://fiserlab.org/tf2dna_db//index.html) and JASPAR CORE 2020 (http://jaspar.genereg.net/), we found that transcription factor EC (TFEC) binds to the cg5307234 region (-708 bp) and participates in regulating multiple cellular processes including survival, growth and differentiation.29, 30 The cg27342122 region (-242 bp) has binding sites for the GATA3, FOXP2 and FOXP3 transcription factors. Of these, GATA3 is relevant for its role in the determination of cell identity. GATA3 is expressed in the differentiated luminal epithelial cells of mammary glands.31 Thus, differences in methylation pattern may affect transcription factor binding and thereby deregulate PHF20L1 expression.
In our analysis, we found that PHF20L1 overexpression was not related to OS in the analysis by cancer subtype except when the comparison was made only with patients with shorter survival. Similarly, hypomethylation was correlated with survival in these patients in the basal-like and luminal A and B subtypes. The luminal B subtype is distinguished by higher proliferative activity and a worse prognosis than luminal A.32 When we analyzed survival without grouping by subtypes and including the co-expressed genes, we found that overexpression of PHF20L1, STT3B, PRKRA, and ATG12 was related to DMFS. We also found that overexpression of all genes except STT3B, YLPM1, MFF, and PHF20L1 was related to RFS. Interestingly, PHF20L1 and many co-expressed genes are involved in key processes such as alternative splicing.
miRNAs have an important role in cellular regulation, and we found 3 miRNAs with a high probability to regulate PHF20L1: miR96-5p, miR9-5p and miR182-5p. miR96-5p may participate in epithelial–mesenchymal transition33; using miRNet, we found that this miRNA is involved in breast neoplasms. miR9-5p could enhance cancer stem cell-like traits of BC, but its role depends on the stage of BC, i.e., it could inhibit cell proliferation (tumor suppressor activity) or play an oncogenic role in metastasis.34 On the other hand, miR182-5p is a key oncogenic miRNA that promotes cell proliferation and metastasis35 and could be involved in epigenetic changes.36 Therefore, these miRNAs might have an important role in methylation and expression changes of the PHF20L1 gene contributing to its role in BC metastasis.
PHF20L1 is established as an important epigenetic reader whose loss could induce genome hypomethylation. For us, the use of public databases and bioinformatics tools was crucial to obtain a better picture of PHF20L1 interactions particularly with miRNAs, which in turn are involved in a complex regulatory network affecting transcription.
Our study has limitations. First, validation should be carried out both in vitro and in vivo to determine the clinical usefulness in patients with metastatic disease. The second limitation is the modest sample size for some analyses and the difference in our experimental approach to methylation analysis.
Our findings indicate that changes in methylation near the TSS of the PHF20L1 gene may influence its expression in BC subtypes and that PHF20L1 gene overexpression affects distant metastasis-free survival in BC. Furthermore, the study suggests that miR96-5p, miR9-5p and miR182-5p target and regulate PHF20L1. These results support the participation of PHF20L1 in the metastasis process.
The database data supporting this research article are from previously reported studies and datasets, which have been cited. The data analyzed with R software are available at the MEXPRESS website.