Abstract
Background. Despite its excellent screening effectiveness and sensitivity for breast cancer (BC), digital breast tomosynthesis (DBT) is controversial due to its high radiation exposure and long reading time. This study examines the diagnostic accuracy of DBT and digital mammography (DM) for BC screening and diagnosis in women with dense or non-dense breast tissue.
Materials and methods. PRISMA-compliant searches were performed on Medline, Embase, PubMed, Web of Science, and the Cochrane databases for articles comparing DBT and DM for BC screening until March 2023. Meta-analysis was performed using RevMan software, and the Cochrane Risk of Bias Assessment Tool was employed to assess study quality.
Results. This meta-analysis included 11 trials with a total of 2,124,018 individuals. Screening with DBT resulted in a greater cancer detection rate, as demonstrated by a risk ratio (RR) of 1.27 (95% confidence interval (95% CI): 1.14–1.41). Digital breast tomosynthesis also had a reduced recall rate, with an RR of 0.88 (95% CI: 0.78–0.99), and higher sensitivity and specificity values (pooled sensitivity of 0.91 (95% CI: 0.59–0.99) and pooled specificity of 0.90 (95% CI: 0.42–1.0)) than DM (pooled sensitivity of 0.86 (95% CI: 0.52–1.0) and pooled specificity of 0.81 (95% CI: 0.12–1.0)). All acquired data exhibited reliability, lack of bias and statistical significance (p < 0.05).
Conclusions. Digital breast tomosynthesis is a more effective screening and diagnostic assessment tool for women with dense or non-dense breasts than DM in terms of incremental cancer detection, sensitivity and recall rate.
Key words: breast cancer, digital mammography, digital breast tomosynthesis, cancer detection rate, overall recall rate
Background
Breast cancer (BC) is widely prevalent among women and is the primary cause of cancer-associated mortality in the global female population.1 Numerous countries have implemented population-wide BC screening, originally with X-ray-based film-screen technology, before transitioning to digital mammography (DM), with the objective of reducing BC mortality through early detection.2 Mammography, originally performed as screen-film mammography (SFM), is the most common breast imaging modality and is widely regarded as the gold standard for verifying or ruling out the existence of breast cancer. It uses X-rays to examine the breast, and compression of the breast is an essential part of the examination. Nevertheless, the sensitivity of DM varies considerably, with estimates ranging from 67.3% to 93.3%.3
Mammography findings are summarized and classified into separate categories using the standardized Breast Imaging Reporting and Data System (BI-RADS). Mammographic breast tissue densities greater than 50% fall into BI-RADS categories 3 or 4, or C or D, in the 4th and 5th editions, respectively. Such high density may have a masking effect, reducing the sensitivity of mammography. As dense parenchyma overlaps fibroglandular tissue, it may impair the mammographic identification of lesions and may increase false-positive outcomes.4, 5
Breast density is a distinct risk factor for BC, and approx. 50% of women participating in screening are believed to have dense breast tissue. However, the proportion of dense breast tissue varies across different age groups. There is a positive correlation between high mammographic density, characterized by heterogeneously or extremely dense breast tissue, and elevated susceptibility to BC, an association that extends to interstitial BC.6, 7
Digital breast tomosynthesis (DBT) is a medical imaging technique that generates reconstructed, nearly 3-dimensional (3D) mammographic images of the breast and is thought to enhance cancer detection during screening by offering improved visualization of lesions that may be difficult to identify on traditional 2-dimensional (2D) DM. This is particularly relevant in cases where dense or overlapping breast tissue may obscure the presence of such lesions.8, 9 Furthermore, DBT has the potential to decrease the occurrence of cancer-simulating artifacts caused by overlapping breast tissue, which may reduce the initial high rates of recalling patients for additional examination.10 Digital breast tomosynthesis allows for the acquisition of pseudo-3D images of the breast, leading to enhanced differentiation of tissue features and, perhaps, enhanced visualization of cancerous lesions. Therefore, it can be argued that DBT has the capacity to enhance the sensitivity and specificity of imaging in BC screening, resulting in a higher number of accurately identified tumors while minimizing false positive results.11
Several prospective and retrospective studies have investigated different screening populations and have consistently shown improved screening accuracy when DBT is employed.12, 13, 14 However, some studies have indicated that the combined use of DM and DBT leads to increased radiation exposure to the breast.15, 16
Objectives
Since there has been limited research comparing the diagnostic accuracy and reliability of DBT and DM for BC screening in women with dense or non-dense breast tissue, the primary aim of this study was to systematically evaluate and meta-analyze selected studies.17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27
Materials and methods
The current investigation followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines.28
Eligibility criteria
This study analyzed the comparative outcomes of relevant publications between 2015 and 2023, with priority given to incorporating full-text articles into the investigation. The inclusion criteria were studies: 1) reporting the screening of BC using DBT or DM, 2) involving dense and non-dense breast tissues, 3) including patients older than 18 years, and 4) published in English. In the meta-analysis, only abstracts with sufficient information were included. The analysis excluded studies with insufficient data, those extraneous to BC screening, and those published before 2015.
Information sources
The researchers conducted an extensive examination of the academic literature using PubMed, Embase, Web of Science, Scopus, and Cochrane Library databases. The search methodology combined Medical Subject Headings (MeSH) and textual keywords using the Boolean operator “AND”.
Search strategy
A comprehensive and systematic review of relevant studies on the diagnostic accuracy and reliability of DBT compared to DM was conducted using PubMed and the Cochrane Library databases, following the PRISMA guidelines. To find relevant studies, we searched the medical literature for the following terms: breast cancer, digital mammography, mammography, DBT, cancer detection rate (CDR), overall recall rate, sensitivity, specificity, dense breast tissue, non-dense breast tissue, systematic review, and meta-analysis.
Selection process
Two authors, H.L. and Y.Z., thoroughly examined the pertinent literature to identify relevant articles. The researchers used inclusion criteria to exclude outdated references and incorporate relevant studies of importance.
Data collection process
Two researchers (H.L. and Y.Z.) carried out a thorough bibliographic search to find pertinent and significant works. A methodical selection approach was used to find and incorporate all relevant studies published between 2015 and 2023.
Data items
Two other authors, Y.W. and C.Y., independently summarized the characteristics of the participants and the event data from the studies included in the analysis. The 4 key metrics discussed were: 1) CDR – the proportion of screened individuals in whom cancer was detected; 2) “overall recall rates” – the percentage of individuals called back for further testing after an initial screening; 3) “sensitivity” – the ability of a test to correctly identify individuals with BC; and 4) “specificity” – the ability of a test to correctly identify individuals without BC.
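The 4 metrics can be illustrated with a minimal sketch built on a single 2×2 screening table. The class name, function names and counts below are hypothetical and are not taken from any included study. The CDR is expressed per 1,000 screens, which is a common screening convention and an assumption here.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Hypothetical 2x2 screening table (not data from the included studies)."""
    tp: int  # recalled, cancer confirmed
    fp: int  # recalled, no cancer found
    fn: int  # not recalled, cancer diagnosed later
    tn: int  # not recalled, no cancer

    @property
    def screened(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def cancer_detection_rate(c: ScreeningCounts, per: int = 1000) -> float:
    """Screen-detected cancers per `per` screened individuals."""
    return per * c.tp / c.screened


def recall_rate(c: ScreeningCounts) -> float:
    """Share of screened individuals called back for further testing."""
    return (c.tp + c.fp) / c.screened


def sensitivity(c: ScreeningCounts) -> float:
    """Share of individuals with BC who were correctly recalled."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ScreeningCounts) -> float:
    """Share of individuals without BC who were correctly not recalled."""
    return c.tn / (c.tn + c.fp)


example = ScreeningCounts(tp=8, fp=72, fn=2, tn=918)
print(cancer_detection_rate(example), recall_rate(example),
      sensitivity(example), specificity(example))
```

In this sketch, a higher CDR and a lower recall rate can occur together only if fewer false positives are recalled, which is why the two outcomes are reported side by side.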
Risk of bias assessment
The assessment of potential bias in the research included in the study was undertaken using a previously established standardized questionnaire (Supplementary Table 1). A summary and graphical representation of the risk of bias was generated using the Cochrane Risk of Bias (Robvis) tool.31
Effect measures
H.L. and Y.Z. conducted independent evaluations of the methodological validity of the studies included in the analysis. L.W. was responsible for resolving any disagreements between H.L. and Y.Z. The determination was made based on the heterogeneity of the included trials. Cochran’s Q statistic and the I2 index were employed in a random bivariate model29 as part of the investigation of heterogeneity. The analysis was conducted using the RevMan v. 5 software (Cochrane Collaboration, Copenhagen, Denmark).30 Various other factors contributing to variability were examined, including the use of full-text articles instead of abstracts, discrepancies in age groups and sample sizes, variations in the techniques used, and differences in the study outcomes.
Statistical analyses
The meta-analysis employed RevMan software v. 5. Since the studies were conducted under different conditions, a random effect model was used. The primary methodology employed in this research was the Mantel–Haenszel process, incorporating random bivariate effects. Statistical metrics, including odds ratio (OR), risk ratio (RR), sensitivity, and specificity, together with a 95% confidence interval (95% CI), were mostly computed using the Mantel–Haenszel method. The z-test statistic was used to assess the number of standard deviations by which a value deviated from the mean, and p < 0.05 was considered statistically significant. Moreover, forest plots were made to visually represent the results, and tau2, χ2, I2, and z-values measured heterogeneity in the publications evaluated. The diagnostic OR was calculated using a 2×2 contingency table and the DerSimonian and Laird method.32
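The random-effects pooling described above can be sketched as follows. This is a simplified illustration, not the RevMan implementation: it pools log risk ratios with inverse-variance weights and the DerSimonian and Laird estimate of tau2, whereas the study reports Mantel–Haenszel weighting. The function names and the study counts are hypothetical.

```python
import math


def log_risk_ratio(events_a, total_a, events_b, total_b):
    """Return the log risk ratio (arm A vs arm B) and its large-sample variance."""
    rr = (events_a / total_a) / (events_b / total_b)
    variance = 1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b
    return math.log(rr), variance


def dersimonian_laird(effects, variances, z_crit=1.96):
    """Pool study effects (log scale) with DerSimonian-Laird random-effects weights."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "pooled_ratio": math.exp(pooled),
        "ci_low": math.exp(pooled - z_crit * se),
        "ci_high": math.exp(pooled + z_crit * se),
        "z": pooled / se,
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical (cancers, screened) counts per study: DBT arm, then DM arm
studies = [(60, 10000, 45, 10000), (130, 20000, 110, 22000), (25, 5000, 24, 6000)]
effects, variances = zip(*(log_risk_ratio(*s) for s in studies))
result = dersimonian_laird(effects, variances)
print({k: round(v, 3) for k, v in result.items()})
```

Pooling on the log scale keeps the confidence interval asymmetric around the ratio after back-transformation, which matches how RR and OR intervals are reported in the Results.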
Publication bias was assessed using Begg’s test,33 Egger’s test34 and Deeks’ funnel plots.35 Deeks’ funnel plot was generated by plotting the natural logarithm of the OR for each publication against its corresponding standard error using MedCalc software (MedCalc Software, Ostend, Belgium).36 Youden plots37 and hierarchical summary receiver operating characteristic curves (HSROCs)38 were constructed to evaluate the degree of inter-study variability.
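The idea behind Egger’s test can be sketched as a simple regression of the standardized effect (effect divided by its standard error) on precision (the reciprocal of the standard error). An intercept far from zero suggests funnel-plot asymmetry. The function name and the inputs below are hypothetical, and the p-value step (comparing the t statistic with a t distribution on n − 2 degrees of freedom) is omitted so that the sketch needs only the standard library.

```python
import math


def egger_regression(effects, std_errors):
    """Regress effect/SE on 1/SE; return slope, intercept, intercept SE and t."""
    y = [e / s for e, s in zip(effects, std_errors)]
    x = [1 / s for s in std_errors]
    n = len(x)
    if n < 3:
        raise ValueError("Egger's regression needs at least 3 studies")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = residual_ss / (n - 2)
    se_intercept = math.sqrt(s2 * (1 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else math.nan
    return {"slope": slope, "intercept": intercept,
            "se_intercept": se_intercept, "t": t_stat}


# Hypothetical log odds ratios and standard errors
log_ors = [0.75, 0.90, 0.60, 1.10, 0.82]
ses = [0.10, 0.25, 0.15, 0.40, 0.20]
print(egger_regression(log_ors, ses))
```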
Results
Literature search results
The application of the PRISMA flowchart for selecting research studies is illustrated in Figure 1. After conducting a thorough analysis of online sources, a collection of 347 academic papers was identified. Following the removal of duplicate submissions, 241 studies were screened based on their abstracts and titles. A total of 136 papers that satisfied the predetermined inclusion criteria were comprehensively evaluated. The current meta-analysis comprised 11 publications selected based on predetermined inclusion and exclusion criteria. The studies incorporated in the analysis investigated and assessed the diagnostic precision and dependability of DBT and DM in the context of screening for BC in women with dense and non-dense breast tissue. Table 1 presents a comprehensive overview of the pertinent characteristics of the included studies. These characteristics include the study identifier, publication year, journal, country in which the study was conducted, interventions employed, screening intervals for mammographic density, total number of participants, patient age, sample size, and the instruments used for DBT and DM.
Evaluating overall study quality
Table 2 presents a comprehensive assessment of the methodological rigor and overall quality of the studies incorporated in the meta-analysis. Figure 2 gives a succinct overview of the potential for bias, and Figure 3 visually represents the risk of bias. Out of the 11 studies, 6 had a low risk of bias as they employed valid methodology for patient allocation to alternative treatments, maintained a low attrition rate, and implemented suitable measures to prevent bias, assess outcomes, analyze data, and report findings. As a consequence, the reported results are valid, and there was no selection bias, performance bias, detection bias, attrition bias, or reporting bias. However, 4 studies displayed a moderate risk of bias as a result of concerns regarding random sequence generation, allocation concealment and blinding of participants and staff. The remaining study carried a high risk of bias, including in allocation concealment. As indicated by the symmetrical funnel plot39 and the lack of statistical significance (p > 0.05) in Begg’s (p = 0.354) and Egger’s tests (p = 0.224),40 the results presented in Figure 4 indicate a low probability of publication bias.
Primary outcome statistical analysis
The current meta-analysis comprised a sample of 11 studies, either prospective or retrospective in nature, with a total of 1,110,194 participants. A total of 339,606 people underwent screening using the DBT method, while 770,588 received DM screening. The key outcomes of the studies were statistically analyzed to compare DBT and DM for BC screening in women with dense or non-dense breast tissue.
Cancer detection rate of DBT vs DM
Figure 5 illustrates 11 studies that reported CDRs, with a combined total of 1,286,449 people screened with DBT and 837,569 participants assessed through DM. The DBT group exhibited higher accuracy in detecting cancer (RR = 1.27, 95% CI: 1.14–1.41). The findings exhibited heterogeneity, as shown by the values of tau2 = 0.02, χ2 = 205.63, degrees of freedom (df) = 10, z = 4.36, I2 = 95%, and p < 0.001 (Figure 5A). Similarly, DBT had a higher chance of detecting BC than DM (OR = 2.29, 95% CI: 1.49–3.51). The findings exhibited heterogeneity, as shown by the tau2 = 0.38, χ2 = 35.48, df = 11, z = 3.78, I2 = 69%, and p < 0.001 values (Figure 5B).
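The I2 values reported for Figure 5 follow from the χ2 (Cochran’s Q) statistic and its degrees of freedom through the standard relation I2 = max(0, (Q − df)/Q) × 100%. The short check below applies that relation to the reported values; the function name is an illustrative choice.

```python
def i_squared(q: float, df: int) -> float:
    """I2 (in percent) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Values reported in the text for Figure 5A (RR) and Figure 5B (OR)
reported = {"Figure 5A": (205.63, 10), "Figure 5B": (35.48, 11)}
for label, (q, df) in reported.items():
    print(label, round(i_squared(q, df)))
# Figure 5A 95
# Figure 5B 69
```

Both results agree with the I2 values stated in the text, so the reported Q, df and I2 figures are internally consistent.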
Overall recall rate of DBT vs DM
Figure 6 illustrates the results of 11 studies that reported an overall recall rate. The sample consisted of 1,286,449 participants tested using DBT, and 837,569 people screened using DM. The study revealed that the DM group had a greater recall rate than the DBT group (RR = 0.88, 95% CI: 0.78–0.99). The findings exhibited heterogeneity, as indicated by the tau2 (0.02), χ2 (67.89), df (10), z (2.16), I2 (85%), and p (< 0.001) values (Figure 6A). Similarly, the OR of 1.24 (95% CI: 1.01–1.5; tau2 = 0.08, χ2 = 28.06, df = 11, z = 2.01, I2 = 61%, and p < 0.001) showed that the DM group had a greater recall rate than the DBT group (Figure 6B).
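As a rough plausibility check, the z-value of a pooled ratio can be approximated from its point estimate and 95% CI, assuming the interval is symmetric on the log scale. The function name below is hypothetical, and because the published estimate and limits are rounded, the result is only expected to land near the reported z-value rather than match it exactly.

```python
import math


def z_from_ratio_ci(estimate: float, lower: float, upper: float,
                    z_crit: float = 1.96) -> float:
    """Approximate z statistic of a ratio from its CI, working on the log scale."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return math.log(estimate) / se


# Recall rate, DBT vs DM: RR = 0.88 (95% CI: 0.78-0.99), reported z = 2.16
z_recall = z_from_ratio_ci(0.88, 0.78, 0.99)
print(round(z_recall, 2))  # -2.1
```

The negative sign reflects an RR below 1, that is, a lower recall rate with DBT. Its magnitude of about 2.1 is close to the reported z of 2.16.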
Sensitivity and specificity of DBT and DM
Imaging instruments used for BC screening must have high sensitivity and specificity41 to accurately detect the presence or absence of BC. Using the dataset extracted from the 11 included studies, we determined the sensitivity and specificity of DBT and DM. In Figure 7A, the data indicate that DBT exhibited a pooled sensitivity of 0.91, with a 95% CI ranging from 0.59 to 0.99. Additionally, the pooled specificity for DBT was 0.90, with a 95% CI of 0.42 to 1.0. Conversely, Figure 7B presents the findings for DM, revealing an overall sensitivity of 0.86, with a 95% CI ranging from 0.52 to 1.0. The pooled specificity for DM was 0.81, with a 95% CI of 0.12 to 1.0. We found that DBT exhibited greater sensitivity and specificity than DM in detecting BC.
Evaluation of DBT and DM screening results for accuracy and quality
To evaluate the diagnostic precision of the DBT and DM screening tools, an HSROC was generated for both using the sensitivity and specificity data derived from the 11 studies included in the analysis (Figure 8). Figure 8A depicts the HSROC curve for DBT, whereas Figure 8B illustrates the HSROC curve for DM. The circular symbols in the diagram represent individual studies, with the size of each circle corresponding to the number of patients included in that particular study. The height of the ovals represents the number of patients with BC, while the width represents the number of patients without BC. Additionally, the diagram includes a 95% prediction region. Analysis of the curves revealed that DBT exhibited higher accuracy, pooled sensitivity and specificity than DM, even when considering the presence of inter-study heterogeneity.
Variations in screening outcomes can occur during the implementation of DBT and DM due to the use of distinct devices, instruments and processes. Furthermore, the degree of control over factors that influence the magnitude of the results is constrained. Therefore, it is crucial to take into account the impact of these numerous stochastic, uncontrollable variables when interpreting and assessing the results. Hence, for the purpose of quality control and identification of measurement bias in the incorporated studies, the Youden plots, which are designed for interlaboratory comparisons, were also constructed. The Youden index (YI)42 was computed using the sensitivity and specificity data obtained from the 11 studies incorporated in the analysis to evaluate the BC screening capability of the diagnostic tests. The findings indicated that DBT exhibited higher diagnostic accuracy than DM, as evidenced by DBT’s YI of 81% and DM’s YI of 67%, which are illustrated in Figure 9, where Figure 9A and Figure 9B represent DBT and DM, respectively. A lack of bias in these diagrams is attributable to the closely matched datasets and ensures that the results are reliable and accurate.
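The YI is defined as sensitivity + specificity − 1. The check below applies this definition to the pooled sensitivity and specificity reported for DBT and DM; the function name is an illustrative choice.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


# Pooled estimates reported in the Results
yi_dbt = youden_index(0.91, 0.90)
yi_dm = youden_index(0.86, 0.81)
print(f"DBT: {yi_dbt:.0%}, DM: {yi_dm:.0%}")  # DBT: 81%, DM: 67%
```

Both values match the YI figures given in the text. Because the index is computed from point estimates only, it does not carry over the wide confidence intervals of the pooled sensitivity and specificity.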
Discussion
Mammography is an X-ray imaging technique used for both screening and diagnostic purposes to identify breast cancer and other breast disorders early. Digital mammography is a system in which the X-ray film utilized in SFM is substituted by solid-state detectors that convert X-rays into electrical impulses, similar to those used in digital cameras.43 The European Society of Breast Imaging (EUSOBI) has issued its latest guidelines for the screening of women with highly dense breasts, as they are almost twice as likely to develop BC as women with non-dense breasts. Concurrently, the effectiveness of mammography is diminished due to the concealment of malignancies by the excessive projection of fibroglandular breast tissue. The EUSOBI guidelines issued in 2022 strongly advise regular MRI screening every 2–3 years for individuals with breast composition type D, as defined by the American College of Radiology (ACR).44 Also, DM is more expensive than traditional film technology and has lower spatial resolution. To address these limitations, DBT, a technology that captures numerous images of the breast rather than the customary single 2D image acquired with traditional mammography, is currently being used.45 Digital breast tomosynthesis produces a more detailed picture and reduces the problem of overlapping fibroglandular breast tissue that can disguise BC or imitate a pseudo-tumor, potentially enhancing the sensitivity for identifying breast malignancies and lowering the false positive rate.46, 47 Tomosynthesis, on the other hand, requires higher levels of radiation exposure and prolonged reading time.48 The radiation doses employed for each test vary, though current technologies employ minimal radiation doses to obtain breast X-rays that exhibit superior image quality. The mean cumulative radiation dose for a standard mammography, which includes 2 views of each breast, is around 0.4 millisieverts (mSv).
Digital breast tomosynthesis was linked to a radiation dosage that ranged from much lower to somewhat higher than DM. Specifically, the dose ratio ranges were 0.34–1.0 for 1-view DBT and 0.68–1.17 for 2-view DBT.49
The objective of this meta-analysis was to evaluate the diagnostic precision and dependability of DBT compared to DM for BC screening in women with either dense or non-dense breast tissue and included 11 trials encompassing 2,124,018 individuals. The study revealed that DBT resulted in a higher CDR, as shown by an RR of 1.27 (95% CI: 1.14–1.41). Additionally, DBT demonstrated a lower recall rate, with an RR of 0.88 (95% CI: 0.78–0.99). The sensitivity and specificity of DBT were greater than those of DM. The pooled sensitivity for DBT was 0.91 (95% CI: 0.59–0.99) and the pooled specificity was 0.90 (95% CI: 0.42–1.0). In contrast, the pooled sensitivity for DM was 0.86 (95% CI: 0.52–1.0) and the pooled specificity was 0.81 (95% CI: 0.12–1.0). These differences in sensitivity and specificity between DBT and DM were statistically significant (Mantel–Haenszel method, z = 2.53; p < 0.001 for DBT and z = 2.37, p < 0.001 for DM).
The diagnostic accuracy of DBT was shown to be considerably superior to DM, as evidenced by the higher YI values of 81% and 67% for DBT and DM, respectively. All of the obtained data exhibited reliability, lack of bias and statistical significance, indicated by a p-value of less than 0.05. The findings of our study are consistent with a previous systematic review and meta-analysis that examined the effectiveness of DBT and DM. In research conducted by Phi et al. in 2018,50 it was shown that DBT had a high CDR (RR = 1.16, 95% CI: 1.02–1.31) and sensitivity (ranging from 84% to 90%) in women with mammographically dense breasts. Similarly, a study conducted by Li et al.51 revealed that DBT exhibited varying levels of increased cancer detection (1/1,000 screens, 95% CI: 0.3–1.6, p = 0.003) and recall rates influenced by breast density (–0.9%, 95% CI: –1.4% to –0.4%, p < 0.001). In their systematic review and meta-analysis, Alabousi et al.52 examined the performance of DBT, synthetic mammography (SM) and DM in the context of BC screening. They concluded that DBT alone or in conjunction with DM yielded optimal outcomes for BC screening.
The findings of this study demonstrate enhanced diagnostic outcomes when utilizing DBT in conjunction with synthetic 2D (s2D) imaging compared to using DM alone. These results underscore the significance of incorporating DBT into BC screening practices. Nevertheless, more research with longer observation periods and multiple screening rounds is necessary to reach definitive conclusions regarding the influence of enhanced cancer detection on interval cancer rates and, perhaps, on BC mortality.
Limitations
This meta-analysis had several limitations. First, the inclusion of only 11 retrospective or prospective studies with moderate-to-high levels of heterogeneity limited the findings despite the study’s strict adherence to the recommended methodological rigor. Second, the studies included in the analysis solely focused on the assessment of initial detection measures, neglecting to provide any insights into the potential long-term health consequences associated with DBT screening. Hence, the potential impact of DBT on reducing BC mortality through incremental screening remains unknown. Furthermore, a significant portion of the data presented pertains to prevalent (first-round) DBT screening. It is probable that screening outcomes vary with the devices, equipment and processes used for DBT and DM. As a result, it is plausible that the findings of our study may have limited generalizability. In addition, the fact that only English-language articles were included may have limited the scope of our meta-analysis. Lastly, it should be noted that the small number of studies and patient populations included in this analysis limits the generalizability of the findings to a larger population. Consequently, additional research is necessary to investigate this issue further.
Conclusions
The present meta-analysis offers an up-to-date comparison of the DBT and DM screening techniques, with the results suggesting that DBT exhibits superior performance compared to DM in terms of increased cancer detection, sensitivity and recall rate in screening and diagnostic scenarios. The potential improvement in CDR and reduction in recall rate associated with DBT may indicate a more effective approach to screening or diagnostic assessment for women with dense and non-dense breast tissue. Hence, the findings presented in our study have the potential to contribute to screening policy development, research planning and individual screening recommendations. However, it is crucial to note that further studies with extended follow-up periods and multiple screening rounds are required to establish conclusive findings regarding the impact of improved cancer detection on interval cancer rates and, potentially, on BC mortality.
Supplementary data
The Supplementary materials are available at https://doi.org/10.5281/zenodo.10803079. The package includes the following files:
Supplementary Table 1: Standardized questionnaire for assessment of risk of bias of included studies.