Abstract
Introduction Machine learning in computer-assisted diagnostics improves sensitivity of image analysis and reduces time and effort for interpretation. Compared to standard mammograms, a thermal scan is easily scalable and is a safer screening tool. We evaluate the performance of Thermalytix (an automated thermographic screening algorithm) compared with other standard breast cancer screening modalities.
Methods A prospective multicentre study was conducted to assess the non-inferiority of the sensitivity of Thermalytix (test device) to that of standard modalities in detecting malignancy in subjects presenting with symptoms suggestive of breast cancer. Standard screening modalities and Thermalytix were obtained and interpreted independently in a blinded fashion. A receiver operating characteristic (ROC) curve was constructed to identify the best cut-off point, and a non-inferiority margin of 10% was used to demonstrate non-inferiority.
Results We recruited 258 symptomatic women who first underwent a thermal scan, followed by mammogram and/or ultrasound. At the Youden index of the ROC curve, the test device had a sensitivity of 82.5% (95% CI 73.2 to 91.9) and specificity of 80.5% (95% CI 75.0 to 86.1) as compared with diagnostic mammogram, which had a sensitivity of 92% (95% CI 80.7 to 97.8) and specificity of 45.9% (95% CI 34.3 to 57.9) when Breast Imaging-Reporting and Data System (BI-RADS) 3 was considered as test-positive. The overall area under the curve (AUC) was 0.845. For women aged <45 years, the test device had a sensitivity and specificity of 87.0% (95% CI 66.4 to 97.2) and 80.6% (95% CI 72.9 to 86.9), respectively. For women aged ≥45 years, the sensitivity and specificity were 80.5% (95% CI 65.1 to 91.2) and 86.5% (95% CI 78.0 to 92.6), respectively.
Conclusion We evaluated Thermalytix, a new AI-based modality for detecting breast cancer. The high AUC both in women younger than 45 years and in women aged 45 years and older shows the potential of Thermalytix to be a supplemental diagnostic modality for all ages. Further evaluation in a larger sample is needed.
Trial registration number CTRI/2017/10/010115.
- breast tumours
- radiology & imaging
- public health
Data availability statement
All data relevant to the study are included in the article or uploaded as supplementary information. Data are available upon request.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
This was the first-ever systematic blinded study that prospectively compared a computer-assisted thermal technology with standard imaging modalities to detect breast abnormalities.
This is also the first-ever study evaluating a computer-assisted thermal technology to detect breast abnormalities in a study population that was not previously diagnosed with breast cancer through standard imaging modalities.
The study also gives an indication of the benefit of combining thermal imaging with breast ultrasound.
Women who screened positive on mammography/ultrasound were confirmed by biopsy, but lesions identified by Thermalytix but not reported on mammography as positive were excluded from biopsy, which may have impaired the sensitivity and specificity.
As the study was performed only on symptomatic women, a large-scale screening study needs to be undertaken to evaluate the device in an asymptomatic population as well.
Introduction
Globally, 18 million new cases of cancers were recorded in 2018. Nearly 2.1 million (11.6%) cases were attributed to breast cancer alone,1 the leading cause of cancer-related death in women.1 Breast cancer among young women has been on the rise and is increasing cancer-related mortality despite advances in diagnosis and treatment.2 Breast cancers must be detected at an early stage to increase progression-free survival. Effective screening of women in their reproductive age group needs regular community surveillance for breast cancer, with tools that exceed the sensitivity or specificity of clinical or self-breast examination.3
A screening mammogram reduces breast cancer mortality by 23% but has major limitations in women with high breast density and in young women. Women with dense breasts have a higher risk for cancer4 and nearly an 18-fold risk for interval cancer.3 A screening ultrasound has higher sensitivity and specificity than a mammogram when performed on women with dense breasts. However, ultrasound requires skilled radiologists to perform the scans for arriving at any clinical decision.5 In certain countries, like India, the use of ultrasound is regulated (Pre-Conception and Pre-Natal Diagnostic Techniques Act, 1994), which is a major impediment to scaling up ultrasound for community screening. Artificial intelligence (AI) can alleviate subjectivity and the need for expertise in the interpretation of medical images.6 The introduction of machine learning (ML)-enabled digital mammograms and automated breast ultrasound devices has alleviated the technical burden of performing higher volumes of breast cancer screening, but the cost is prohibitive for adoption in larger programmes.7
Though thermography was introduced in the latter part of the 20th century as an adjunct to screening mammograms, it has not been fully adopted.8 Thermography along with an ML classifier, trained by giving extracted features, specifically corresponding to metabolic, physical, structural, symmetrical and vascular properties of the tumour could reduce the false-positive and false-negative rates.9
The first clinical study to evaluate the performance of thermography was done in 198010 in over 58 000 symptomatic women. Among these women, 1245 women diagnosed normal or benign by conventional screening exhibited a questionable thermal anomaly on thermography.10 More than a third of these women developed cancer in the 5-year follow-up period.10 This study generated a spike in thermography-based breast cancer screening studies using primitive technology. Thermography was later discredited and practitioners lost enthusiasm for it, as interpretations of thermal images suffered from subjectivity. Although approved as an adjunctive modality for breast cancer screening by the Food and Drug Administration (FDA) in 1982, thermography was not widely accepted due to the complexity involved in interpreting thermograms with the naked eye and the resulting high false positivity. However, in recent years, infrared breast thermography has been re-emerging as a modality with promising results for detecting breast cancer because of better thermal sensors, improved algorithms and the reduction of manual error through AI.11
The sensitivity of thermography in detecting breast cancer in previous studies ranged from 53% to 100%. In the study that yielded 100% sensitivity, the study sample had prior mammography, ultrasonography (USG) and fine needle aspiration (FNA) and was scheduled for surgery, thus introducing a bias.12 Similarly, the disease characteristics of most study samples were known, as they included women with suspicious breast lesions identified by mammography, ultrasound or magnetic resonance imaging (MRI),12–14 women for whom a diagnostic biopsy was recommended,15 or women diagnosed with breast cancer.16–18 Also, in some studies, the comparator test/reference standard tests were performed before thermography,8 which could result in an interpretation bias. In other studies, the order of testing was appropriate, that is, the index test was performed before the comparator test, but the study design involved retrospective incorporation of healthy and cancer cases only while excluding Breast Imaging-Reporting and Data System (BI-RADS) 2 and 3 cases19 or including only abnormal cases.20 In one of the studies, only thermography-positive tests were followed up with further investigations, while the journey of patients with a normal thermogram was not mapped.21 In another study where thermography was performed before other tests, thermography had a high sensitivity of 95.24%, although in a small sample of 54 symptomatic women.22 To the best of our knowledge, this study addresses all these limitations.
In this study, we evaluated a novel AI-based thermography solution and compared it with standard-of-care tests in a multisite, prospective cohort of women with symptomatic breast lesions, in which the index test was performed before the comparator tests. We hypothesised that using ML algorithms to analyse and interpret thermograms would result in a test that is non-inferior to mammography in general and superior to mammography in women with dense breasts (American College of Radiology categories C and D). The test device Thermalytix, a computer-assisted diagnostic (CAD) platform, was compared against mammography and combined with ultrasound for dense breasts. While most ML algorithms have been evaluated retrospectively in a non-clinical setting, this study is the first of its kind to evaluate a novel ML algorithm prospectively in a clinical setting.23
Methods
Study design and study population
A prospective two-centre cohort study was conducted at cancer care centres located in Bangalore, South India. The study was registered in the Clinical Trial Registry. The clinical study protocol approved by the Ethics Committee is submitted as an online supplemental file.
At a tertiary care hospital, younger women with breast symptoms also seek care. Hence, we included women aged 18 years and above with a report of any one of the following breast symptoms: palpable lump, swelling or mass in breast; persistent breast pain or tenderness or increase in breast size that is unrelated to menstrual cycle; inverted or tender nipple; nipple that is painful, scaly or with discharge (clear or bloody); skin on breast that is dimpled, red, blotchy, prickly, itchy or has changed in the texture; thickening in or around the breast or underarm area or any other symptoms leading to suspicion of breast cancer.
Pregnant and/or lactating women or those who had undergone either lumpectomy or mastectomy earlier, or had chemotherapy for any cancer in the last 2 weeks or had active illness, psychological and/or pathological conditions that could interfere with study participation were excluded. All participants provided written informed consent for participation before performing the procedures in the study.
All women underwent a clinical breast examination before they presented to the radiology department to undergo a diagnostic mammogram and/or a breast ultrasound. All patients were approached by a clinical research staff member to seek consent for thermal imaging prior to the intended imaging procedure. Before thermal imaging, the upper body of the participant was cooled for around 15 minutes in a room maintained between 22°C and 24°C using an external cooler. This precooling ensures accurate interpretation by the AI-based software, irrespective of the ambient temperature in the screening room.
Thermal images from the participant obtained from five different views were uploaded into the AI-based software system and results were generated. Any suspicious lesion as per clinician’s discretion was biopsied. Results of all standard screening tests and biopsy remained blinded till the end of study, to the technician obtaining the thermal image and also to the analytical team recording the Thermalytix score from the software.
Patient and public involvement statement
Patients and/or the public were not involved directly in the design, or conduct, or reporting or dissemination plans of this research. Patient consent was taken for enrolling them into the study, and information on the possibilities of publication of the study results was included in the informed consent form.
A brief overview of study device
Thermalytix is a CAD software built on AI algorithms to automate interpretation of thermal images to generate a report with quantitative scores. An earlier conducted retrospective study using Thermalytix for automated diagnosis based on breast thermal images had shown the feasibility of automated interpretation.24
The study device has an infrared camera connected to a laptop used to access the patented Thermalytix software (Patent numbers: US 9898817, US 10307141, US 10055542, US 9622698). The thermal image is a representation of temperature variations on the skin captured using an infrared camera with a resolution of 320×240 pixels and thermal sensitivity of 0.02 °C (degree centigrade).9 The software runs a pretrained AI-model to analyse breast thermal images and to provide a score that classifies a specific lesion as malignant or benign. The generated score is a real number between 0 and 1, indicating the probability of malignancy, with 0 being normal and 1 having a high probability of malignancy. Additional parameters such as thermal symmetry between the breasts, temperature and structure of blood vessels were also available in the report in addition to an overall score (algo2) and a separate vascularity score, to indicate any asymmetry in vascular structures between the breasts. Figure 1 represents the picture of the Thermalytix device.
Test device—Thermalytix and its accessories.
Outcomes and statistical analysis
The conclusions drawn from mammogram, USG and the thermal AI score were evaluated against the histopathology results for sensitivity and specificity. Continuous variables are described using the median, and categorical variables using frequencies and percentages. The sensitivity of Thermalytix was compared with that of standard modalities to assess non-inferiority. The primary objective of non-inferiority was assessed by constructing a 95% CI around the difference in sensitivities (Thermalytix minus standard modalities). If the lower limit of the 95% CI was greater than −10%, then non-inferiority was established. In women aged 45 years and above, we compared the score to mammography alone, and in women younger than 45 years of age, we combined ultrasound and/or mammography. During the study, any subject with a score greater than an arbitrary cut-off point of 0.5 was considered as malignant. On study completion, we constructed a receiver operating characteristic (ROC) curve to identify the best cut-off point at the Youden index. ROC curve analysis on Thermalytix scores was performed using this threshold.
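The non-inferiority rule described above can be sketched in a few lines. This is only an illustration: the source does not state how the interval around the difference was computed, so the unpaired Wald interval used here is an assumption, the function name is hypothetical, and the counts in the usage line are made up rather than taken from the study.

```python
import math

def noninferiority_check(tp_test, n_test, tp_ref, n_ref, margin=0.10, z=1.96):
    """Wald 95% CI for (test sensitivity - reference sensitivity).

    Assumption: the two sensitivities are treated as independent
    proportions; the paper does not specify its interval method.
    """
    p_test = tp_test / n_test
    p_ref = tp_ref / n_ref
    diff = p_test - p_ref
    se = math.sqrt(p_test * (1 - p_test) / n_test + p_ref * (1 - p_ref) / n_ref)
    lower, upper = diff - z * se, diff + z * se
    # Non-inferiority is declared when the lower limit stays above -margin.
    return {"diff": diff, "lower": lower, "upper": upper,
            "non_inferior": lower > -margin}

# Hypothetical counts, for illustration only (not study data).
result = noninferiority_check(tp_test=180, n_test=200, tp_ref=184, n_ref=200)
print(result)
```

If both tests are read on the same women, a paired method would be the more natural choice; the unpaired form is shown only because it is the simplest to follow.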
Monitoring
Both clinical study sites were monitored by an independent team of clinical monitors. All procedures were monitored for compliance to protocol and Good Clinical Practices.
Results
Between 21 September 2017 and 31 July 2018, 326 women who had symptoms on a clinical breast examination were consecutively recruited from two centres in Bangalore, Karnataka. Figure 2 provides an overview of the study flow.
Flowchart illustrating the number of women enrolled and included in the analysis. *In the study, 124 women underwent mammography; ˆ221 women underwent breast USG; and Thermalytix was performed on all 258 women. CRF, case report form; FN, false negatives; FNAC, fine needle aspiration cytology; FP, false positives; TN, true negatives; TP, true positives; USG, ultrasonography.
Among all women recruited at the two centres during the study period, 68 women could not be included in the study analysis for the following reasons: 4 women did not meet inclusion criteria, 44 women were not included in the study as they did not have a diagnostic conclusion in the case report form, 5 women did not undergo a confirmatory test after an inconclusive imaging test, biopsy reports were unavailable for two women, and in 13 women, the thermal images obtained could not be used for interpretation due to insufficient cooling. Among the cohort of 258 women, 63 women (24.4%) were diagnosed with malignant breast cancer. Table 1 describes basic demographics and symptoms of all women by diagnosis (malignant or benign).
Study population characteristics based on final diagnosis
The median age of the cohort was 41 years with a range between 18 and 80 years. The median age among women diagnosed with cancer was 55 years (range: 26–78 years) compared with 40 years (range: 18–80 years) among women without cancer, p<0.001. Detailed information on age categorisation in this cohort is provided in table 2. Among the cohort of 258 women, 149 women (57.8%) were aged <45 years, 92 women (35.7%) had attained menopause and 38 women (14.7%) had a family history of cancer. A palpable lump was the most common symptom on presentation, reported by 71.7% of the study population, followed by breast pain/tenderness unrelated to the menstrual cycle, reported by 57.0% of women.
Age categorisation in recruited patients
In the entire cohort of 258 women, 124 (48.1%) women underwent diagnostic mammograms and 221 (85.7%) had a breast USG. Among them, only 67 women had a biopsy for the suspicious lesion detected either on mammogram, or USG or both. Four women were disease-negative on biopsy even though they had a positive scan report in USG/mammogram. Thirty-eight women with breast cancer (60.3%) had attained natural menopause in the disease-positive group compared with 54 women (27.7%) in disease-negative group (p<0.001). Other significant variables that differed between the disease-positive and disease-negative groups were lump, swelling or mass in breast, breast pain or tenderness unrelated to menstrual cycle (p<0.001). Among the 63 women who were diagnosed with malignant breast lesions, lesion size information was available for 53 women. Of these, 15 women had T1 lesions that measured 2 cm or less on a breast USG.
The diagnostic mammogram had a sensitivity of 92% (95% CI 80.7 to 97.8) and specificity of 45.9% (95% CI 34.3 to 57.9) assuming BI-RADS (Breast Imaging-Reporting and Data System) 3 as test positive (table 3). The initial sensitivity of the test device, with cut-off score of 0.5, for detecting malignancy, was 74.6% (95% CI 63.9 to 85.4) and specificity was 82.1% (95% CI 76.7 to 87.4). With the post hoc cut-off score of 0.41, the sensitivity of the test device was 82.5% (95% CI 73.2 to 91.9) and specificity was 80.5% (95% CI 75.0 to 86.1) (figure 3 and table 3).
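As a rough check on how such intervals arise, the sketch below computes a simple Wald (normal approximation) 95% CI for a proportion. The counts are back-calculated assumptions, not figures reported in the article: 52 of 63 malignant cases test-positive and 157 of 195 non-malignant cases test-negative (195 being 258 minus 63) are the integers closest to the reported 82.5% and 80.5%. Whether the authors used this interval method is also an assumption.

```python
import math

def wald_ci(successes, total, z=1.96):
    """Proportion with a Wald (normal-approximation) confidence interval."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width

# Back-calculated counts (assumed, see lead-in): cut-off 0.41.
sens = wald_ci(52, 63)     # malignant cases flagged by the test device
spec = wald_ci(157, 195)   # non-malignant cases cleared by the test device

for name, (p, lo, hi) in (("sensitivity", sens), ("specificity", spec)):
    print(f"{name}: {100 * p:.1f}% (95% CI {100 * lo:.1f} to {100 * hi:.1f})")
# prints:
# sensitivity: 82.5% (95% CI 73.2 to 91.9)
# specificity: 80.5% (95% CI 75.0 to 86.1)
```

Under these assumed counts the Wald interval agrees with the values quoted for the test device at the 0.41 cut-off.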
ROC curve with sensitivity and specificity at a cut-off of >0.41 for Thermalytix score and the AUC was found to be 0.845. AUC, area under the curve; ROC, receiver operating characteristic.
Comprehensive table showing the results of mammogram and Thermalytix against biopsy findings
In women younger than 45 years of age, the sensitivity of the test device was 87.0% (95% CI 66.4 to 97.2) and specificity was 80.6% (95% CI 72.9 to 86.9) at a cut-off score of 0.43, with an area under the ROC curve (AUROC) of 0.846. For women aged 45 years or older, we observed a sensitivity of 80.5% (95% CI 65.1 to 91.2) and a specificity of 86.5% (95% CI 78.0 to 92.6) at a cut-off value of 0.41, with an AUC of 0.875 (figure 4). The test device was within the non-inferiority margin of 10% in sensitivity as compared with mammography.
ROC curves find optimal cut-off points of Thermalytix score for breast cancer diagnosis in women aged (A) 45 years or younger and (B) above 45 years. AUC, area under the curve; ROC, receiver operating characteristic.
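Choosing a cut-off at the Youden index means picking the threshold that maximises J = sensitivity + specificity − 1. A minimal sketch follows, assuming a case is test-positive when its score is strictly greater than the cut-off (as in the ">0.41" wording of the figure legend). The function name and the scores and labels are synthetic illustrations, not study data.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity, J) maximising
    J = sensitivity + specificity - 1.

    A case is called positive when score > cutoff.
    labels: 1 = malignant, 0 = not malignant.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= cutoff)
        sens, spec = tp / positives, tn / negatives
        j = sens + spec - 1
        if best is None or j > best[3]:
            best = (cutoff, sens, spec, j)
    return best

# Synthetic scores in [0, 1] with known outcomes, for illustration only.
scores = [0.10, 0.20, 0.30, 0.35, 0.45, 0.50, 0.70, 0.90]
labels = [0, 0, 0, 0, 1, 0, 1, 1]
cutoff, sens, spec, j = youden_cutoff(scores, labels)
print(cutoff, sens, spec, j)
```

Because the cut-off is tuned on the same data it is then evaluated on, the resulting sensitivity and specificity tend to be optimistic, which is one reason a post hoc threshold is usually confirmed in an independent sample.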
Among the 124 women who underwent mammograms, 30 were tagged as BI-RADS 0 (inconclusive). When they underwent a USG examination, three women were reported as positive and were confirmed as having breast malignancies on biopsy. All three of these malignancies were also detected by Thermalytix as positive for malignancy.
To evaluate possible clinical workflows that can complement the gaps in current modalities, we computed the sensitivity and specificity of a workflow in which a person found positive on Thermalytix undergoes a breast ultrasound examination. This workflow is analogous to standard practice followed in India, where a positive mammogram is sent for diagnostic correlation with an ultrasound. If the test device were used instead of a mammogram to screen women before referring them for an ultrasound, the sensitivity of the combined modality would be 81.0% and the specificity would be 96.4%.
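The two-step workflow above amounts to a serial rule: a woman is counted as positive only if the first test is positive and the follow-up test is also positive. A minimal sketch of how sensitivity and specificity could be computed for such a rule is shown below. The function name and the per-patient results are hypothetical, and the article does not describe its exact computation.

```python
def serial_workflow(first_pos, second_pos, truth):
    """Sensitivity and specificity of a serial rule in which a case is
    positive only when both the first and the follow-up test are positive.

    All three arguments are equal-length lists of booleans;
    truth is the reference standard (True = malignant).
    """
    combined = [a and b for a, b in zip(first_pos, second_pos)]
    tp = sum(1 for c, t in zip(combined, truth) if c and t)
    tn = sum(1 for c, t in zip(combined, truth) if not c and not t)
    n_pos = sum(1 for t in truth if t)
    n_neg = len(truth) - n_pos
    return tp / n_pos, tn / n_neg

# Hypothetical per-patient results, for illustration only.
first  = [True, True, True, False, True, True, False, False]
second = [True, True, False, True, True, False, False, True]
truth  = [True, True, True, True, False, False, False, False]
sensitivity, specificity = serial_workflow(first, second, truth)
print(sensitivity, specificity)
```

A serial rule can never have higher sensitivity than the first test alone, and never lower specificity, which matches the direction of the reported change from 82.5% and 80.5% to 81.0% and 96.4%.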
Discussion
Nearly a quarter of the women who had symptomatic breast lesions had malignancy confirmed. The study was conducted in two centres, both located in Bangalore, a city that has reported a recent spike in breast cancers among women.25 Thermalytix performed non-inferior to mammography. Its sensitivity or specificity was also similar to ultrasound in women aged 45 years or younger, the typical age group with dense breasts. This study has several key strengths. We prospectively screened women coming to a tertiary care hospital with a symptom and demonstrated a feasible workflow for the CAD to perform as an adjuvant to existing breast screening modalities. The study demonstrated the simplicity and scalability of the solution, as Thermalytix tests in both centres were conducted by low-skilled technicians with just 2 days of training. The technicians who captured the thermal images and uploaded them to the Thermalytix test device were blinded to the results of the mammogram and/or the ultrasound, which were conducted after the Thermalytix test. The tool automatically computed the scores on the thermograms. We derived a new cut-off for differentiating malignant lesions from benign lesions by constructing an ROC curve based on the prospective study results. The new cut-off improved the sensitivity and negative-predictive value of CAD in breast cancer screening. Because the device is simple to use and easy to integrate, with automated reporting, the study gives confidence to adopt it in any setting. It further restores faith in the use of thermography for screening at greater frequency among women across all age groups.
Thermalytix also has the potential to be used as a supplement to mammography in cases of inconclusive diagnosis (eg, women with dense breasts). Thermalytix could provide a definitive diagnosis in women reported with BI-RADS 0 on mammography. Thus, based on this clinical study, Thermalytix could be recommended as a first-line modality for diagnostic screening. Those who are found positive by Thermalytix can be sent to an ultrasound test for confirmatory diagnosis before recommending for histopathology examination. This workflow complements the gap in current breast cancer screening procedures.
Our study did have certain limitations. The inability to include asymptomatic women seen during the study period may have affected our sensitivity or specificity. A predetermined stratum for recruiting symptomatic and asymptomatic women with dense breasts (American College of Radiology (ACR) categories C and D) could have increased their sample numbers and provided a better insight into utility in this group. We were limited to recommending a biopsy only in women with a suspicious lesion on mammography and/or ultrasound. This may have led to missing women with early malignant lesions that were not picked up on mammography. Further studies may have to be done to establish the true sensitivity and specificity for picking up early stage breast cancer.
Globally, the incidence of breast cancer is on the rise, with breast cancer in women aged between 15 and 49 years in low and middle-income countries (LMICs) like India being twice that seen in developed countries.26 Younger women predominantly have dense breasts, in which a mammogram has low sensitivity (62% to 68%).27 In this study, the AUROC for the subgroup of women aged 45 years or younger was 0.846, and the test was sensitive in picking up malignancy in symptomatic women.
For successful implementation of any population-based breast cancer screening programme, it is important for a healthcare system to provide an accurate diagnosis (benign vs malignant), for women with clinically detectable disease.28 Mammography-based screening programmes are expensive for developing and under-developed countries.29 According to the 2003 World Health Survey, only 2% of women aged 40–69 years had received any breast cancer screening in LMIC.30 This low yield of screening could be attributed to many factors including lack of accessible, affordable solutions to detect breast cancers at an early stage.31
Task sharing and task shifting initiatives adopted in many LMICs employ low-skilled personnel to perform traditionally complex procedures.31 32 The focus on population screening in LMICs, currently, is to downstage the disease at presentation rather than establishing a mortality benefit.28 Compared with the use of mammograms and/or ultrasound, Thermalytix is a feasible and efficient system. A high-resolution thermal camera can be procured at US$10 000 as compared with a 2D mammogram that costs US$350 000.33 Not only does the Thermalytix device cost less than a 2D mammography machine, it also requires minimal infrastructure. It does not need radiation protection, is easily portable, weighs less than 3 kg and can be accommodated in a cabin-size bag. It takes less than 15 minutes to set up the screening area using a foldable tripod, a laptop housing the software to analyse thermal images and a printer to generate the quantitative report. Furthermore, the test can be performed easily by even low-skilled personnel. As the test is performed in a closed room/booth, and the technician does not see or touch the participant, the test ensures privacy and, thus, reduces the inhibition a woman may have in taking a test that assesses her private parts. This is a very important factor for adoption in a country like India.
Thermalytix is a solution that can be deployed even during an emergency such as the pandemic (COVID-19). This solution could provide screening in the population as the device is a no-see, no-touch, privacy aware solution. Even maintaining safe social distance between the operator and the woman seeking breast cancer screening is feasible, as the operator need not come in close contact to instruct the subject wanting to be screened.
Conclusion
While screening mammograms help to reduce mortality due to breast cancer, the technique has some limitations. Routine mammography is not feasible in a community setting due to its high cost, the challenge of accessibility and the lack of experienced professionals required to interpret the images. In this study, we evaluated the clinical performance of a new device called Thermalytix, a modality based on AI over thermal imaging, which showed high sensitivity for detecting breast cancer in women who presented with symptomatic breast lesions. Thermalytix, with a high AUC of 0.846 and 0.875 for women <45 years and ≥45 years, respectively, is a promising screening tool among women in all age groups. Further studies with a larger sample size are required to evaluate the efficacy and performance of Thermalytix for its definitive role in routine screening. New screening modalities like Thermalytix, a CAD platform that demonstrated non-inferior sensitivity and specificity to mammogram and ultrasound in women with symptomatic breast lesions, are likely to be used more extensively in the near future.
Ethics statements
Patient consent for publication
Ethics approval
Ethical approval for the study was gained from the Narayana Health Medical Ethics Committee. Ethics committee approvals were obtained from both the institutions prior to the start of the study (EC approval number: NHH/MEC-CL-2017-466).
Acknowledgments
The authors would like to thank the sponsor for financial support, and Dr Geetha Manjunath and Dr Lakshmi Krishnan for reviewing and providing valuable suggestions and comments in shaping this article. We would also like to thank Srirang Ramamoorthy for the constant support in the trial and the patients who participated in the trial. We would like to thank Ms Mala Srivastava and her team from Nextvel Consulting for independent data management and coordination of the multi-site clinical study.
References
Footnotes
Contributors Conceptualisation, study design and overall content guarantor: Alben Sigamani. Clinical inputs and patient resources: Akshita Singh. Investigations (radiological imaging): Venkatraman Bhat and Sudhakar S. Statistical analysis: ArunKumar Namachivayam. Drafting of the manuscript: Charitha Gangadharan, Candice Pulchan. Accountable for all aspects of the work: all the authors have read and approved the final version for publication.
Funding The study was sponsored by Niramai Health Analytix which manufactures the device being tested in this study, Thermalytix. While the sponsors provided inputs to the study design that was submitted and approved later by an independent ethics committee, the study was conducted with no interference from the sponsors. Furthermore, representatives from the sponsor had only view-access to the manuscript before the submission. The writing of the manuscript was not influenced by the sponsors and only the authors were involved in the process of writing and publication. Award/grant number: MST2017
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.