Abstract
Objective To evaluate the risk factors and construct a nomogram model for the prognosis of primary liver cancer in the elderly based on the data from the US SEER database.
Methods The latest data of patients with primary liver cancer were extracted from the SEER database using SEER*STAT software, and the required variables were included. The data were screened and then divided into a training cohort and a validation cohort. A nomogram model was constructed by screening the variables through univariate and multivariate Cox analysis. The C-Index, ROC and calibration curves were used for model evaluation.
Results A total of 10 824 eligible cases from 2004 to 2017 were extracted, among which 7757 cases were included in the training cohort and 3247 in the validation cohort. The C-Index of the model was 0.747 in the training cohort and 0.773 in the validation cohort. The 3-year areas under the curve (AUCs) of the training and validation cohorts were 0.760 and 0.750, respectively, and the 5-year AUCs of the two cohorts were 0.761 and 0.748. The calibration curves showed an ideal calibration of the constructed model.
Conclusions The nomogram model constructed following Cox regression analysis showed moderate calibration and discrimination, and can provide a reference, to a certain extent, for the future clinical management of primary liver cancer in the elderly.
- Cancer genetics
- Adult oncology
- Gene therapy
Data availability statement
Data are available on reasonable request. The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
STRENGTHS AND LIMITATIONS OF THIS STUDY
A large and sufficient number of elderly cases with liver cancer were collected from the SEER database.
A novel and ideal prognostic model was constructed for the elderly patients with liver cancer.
Selection bias might exist, because all the cases were retrieved from the same database.
Some of the classifications carried out in the SEER database were not specific enough.
Information such as ancillary tests was absent from the SEER database.
Introduction
Primary liver cancer is currently the sixth most common cancer worldwide and the fourth leading cause of cancer-related death globally according to epidemiological surveys, posing a major threat to human health.1 2 Furthermore, many studies have pointed out that although middle-aged (30–59 years) and young (<30 years) patients with primary liver cancer are not uncommon worldwide, the average age at diagnosis is 60 years. In addition, in contrast to the yearly decrease of the age-standardised incidence rate (ASR) among young patients, the incidence in elderly patients has continuously increased in more than half of the countries and regions during the last 30 years.3–5 Global population expansion and increasing ageing, as well as obesity, diabetes, overmedication and the lagging effects of hepatitis B virus (HBV) infection in the elderly, may be responsible for the high or even increasing ASR in elderly patients with primary liver cancer, imposing a heavy burden on the health sectors of all countries.6–8 Surgery remains the first choice for the treatment of primary liver cancer. Therefore, given the epidemiological characteristics and treatment modalities of primary liver cancer, it is necessary to accurately assess the prognosis of the disease in elderly patients to guide clinical practice. However, the different pathological types and the heterogeneity of the disease still make its prognostic assessment difficult.
Recently, the nomogram model has gained widespread popularity owing to its convenient modelling method, its ability to incorporate multiple variables and its superior predictive performance over traditional TNM (tumour, node, metastasis) staging.9 10 This study aimed to construct a nomogram model, based on the SEER (Surveillance, Epidemiology, and End Results) database, to analyse the risk factors of primary liver cancer in elderly patients and to predict the prognosis of the disease. The performance of the model was assessed by tests of discrimination and calibration, with the aim of establishing an assessment system to support clinical practice, such as the treatment of elderly patients with primary liver cancer.
Methods and data
Patient and public involvement
No patients were involved.
Case selection
Case data of primary liver cancer with complete follow-up records were selected from the 2004–2017 SEER database (SEER research data, 18 Registries, November 2019 Sub (2000–2017)) using SEER*Stat V.8.3.6.
Inclusion criteria:
Ethnic groups are Asians, Pacific Islanders, American Indians and Alaskans.
The main site of primary liver cancer is liver or intrahepatic bile duct (IBD).
The histological types of primary liver cancer are intrahepatic cholangiocarcinoma (ICC), hepatocellular carcinoma (HCC) and associated liver cancer (combined hepatic carcinoma, CHC).
Exclusion criteria:
Patients under 65 years of age.
Incomplete follow-up records.
Non-tumour-related death.
The extracted variables were race, year of diagnosis, age, sex, primary site, histological type, grade, TNM stage, tumour size, surgery on the primary site (including photodynamic therapy, percutaneous ethanol injection and radiofrequency ablation, etc), survival time, cause of death and survival status. Among them, patients aged 65 years or older were selected; Asians and Pacific Islanders, American Indians and Alaskan natives were grouped together in the race variable as 'Asians and others'; liver or IBD was selected as the primary site; and ICC, HCC and CHC were selected as the histological types.
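The site, histology, age, follow-up and cause-of-death filters described above can be sketched in a few lines of Python. This is only an illustration: the study performed case selection in SEER*Stat, and every field name, category label and record below is hypothetical rather than taken from a real SEER export. The race criterion is deliberately left out of the sketch.

```python
# Field names and category labels are hypothetical; a real SEER*Stat export
# uses its own column headers and coded values.
ELIGIBLE_SITES = {"liver", "intrahepatic bile duct"}
ELIGIBLE_HISTOLOGY = {"ICC", "HCC", "CHC"}


def is_eligible(case):
    """Apply the site, histology, age, follow-up and cause-of-death filters."""
    return (
        case["primary_site"] in ELIGIBLE_SITES
        and case["histology"] in ELIGIBLE_HISTOLOGY
        and case["age"] >= 65                        # drop patients under 65 years
        and case["survival_months"] is not None      # drop incomplete follow-up
        and case["cause_of_death"] != "non-tumour"   # drop non-tumour-related deaths
    )


# Four made-up records, for illustration only.
cases = [
    {"id": 1, "age": 72, "primary_site": "liver", "histology": "HCC",
     "survival_months": 14, "cause_of_death": "tumour"},
    {"id": 2, "age": 58, "primary_site": "liver", "histology": "HCC",
     "survival_months": 40, "cause_of_death": "alive"},
    {"id": 3, "age": 80, "primary_site": "intrahepatic bile duct", "histology": "ICC",
     "survival_months": None, "cause_of_death": "alive"},
    {"id": 4, "age": 66, "primary_site": "liver", "histology": "CHC",
     "survival_months": 30, "cause_of_death": "non-tumour"},
]

eligible_ids = [case["id"] for case in cases if is_eligible(case)]
print(eligible_ids)
```

Only the first record passes: the second is under 65 years, the third has no follow-up time and the fourth died of a non-tumour cause.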
Statistical processing
The survival endpoints were defined as 3-year and 5-year survival. The cut-off values for tumour size were determined with X-Tile software, which uses an 'enumeration method': candidate values are tested in turn as cut-offs, and the value yielding the smallest p value is taken as the best cut-off. On this basis, tumour size was divided into low-risk, medium-risk and high-risk groups of <46 mm, 46–81 mm and >81 mm, respectively. All cases were then randomly assigned to a training or a validation cohort at a ratio of 7:3 using SPSS V.18.0 with the random number seed 20200222, followed by the collection of baseline information. Univariate and multivariate (Forward: LR) Cox analyses were performed using R software or SPSS to screen statistically significant variables for nomogram construction, from which the C-Index, ROC curves and the area under the curve (AUC) were calculated. Calibration curves of the model for 3 and 5 years were plotted with R software after bootstrap resampling 1000 times. A p<0.05 was considered statistically significant.
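The tumour-size grouping and the 7:3 random assignment described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the study used X-Tile and SPSS, and Python's random generator will not reproduce the SPSS assignment even with the same number. The function names are hypothetical, and treating both 46 mm and 81 mm as belonging to the middle group is an assumption read from the stated ranges.

```python
import random


def size_group(size_mm):
    """Map a tumour size in mm to one of the three reported size ranges."""
    if size_mm < 46:
        return "<46 mm"
    if size_mm <= 81:          # assumption: both boundaries fall in the middle group
        return "46-81 mm"
    return ">81 mm"


def split_cohorts(case_ids, train_fraction=0.7, seed=20200222):
    """Shuffle case identifiers with a fixed seed and split them 7:3."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Illustration with 1000 placeholder identifiers, not the study's cases.
train_ids, validation_ids = split_cohorts(range(1000))
print(len(train_ids), len(validation_ids))
print(size_group(30), size_group(60), size_group(120))
```

Fixing the seed makes the split repeatable within one software environment, which is the practical purpose of reporting it.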
Results
Clinical characteristics of the cases
A total of 10 824 elderly cases with primary liver cancer were extracted in accordance with the screening conditions, including 7757 in the training cohort and 3247 in the validation cohort. The majority of patients were male (67.5%) and white (71.3%), with the primary site in the liver (87.8%), HCC histology (84.7%), grade II (46.6%), T1 (46.5%), N0 (91.6%) and M0 (88.7%) disease, and no surgery (56.5%) (table 1).
Table 1 Baseline data of the extracted cases
Screening for prognostic risk factors
Univariate Cox regression analysis was performed on the training cohort. Age, sex, race, histological type, grade, TNM stage, surgery and tumour size were statistically significant (p<0.05) and were included in the subsequent multivariate Cox analysis, whereas the primary site was excluded (p=0.232) (table 2). Sex was then excluded from the model by the Forward: LR multivariate Cox analysis (table 3). In the end, age, race, histological type, grade, TNM stage, surgery and tumour size were identified as independent risk factors affecting the prognosis of elderly patients with primary liver cancer, and were used to construct the nomogram prediction model.
Table 2 Univariate Cox analysis
Table 3 Multivariate Cox analysis
Nomogram model construction and verification
The 3-year and 5-year nomogram prediction model for primary liver cancer in the elderly was constructed from the independent risk factors identified in the above analysis. The total score was calculated by summing the scores of each variable to predict the 3-year and 5-year survival rates of patients (figure 1). The most important factor affecting the score in this model was surgery on the primary site, followed by tumour size, TNM stage and age. The C-Index of the model was 0.747 in the training cohort and 0.773 in the validation cohort. The AUC was calculated after plotting the ROC curves of the training and the validation cohorts. Specifically, the AUC was 0.760 (3 years) and 0.761 (5 years) in the training cohort, and 0.750 (3 years) and 0.748 (5 years) in the validation cohort (figure 2). Furthermore, the calibration curves for the training and the validation cohorts showed an ideal calibration for 3-year and 5-year survival prediction in both groups (figure 3). Comparison of the predictive value of the nomogram model with that of the TNM model showed 3-year AUCs of 0.758 and 0.698 (p<0.05), respectively, and 5-year AUCs of 0.750 and 0.609 (p<0.01), respectively (figure 4).
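The way a nomogram turns a fitted Cox model into a total score can be sketched as follows. The usual convention is to rescale each variable's contribution to the linear predictor so that the variable with the widest effect range spans 0 to 100 points, and then to sum the points over variables. The log-hazard values below are invented purely for illustration and are not the study's estimates; the variable names, level labels and function names are likewise hypothetical.

```python
# Hypothetical Cox log-hazard ratios per level (reference level = 0.0).
# These numbers are made up for illustration; they are NOT the study's results.
log_hazard = {
    "surgery": {"transplant": 0.0, "resection": 0.4,
                "local destruction": 0.7, "none": 1.6},
    "tumour_size": {"<46 mm": 0.0, "46-81 mm": 0.3, ">81 mm": 0.6},
    "m_stage": {"M0": 0.0, "M1": 0.8},
}


def build_points_table(coefficients):
    """Rescale each level so the widest-ranging variable spans 0 to 100 points."""
    widest = max(max(levels.values()) - min(levels.values())
                 for levels in coefficients.values())
    table = {}
    for variable, levels in coefficients.items():
        lowest = min(levels.values())
        table[variable] = {level: 100 * (beta - lowest) / widest
                           for level, beta in levels.items()}
    return table


def total_points(patient, table):
    """Sum the points of the patient's level on every variable in the table."""
    return sum(table[variable][patient[variable]] for variable in table)


points_table = build_points_table(log_hazard)
patient = {"surgery": "none", "tumour_size": ">81 mm", "m_stage": "M1"}
print(points_table["surgery"])
print(total_points(patient, points_table))
```

In a real nomogram the total points are then read against a survival axis derived from the baseline survival function at 3 and 5 years; that final mapping is omitted here because it depends on the fitted model.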
Figure 1 Constructed nomogram. CHC, combined hepatic carcinoma; HCC, hepatocellular carcinoma; ICC, intrahepatic cholangiocarcinoma.
Figure 2 Three-year and 5-year survival ROC curves for the training and the validation cohorts. (A) 3-year survival ROC curve for the training cohort. (B) 5-year survival ROC curve for the training cohort. (C) 3-year survival ROC curve for the validation cohort. (D) 5-year survival ROC curve for the validation cohort. AUC, area under the curve; ROC, receiver operating characteristic.
Figure 3 Three-year and 5-year survival calibration curves for the training and the validation cohorts. (A) 3-year survival calibration curve for the training cohort. (B) 5-year survival calibration curve for the training cohort. (C) 3-year survival calibration curve for the validation cohort. (D) 5-year survival calibration curve for the validation cohort.
Figure 4 Comparison of ROC curves between the nomogram model and the TNM model. (A) 3-year nomogram model. (B) 5-year nomogram model. AUC, area under the curve; ROC, receiver operating characteristic; TNM, tumour, node, metastasis.
Discussion
Analysis of the cases revealed that male patients accounted for more than 60% of all elderly patients with primary liver cancer. Statistics show that the mean annual change rate of the disease is higher in men than in women (3.7% vs 2.7%) in the USA.11 In China, a population-based study of hepatic carcinoma in Zhejiang Province demonstrated that the ASR for hepatic carcinoma was 33.24 in men compared with 1.21 in women.12 Differences in lifestyle, including alcohol consumption and smoking, have led to higher cancer rates in men, and different physiological conditions such as hormone secretion, and even genetic differences, may also be responsible for these epidemiological differences.13 Therefore, it has been proposed that gender is a critical biological variable that should be considered in all studies aimed at improving outcomes in carcinoma.14 Analysis of the baseline data also suggested that the population of elderly patients with primary liver cancer was predominantly white, mostly with the primary site in the liver, HCC histological type, grade II (moderately differentiated) and T1 disease, and without lymph node or distant metastasis. Moreover, more than half of the cases in this population were not treated surgically. A possible reason is that most of the patients were over 60 years at the time of diagnosis and had missed the best time to receive radical surgery. In addition, in view of their declining physical function and intolerance to surgery, palliative treatment was chosen for most of these patients.
Based on the univariate and multivariate Cox analyses, several independent risk factors affecting the prognosis of the disease were identified, including age, race, histological type, grade, TNM stage, surgery and tumour size. Sex, though not negligible as previously mentioned, was not a main factor affecting prognosis in this population after comprehensive analysis, which is consistent with several current retrospective studies on hepatic carcinoma.15–17 Some clinical information affecting the operation, such as metastatic cancer, can be reflected in the TNM staging. In terms of histological type, the prognosis of CHC is clearly worse than that of the common HCC, with a lower incidence but a higher degree of malignancy.18 19 Analysis of the age factor revealed that the higher the age group of the patient, the worse the prognosis, suggesting a negative linear trend. The nomogram model also indicated that surgery was the most crucial factor influencing the prognosis of the disease. Although only a small number of patients received liver transplantation, they showed a relatively good prognosis, followed by patients with resection or lobectomy and those with local destruction. In contrast, patients without surgery showed a relatively poor prognosis. This factor alone reduced the 3-year and 5-year predicted survival rates to less than 50%, suggesting that new treatment methods or improved surgical approaches are still urgently needed to improve the prognosis of elderly patients with primary liver cancer. The influence of the other factors on prognosis is broadly in line with the current consensus: the worse the grade, the higher the T stage, the presence of lymph node metastasis, the presence of distant metastasis and the larger the tumour, the worse the prognosis of the patient.
The performance of the established model was then evaluated by the C-Index, ROC curves and calibration curves. A nomogram model is considered to have good discrimination if its C-Index and AUC exceed 0.7.20 21 As both indicators of the model constructed in this study were above 0.7 in both the training and the validation cohorts, and the calibration plots lay close to the reference line, it could be concluded that the model has good discrimination and calibration and hence the capacity to predict the prognosis of the disease.
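For readers unfamiliar with the C-Index, the following minimal Python sketch shows how Harrell's concordance index is commonly computed from survival times, event indicators and model risk scores. It is an illustration under simplifying assumptions (pairs with tied survival times are skipped, tied risk scores count as half), not the R routine used in the study, and the four patients are made up.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs in which the patient who died
    earlier was also assigned the higher risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had the event (events[i] == 1)
            # strictly before patient j's observed time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up follow-up times (months), event flags (1 = death, 0 = censored)
# and model risk scores for four patients.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.6, 0.1]

c_index = harrell_c_index(times, events, risk_scores)
print(c_index)
```

In this toy example five pairs are comparable and four of them are ranked correctly, giving a C-Index of 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why 0.7 is often used as a threshold for acceptable discrimination.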
However, this study also has shortcomings. First, the cases in this study were all from the US SEER database, which is not representative of regions outside the USA and is subject to selection bias. In addition, the case data in this database lacked some important ancillary information related to the diagnosis and treatment of liver cancer, such as CEA, AST and vascular invasion. More importantly, the radiotherapy and chemotherapy information in this database can only be obtained by signing additional agreements, and was not available to us at the time of the study, so we were unable to study the relationship between radiotherapy, chemotherapy, targeted therapy and the prognosis of liver cancer.22
There are also deficiencies in our statistical conclusions. Limited by time and expertise, our model did not reach an ideal state: its AUC is only about 0.75, indicating that there is still room for improvement. This limits the prediction accuracy to a certain extent and reduces the credibility of the predictions. In the future, we will continue to refine our nomogram model to improve its accuracy.
In conclusion, a nomogram model with moderate predictive performance was developed from the case data in the SEER database after univariate and multivariate Cox screening, and could provide a reference for the future diagnosis and treatment of elderly patients with primary liver cancer.
Ethics statements
Patient consent for publication
Ethics approval
Informed consent was not required from patients to obtain data from the US SEER database since cancer is publicly reportable in every state in the USA.
References
Footnotes
Contributors FL wrote and revised the manuscript; TZ conducted most of the analysis of data; XG reviewed the manuscript. XG had full access to all of the data in the study, took responsibility for the conduct of the study, the integrity of the data and the accuracy of the data analysis, and controlled the decision to publish. All authors read and approved the final manuscript.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review Not commissioned; externally peer reviewed.