Abstract
Purpose Electronic health records (EHR) passively generate large datasets on real-world patient populations in easily retrievable form, allowing the cost-efficient and timely execution of epidemiological cohorts on a broad array of topics. However, EHR-based cohorts specialising in mental disorders have not yet been reported. Ningbo has made significant achievements in healthcare data management in China. Drawing on the Ningbo Mental Health Information System and the Ningbo Regional Health Information Platform, this study established the Ningbo Schizophrenia Cohort (NSC), which may serve as a model for future cohort studies on schizophrenia.
Participants This population-based ambispective cohort study included patients with schizophrenia aged 18–65 years at the time of diagnosis who were eligible for healthcare services in Ningbo, China. Participants were identified using the Ningbo Mental Health Information System between 1 January 2010, and 31 December 2023. Once an individual enters the NSC, they are followed up continuously until death or relocation. A total of 26 899 patients with schizophrenia are included in the NSC.
Findings to date Among 26 899 patients, 55.4% were female and 53.1% had less than 7 years of education. Until 31 December 2023, 4505 deaths occurred, and 97.83% of patients had at least one electronic medical record. The median age at diagnosis for non-survivors (median (IQR): 40 (29–51) years) was higher than that of survivors (median (IQR): 34 (26–45) years).
Future plans The NSC will continue to collect longitudinal data to capture the full life cycle of schizophrenia, including pre-onset, diagnosis, follow-up, recovery or death. This will result in a continuous, complete and multidimensional EHR for patients with schizophrenia. Planned future research aims to generate new real-world evidence on the aetiology of schizophrenia, investigate comorbidities to facilitate co-management and develop predictive models for schizophrenia and related cardiovascular diseases.
Trial registration number NCT06370793.
- Schizophrenia & psychotic disorders
- PUBLIC HEALTH
- MENTAL HEALTH
Data availability statement
Data may be obtained from a third party and are not publicly available. The NSC was developed by the Affiliated Kangning Hospital of Ningbo University, which was responsible for granting access to and using patient data. Access to data resources requires completion of a proposal that includes the proposed research programme and the primary source of funding. All research must be ethically approved, and the feasibility as well as compliance of the research programme with the data-access policy must be assessed. Interested researchers are encouraged to contact the principal investigator, Professor Liya Liu (liuliya@nbu.edu.cn).
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
STRENGTHS AND LIMITATIONS OF THIS STUDY
Ningbo Schizophrenia Cohort (NSC) provides ambispective, observational and multicentre cohort data collected from a population of patients with schizophrenia.
NSC is fully linked to electronic health records, and its study design involves collecting data across the full life cycle of patients.
NSC has a large sample size and broad coverage, collecting data from most patients with schizophrenia in Ningbo.
NSC may be subject to information bias.
Introduction
The 2021 Global Burden of Diseases, Injuries and Risk Factors Study revealed that 13.9% of the global population has experienced mental disorders, which are among the top 10 causes of the global disease burden.1 The estimated number of cases of mental disorders rose from 654.8 million in 1990 to 1.1 billion in 2021, exhibiting a 68% increase.2 3 Among non-communicable diseases (NCDs), mental disorders account for 28% of disability-adjusted life years and have a greater impact on health than cardiovascular disease (CVD) or cancer.4 5 Schizophrenia is a chronic disorder characterised by positive symptoms (hallucinations and delusions), negative symptoms (blunted emotional responses and impaired motivation) and cognitive impairments.6 The cause of the disease remains unclear. About 23.2 million people worldwide have schizophrenia, corresponding to a prevalence of about 0.28%.7 In China, the weighted lifetime prevalence of schizophrenia is 0.7%.8 Patients with schizophrenia have a poor prognosis and their mortality rate is 2–3 times higher than that of the general population.9 Early diagnosis and intervention are important for improving prognosis,10 and both urgently need to be strengthened.
While classic prospective cohort studies, such as the Framingham Heart Study,11 have pioneered modern aetiological research, building and maintaining a large cohort is costly and challenging.12 Electronic health records (EHR) provide an important source of real-world data that can address various research questions, including disease aetiology, characterisation and prognosis evaluation.13 The extensive data within EHR systems offer a cost-effective and rich resource of structured and unstructured healthcare information.14 15 EHR-based cohorts have been established to cover a wide range of research areas on NCDs, including chronic kidney disease,16 chronic cough17 and CVD.18 19 However, no EHR-based specialty cohort has yet been reported for schizophrenia, despite its importance as a chronic disease. Over the past 3 years, EHR-based studies of mental disorders, including risk factor identification and diagnostic/prognostic modelling, have been reported with gradually increasing frequency.20 21 This indicates the feasibility and validity of generating high-quality evidence in the field of mental disorders research based on EHR. Therefore, we use the strengths of the Ningbo Mental Health Information System (NMHIS) and Ningbo Regional Health Information Platform (NRHIP) to conduct a cohort study of patients with schizophrenia in China.
The Ningbo Schizophrenia Cohort (NSC) is a specialised EHR-based cohort of patients with schizophrenia that can provide real-world evidence to infer aetiology and enhance patient prognosis. It is also useful for improving mental health services and enhancing patients’ quality of life by obtaining information such as health service utilisation. The utilisation of NSC includes but is not limited to (1) understanding the epidemiological characteristics of schizophrenia, (2) inferring the aetiology of schizophrenia, (3) improving the prognosis of patients with schizophrenia and decreasing the risk of disease recurrence or readmission, (4) conducting comorbidity studies of schizophrenia and (5) providing guidance on risk stratification for serious comorbidity in patients with schizophrenia.
Cohort description
Study area
Ningbo is located in the southern part of China’s Yangtze River Delta, covering an area of approximately 9816 km². It features a mild and humid subtropical monsoon climate with four distinct seasons. In 2024, the gross domestic product of Ningbo, with a total resident population of 9 777 000 individuals, reached ¥1.81 trillion.22 23 The city has 225 medical institutions, including 8 tertiary hospitals, 52 secondary hospitals and 165 primary healthcare institutions. Secondary and tertiary hospitals primarily deliver diagnostic services, outpatient care and inpatient treatment, while primary healthcare institutions, such as community health centres and township clinics, focus on continuity of follow-up care.
Ningbo was selected for this study for two primary reasons: (1) Ningbo offers a comprehensive range of mental health primary healthcare institutions with a diverse array of providers and a sufficient number of qualified personnel, as shown in figure 1. The city boasts nearly 2000 allied primary mental healthcare providers. The regional distribution of primary mental healthcare providers in Ningbo is presented in online supplemental SFigure 1. (2) Ningbo has a highly developed healthcare information platform. This includes the (a) NRHIP, which ranked sixth in the National Health Information Technology Development Index in 2021,24 with a capacity to store 35 billion pieces of residents’ medical information, including 1042 e-forms. It realises region-wide coverage, population-wide services and full life cycle management. The platform also includes (b) NMHIS, which records nearly 40 000 patients with schizophrenia of all ages, as illustrated in figure 1. The system interface specification issued by the Health Commission of Ningbo guarantees that the NMHIS and NRHIP are interconnected, forming a continuous, complete and multidimensional EHR of patients with schizophrenia and encompassing several indicators such as diagnosis, medication, laboratory tests and imaging studies (eg, echocardiography). This provides a high-quality source of data with regard to the construction of NSC.
Figure 1. Regional distribution of patients with schizophrenia in Ningbo and the location of mental health service organisations.
Study design
NSC is a population-based, ambispective cohort study that includes individuals with schizophrenia in Ningbo. The follow-up for this cohort began on 1 January 2010, to bypass the initial trial period and NRHIP integration phase, as demonstrated in figure 2. Clinical coding in this study was conducted by using the WHO International Classification of Diseases 10th Edition (ICD-10).25 Once an individual enters the NSC, they are followed up continuously until death or relocation.
Figure 2. Lexis diagram of Ningbo Schizophrenia Cohort follow-up coverage time.
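The follow-up rule described above (continuous follow-up from cohort entry until death or relocation) can be illustrated with a minimal person-time sketch. This is not the authors' code: the function name `follow_up_years`, the truncation of entry dates to 1 January 2010, the administrative cut-off of 31 December 2023 and the 365.25-day year are all assumptions made for illustration.

```python
from datetime import date
from typing import Optional

# Assumed boundaries: start of NSC follow-up and the data cut-off cited in the text.
COHORT_START = date(2010, 1, 1)
DATA_CUTOFF = date(2023, 12, 31)


def follow_up_years(entry: date,
                    death: Optional[date] = None,
                    relocation: Optional[date] = None,
                    cutoff: date = DATA_CUTOFF) -> float:
    """Person-years from cohort entry to the earliest of death, relocation or cut-off."""
    # Follow-up cannot start before the cohort start date (assumption).
    start = max(entry, COHORT_START)
    # Censor at whichever end-point occurs first.
    end = min(d for d in (death, relocation, cutoff) if d is not None)
    # Guard against end-points recorded before entry.
    days = max((end - start).days, 0)
    return days / 365.25


if __name__ == "__main__":
    # A patient entering on 1 January 2012 who died on 1 January 2014.
    print(round(follow_up_years(date(2012, 1, 1), death=date(2014, 1, 1)), 2))
```

Summing this quantity over patients gives the denominator for mortality or incidence rates; an analysis plan may choose a different time origin (for example, date of diagnosis).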
Patients were eligible if they were aged 18–65 years at diagnosis and had a diagnosis of schizophrenia confirmed by at least two independent visit records in the EHR. Additionally, they were required to provide signed informed consent, either personally or through their legal representatives, and possess complete EHR data, including date of birth, gender and a valid medical identifier. Subjects meeting any of the following criteria were not eligible to participate: (1) history of severe neurological diseases, including epilepsy, encephalitis or brain tumours or (2) history of severe head trauma resulting in unconsciousness, to exclude trauma-induced psychotic symptoms. The exclusion criteria with corresponding ICD-10 code details are provided in online supplemental S table 1.
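The eligibility rules above can be expressed as a simple screening function. The sketch below is illustrative only: the `Candidate` record and its field names are hypothetical, the age bounds are assumed to be inclusive, and the exclusion diagnoses are reduced to a single pre-computed flag rather than the ICD-10 code list in the supplement.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Candidate:
    """Hypothetical screening record; field names are assumptions."""
    birth_date: Optional[date]
    gender: Optional[str]
    medical_id: Optional[str]
    diagnosis_date: date
    schizophrenia_visits: int  # independent EHR visit records with the diagnosis
    consent_signed: bool       # by the patient or a legal representative
    has_exclusion_code: bool   # severe neurological disease or severe head trauma


def age_at(birth: date, when: date) -> int:
    """Age in completed years on a given date."""
    years = when.year - birth.year
    if (when.month, when.day) < (birth.month, birth.day):
        years -= 1
    return years


def is_eligible(c: Candidate) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    # Complete EHR data: date of birth, gender and a valid medical identifier.
    if c.birth_date is None or not c.gender or not c.medical_id:
        return False
    if not c.consent_signed or c.has_exclusion_code:
        return False
    # Diagnosis confirmed by at least two independent visit records.
    if c.schizophrenia_visits < 2:
        return False
    # Aged 18-65 years at diagnosis (bounds assumed inclusive).
    return 18 <= age_at(c.birth_date, c.diagnosis_date) <= 65
```

Keeping each criterion as a separate check makes it straightforward to count how many candidates are excluded at each step when producing a flow diagram.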
Participants underwent a semi-structured interview to collect data on sociodemographic characteristics, comorbid mental disorders, previous psychiatric treatments, family history of psychiatric illness and childhood trauma at baseline. Schizophrenia diagnoses were confirmed using the Structured Clinical Interview for DSM-5 Disorders.26 Psychopathological assessment included the Positive and Negative Syndrome Scale and Brief Psychiatric Rating Scale to evaluate the severity of psychotic symptoms.27 28 The Clinical Global Impression was used to assess overall disease severity,29 while the Global Assessment Scale measured overall functional status.30 Additionally, the Social Disability Screening Schedule was used to assess social functioning deficits in individuals with psychiatric disorders.31 Risk assessment includes a history of violent, assaultive and aggressive behaviours, while a history of suicide attempts is recorded separately. If the patient is hospitalised for a long period, the Nurses’ Observation Scale for Inpatient Evaluation is used to assess their behaviour and social functioning in a psychiatric inpatient setting.32 Recovery is typically defined as an improvement that encompasses both the symptom and functional levels, sustained over an extended period, usually at least 2 years.33 34
Data sources
NSC data are primarily derived from the NMHIS, as patients with schizophrenia are enrolled in this system by clinicians during their first visit and assigned a unique identifier. This identifier links information across multiple forms in the NMHIS, such as the patient information, patient specific file, and follow-up forms. This linkage enables access to key information on mental health variables necessary for the NSC, such as sociodemographic data, pre-existing mental disorders, risk factors for schizophrenia, psychological status, and symptoms and severity.
Furthermore, the NMHIS is linked to the NRHIP. By 31 December 2023, 97.83% of patients with schizophrenia with unique identifiers had at least one electronic medical record in the NRHIP, indicating that they had received at least one clinical service at either a hospital or a primary healthcare institution. The NRHIP supplements this information with additional key variables, such as outpatient or inpatient records and medication. The NSC leverages the strengths of both the NRHIP and NMHIS platforms, using data from the full life cycle of a patient—from pre-onset to diagnosis, follow-up, death or recovery. This comprehensive approach facilitates effective full life cycle management of patients, as illustrated in figure 3.
Figure 3. Full life-cycle management of patients in the Ningbo Schizophrenia Cohort from pre-onset to death. *Variables were measured at hospitalisation. BPRS, Brief Psychiatric Rating Scale; CGI, Clinical Global Impression; GAS, Global Assessment Scale; NOSIE, Nurses’ Observation Scale for Inpatient Evaluation; PANSS, Positive and Negative Syndrome Scale; SDSS, Social Disability Screening Schedule.
Findings to date
A total of 26 899 patients with schizophrenia were included in the NSC. As of 31 December 2023, there were 4505 deaths in the NSC. The median age at diagnosis was 35 years, and the age at diagnosis of the non-survivors (median (IQR): 40 (29–51) years) was significantly higher than that of the survivors (median (IQR): 34 (26–45) years). Females accounted for 55.4% of patients, and educational attainment was generally low, with 53.1% of patients having less than 7 years of education. The occupational distribution was dominated by farmers and those who were laid off or unemployed, who together accounted for 70.5%; unmarried, widowed and divorced patients accounted for 39.5%. A total of 1503 patients (5.6%) had a family history, and the mode of onset was most often acute. A detailed description is given in table 1.
Table 1. Baseline characteristics of the Ningbo Schizophrenia Cohort
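Summaries of the form "median (IQR)" by vital status, as reported above, can be computed as in the following minimal sketch. The function names and the input format (pairs of age at diagnosis and a death flag) are assumptions, the quartile method ("inclusive") is one of several conventions and may differ from the one used by the authors, and any ages passed in for demonstration are synthetic rather than NSC data.

```python
from statistics import median, quantiles


def median_iqr(ages):
    """Return (median, first quartile, third quartile) for a list of ages."""
    q1, _, q3 = quantiles(ages, n=4, method="inclusive")
    return median(ages), q1, q3


def summarise_by_status(records):
    """Group (age_at_diagnosis, died) pairs by vital status and summarise each group."""
    groups = {}
    for age, died in records:
        key = "non-survivor" if died else "survivor"
        groups.setdefault(key, []).append(age)
    return {key: median_iqr(ages) for key, ages in groups.items()}
```

A formal comparison between the two groups would additionally need a statistical test, which is outside the scope of this sketch.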
Long-term follow-up
The NSC follow-up adopts a combination of active and passive mechanisms through face-to-face and outpatient follow-ups. After patients sign the informed consent form, the NMHIS sends patients’ information to the primary healthcare institutions in their places of residence. The primary healthcare institutions then provide the patients with long-term follow-up and rehabilitation guidance. The initial face-to-face follow-up visit, during which informed consent is obtained, occurs within 10 working days. Each follow-up visit updates information on the measured factors, resulting in abundant longitudinal data. Meanwhile, the NRHIP will continue to collect information from patients’ EHR. The dynamic measurements of key variables during follow-up in the NSC group are presented in table 2.
Table 2. Dynamic measurement of key variables for the Ningbo Schizophrenia Cohort follow-up
Data management
Quality control covers the entire data collection, processing and usage processes. At the data-entry stage, a module for collecting information on patients with schizophrenia is embedded in the hospital information system of the medical institution so that the subsequent step in diagnostic and treatment activities cannot be conducted without filling in the information. This ensures that each patient’s information is collected. The NMHIS is equipped with logical checking functions to ensure data validity, standardise data and reduce processing work, including data type conversion. During the data-processing phase, multiple identification codes are stored in the NSC, including the civil identification number, case number, outpatient number, inpatient number and unique identifier code used internally in the NMHIS. Links within the NMHIS forms are primarily through unique identifier codes, while links to other platforms are primarily through civil identification numbers. If the civil identification number is missing and cannot be linked, an approximate string match is conducted using other information, such as the date of birth, address, outpatient number and inpatient number. Wonder Information Corporation maintains and manages the data to ensure their integrity and availability. All personal information of NSC patients is anonymised during the data-utilisation period.
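The linkage logic described above (exact linkage on the civil identification number, with an approximate string match on other fields when that number is missing) might look like the following minimal sketch. It is not the NSC implementation: the dictionary keys, the 0.9 threshold, the averaging of per-field similarities and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional

# Fallback fields named in the text; the key names themselves are hypothetical.
FALLBACK_FIELDS = ("birth_date", "address", "outpatient_no", "inpatient_no")


def similarity(a: str, b: str) -> float:
    """Similarity ratio between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def link_record(patient: Dict, candidates: List[Dict],
                threshold: float = 0.9) -> Optional[Dict]:
    """Return the candidate record judged to belong to the patient, or None."""
    civil_id = patient.get("civil_id")
    if civil_id:
        # Deterministic linkage: exact match on the civil identification number.
        for cand in candidates:
            if cand.get("civil_id") == civil_id:
                return cand
        return None
    # Approximate linkage: average similarity over fields present in both records.
    best, best_score = None, 0.0
    for cand in candidates:
        shared = [f for f in FALLBACK_FIELDS if patient.get(f) and cand.get(f)]
        if not shared:
            continue
        score = sum(similarity(patient[f], cand[f]) for f in shared) / len(shared)
        if score > best_score:
            best, best_score = cand, score
    # Accept the best candidate only if it clears the (assumed) threshold.
    return best if best_score >= threshold else None
```

In practice the threshold would need to be calibrated against manually reviewed pairs, since a value that is too low risks false links and one that is too high leaves records unlinked.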
Data utilisation
Seeking new real-world evidence to infer the aetiology of schizophrenia
In cohort studies wherein exposure precedes and outcomes follow, the temporal sequence of causality between the exposure and outcome is an important criterion for determining causality. The NSC provides information on factors such as health service utilisation and environmental exposure across the patient’s full life cycle, including pre-onset; therefore, the multidimensional pre-onset environmental and health service utilisation information obtained by the NSC may provide new real-world evidence for aetiological inferences of schizophrenia.
Examining the comorbidity of schizophrenia to assist with managing comorbidities
The comorbidity rate of individuals with mental disorders exceeds 35%35 as a result of poor access to health services and the side effects of antipsychotic drugs. Single-disease management cannot minimise disruptive medicine, which wastes medical resources and is not conducive to patient rehabilitation; therefore, comorbidity management is imperative. The NSC provides the times of initial and comorbidity diagnoses. Therefore, the NSC may provide real-world data for the identification of comorbidity clusters and the analysis of comorbidity trajectories in patients with late-stage schizophrenia.
Constructing a predictive model for schizophrenia and subsequent CVD
CVD is a serious comorbidity of schizophrenia, and its high prevalence is the primary reason for the reduced life expectancy of these patients.36 However, existing CVD risk-prediction models, such as the Framingham Risk Score,37–39 are not suitable for risk prediction in patients with schizophrenia as they do not consider specific risk predictors such as anti-psychotic drug utilisation. The NSC contains information on predictors from the Prediction for Atherosclerotic CVD Risk in China40; WHO CVD risk charts41; and specific predictors such as anti-psychotic drug utilisation, MRI and cardiac auxiliary examination. Therefore, the NSC contains a database for developing predictive models for schizophrenia and subsequent CVD.
Identifying the best interventions available for patients with schizophrenia
Currently, the most important treatment for schizophrenia is the administration of antipsychotic medication, but medication may produce side effects and have little effect on negative symptoms.42 In this scenario, psychosocial interventions (eg, family interventions, family psychoeducation and cognitive-behavioural therapies) play an important role in the rehabilitation of schizophrenia patients43 and are an important means of supplementing medication. The applicable intervention methods vary due to differences in the mode of onset of the disease in patients and their economic status; therefore, we hope to identify the best intervention methods for patients with different types of schizophrenia by comparing the patient efficacy of the different methods using the data resources of the NSC.
Discussion
The NSC is an ambispective, observational, multicentre cohort study focused on patients with schizophrenia. This study covers the full life cycle of patients, accumulating rich epidemiological and clinical data. The NSC aims to provide a database for future research on schizophrenia, determine the burden of disease and risk factors, improve prognosis and reduce the risk of relapse. Notably, the median age of onset of schizophrenia is generally early adulthood,44 45 whereas in the NSC, the median age of diagnosis is 34 years, as the diagnostic process for schizophrenia in Ningbo may require up to 10 years or more. This phenomenon may be influenced by a variety of factors, including the inadequacy of social support systems and social and economic factors. This is in addition to the cultural perceptions, accessibility of healthcare resources and level of public awareness of mental health issues.
Strengths and limitations
The key strength of the NSC lies in its foundation as an EHR-based cohort, enabling data collection throughout patients’ full life cycles—from pre-onset and diagnosis to follow-up, recovery or death. This design provides a broad range of study factors and outcomes. Given that mental disorders are significant NCDs, the repeated quantitative characteristics obtained from follow-up allow for the assessment of treatment effects. The longitudinal data, with their temporal sequence, enable a comprehensive evaluation of risk factors for disease development and various factors related to prognosis, facilitating aetiological research on schizophrenia.
Additionally, the NSC collects diverse, multidimensional data (eg, laboratory tests and imaging studies) by accessing electronic medical records. This valuable resource assists in identifying whether functional abnormalities of other organs, pathological lesions or metabolic abnormalities are associated with schizophrenia. Therefore, numerous variables and outcomes can be extracted while conducting a real-world study.
Large sample size and broad coverage are another major advantage of the NSC. The NMHIS is embedded in the hospital information system of all psychiatric hospitals in Ningbo, and patients diagnosed with schizophrenia are included in the cohort after providing informed consent within the hospital setting. This makes this study less susceptible to selection bias, ensuring the accuracy and representativeness of the study cohort.
This study has some limitations. First, the NSC is a real-world research cohort with observational and non-randomised interventions, and the real-world data to be studied may not cover the entire study population. Second, there is potential information bias in this study, which refers to systematic errors in data collection, recording or classification that could affect the accuracy of study findings.46 Schizophrenia diagnoses in this study were confirmed using the Structured Clinical Interview for DSM-5 Disorders, ensuring diagnostic accuracy. However, medical conditions obtained from the NRHIP were coded using ICD-10, which may introduce misclassification bias in comorbid conditions due to potential coding inaccuracies in routine clinical practice. This is an inherent limitation of using EHR data for research.17 ICD-10 codes can be combined with natural language processing to accurately identify patients.47 Third, perinatal factors and substance use have been identified as potential risk factors for schizophrenia48–50; however, due to privacy and ethical review constraints, such information was not included in this study. Future research could explore these aspects further by integrating maternal and child health databases, implementing drug screening or conducting long-term cohort studies. Finally, the exclusion of organic diseases is crucial in schizophrenia research. In this study, we excluded only conditions that would significantly impact schizophrenia diagnoses. To ensure data completeness, we implemented a data flagging strategy for patients newly diagnosed with neurological disorders (eg, Niemann-Pick disease, Huntington’s disease, Wilson’s disease, Dandy-Walker syndrome and Alzheimer’s disease) during follow-up. These cases are flagged in the NSC, allowing future researchers to determine their inclusion based on specific study objectives.
Ethics statements
Patient consent for publication
Ethics approval
This study involves human participants and was approved by the Medical Ethics Committee of Ningbo University (NBU-2024-011) and registered at ClinicalTrials.gov (NCT06370793). All patients voluntarily provided signed informed consent, either personally or through their guardians, prior to being included in the cohort. Data were held on a dedicated server that could not be accessed without approval, and confidentiality agreements were signed by the researchers prior to accessing the data.
Acknowledgments
The researchers thank all the patients who participated in the NSC.
Footnotes
Contributors He Ye contributed significantly to the study design and drafted the manuscript; Yang Zhao contributed to the study conceptualisation and design; Lian Li, Yi Qian and Hangjie Zhu contributed to data collection and collation; Guolin Bian and Liya Liu contributed significantly to the conceptualisation of the study, and both contributed equally to the guidance and supervision of this research. All authors revised the manuscript. Liya Liu serves as the guarantor of this study, ensuring the accuracy and integrity of the work.
Funding This study was funded by the National Natural Science Foundation of China (82373690, 81602940), National Key Research and Development Programme of China (2022YFC3702702), Natural Science Foundation of Zhejiang Province (ZCLY24H2601), One Health Interdisciplinary Research Project, Ningbo University, State Key Laboratory of Reproductive Medicine, Nanjing Medical University (SKLRMK202104), Key Laboratory of Modern Toxicology of Ministry of Education, Nanjing Medical University and Ningbo Medical & Health Leading Academic Discipline Project (2022-F28).
Map disclaimer The depiction of boundaries on this map does not imply the expression of any opinion whatsoever on the part of BMJ (or any member of its group) concerning the legal status of any country, territory, jurisdiction or area or of its authorities. This map is provided without any warranty of any kind, either express or implied.
Competing interests None declared.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.