Abstract
Introduction Non-response significantly undermines the representativeness of patient-reported outcome (PRO) data, thereby compromising its utility for facilitating high-value, equitable, patient-centred care in cancer clinics. Quality improvement studies are needed to assess the representativeness of PRO data collected in routine care, identify the underlying causes of non-response and develop novel methods to ensure data representativeness. Using a multilevel framework and a mixed-methods approach, we have three aims: (1) characterise the non-response of the Global-10 across clinic, provider and patient levels; (2) identify multilevel causes of non-response and potential strategies to improve representativeness in PRO collection; and (3) develop effective modifications to missing-data methods to enhance the representativeness of pre-existing PRO data.
Methods and analysis Our primary data source is the Patient Reported Outcomes Measurement Information System Global-10, collected as part of routine care at the Radiation Oncology clinics within the Mass General Brigham (MGB) healthcare system. Other sources include (1) Harvard Catalyst for provider-specific data, (2) MGB administrative data, (3) public Centers for Medicare & Medicaid Services data and (4) the National Plan and Provider Enumeration System. We will conduct quantitative analyses to assess variations in Global-10 non-response across multilevel factors. Additionally, we will use qualitative interviews with patients and clinical professionals to understand the causes of non-response and to formulate strategies to expand the reach of PRO collection to underrepresented cancer patients, improve their completions and enhance overall data representativeness. Finally, we will integrate implementation science knowledge and findings from the first two aims into missing-data methods to manage non-response in the pre-existing Global-10 data and to evaluate their performance in preserving representativeness.
Ethics and dissemination The study protocol was reviewed and approved by the Institutional Review Board at the Dana-Farber/Harvard Cancer Center (24-225). Written informed consent will be obtained from participants. Study findings will be disseminated through peer-reviewed publications and presentations at national and international conferences.
- Patient Reported Outcome Measures
- Radiation oncology
- Implementation Science
- Surveys and Questionnaires
- STATISTICS & RESEARCH METHODS
- QUALITATIVE RESEARCH
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
STRENGTHS AND LIMITATIONS OF THIS STUDY
This project will dissect non-response using a multilevel framework which has been widely applied in dissemination and implementation science but not yet commonly applied to non-response research.
We will examine patient-reported outcome (PRO) data, collected as part of routine care at radiation oncology clinics within the Mass General Brigham healthcare system in Massachusetts, USA.
Our project introduces methodological innovation by integrating implementation knowledge from PRO data collection experiences into missing-data methods.
This is a cross-sectional study, so causal relationships cannot be determined.
Introduction
Patient-reported outcome (PRO) measures, or PROMs, are powerful tools in cancer care to enhance clinician-patient communication, identify problematic symptoms and treatment priorities, facilitate shared decision-making and prolong survival.1 2 Aggregated PRO data can also generate patient-centred real-world evidence to inform clinical practice and are increasingly incorporated into national public reporting and value-based healthcare initiatives as measures of care quality.3 4 Regardless of their purpose, PRO data can exist only when patients participate, and non-response emerges as a significant challenge to data representativeness.
Compared with clinical trials, collecting PROMs in clinical care involves large-scale implementation and is affected by the absence of standardisation, limited resources, variable leadership buy-in and diverse patient populations, resulting in significant heterogeneity in the collected data.5 6 Furthermore, studies have shown that minoritised groups, such as racial/ethnic minorities and non-English speakers, are less likely to complete PROMs.7–12 Using PRO data that do not represent the patient population to generate evidence or evaluate care delivery threatens the validity of such efforts and can worsen healthcare disparities.13–16 For PROMs to continue to promote high-quality and equitable patient-centred cancer care, it is crucial to identify the underlying causes of non-response and to develop novel methods to ensure data representativeness.
Research into the implementation of PROMs in clinical care has revealed many patient-, provider- and clinic-level factors that can affect PROM completion rates, such as provider engagement and infrastructure support.17–21 Although identifying the barriers and facilitators to PROMs collection is important, higher collection rates do not necessarily imply better data representativeness. Strategies aimed at increasing collection may differ from those designed to enhance representativeness. Yet most current research focuses on methods to increase collection rather than methods to enhance PRO data representativeness. Multilevel frameworks can help not only to characterise non-response but also to identify determinants that can improve the representativeness of routinely collected PRO data.22–25 These structured frameworks can simultaneously be used to develop mitigation strategies that target the identified determinants, thereby addressing PRO data representativeness efficiently and effectively.22–25
The overarching goal of this project is to delineate potential causes of non-response and to devise mitigation strategies to improve the representativeness of PRO data collected as part of routine radiation oncology care. To achieve this, we will examine Patient Reported Outcomes Measurement Information System (PROMIS) Global-10 (Global-10) data, collected as part of routine care at the radiation oncology clinics within the Mass General Brigham (MGB) healthcare system, a large, integrated health system in Massachusetts. Using a multilevel framework and a mixed-methods approach, this project aims to:
Aim 1: characterise the non-response of the Global-10, collected as part of routine care, across clinic, provider, and patient levels.
Aim 2: identify multilevel causes of non-response and potential strategies to improve representativeness in PROMs collection.
Aim 3: develop effective modifications to missing-data methods to enhance the representativeness of pre-existing PRO data.
Methods and analysis
Project overview
The approach consists of a multilevel framework derived from the Consolidated Framework for Implementation Research (CFIR) and a mixed-methods strategy. Figure 1 displays the project overview. In aim 1, we will assess variations in non-response of Global-10 across multilevel factors to identify key areas of concern and organise our findings. In aim 2, we will use qualitative interviews to understand the causes of non-response across the key areas identified from aim 1 and to formulate strategies that can expand the reach of PROMs collection to underrepresented cancer patients, improve their PROMs completions and enhance overall PRO data representativeness. In aim 3, we will incorporate implementation science knowledge and findings from the first two aims into missing-data methods to manage non-response of the pre-existing Global-10 data and to evaluate their performance in preserving representativeness. In the final stages of aim 3, we will compare the results yielded from different missing-data methods. This comparison will not only validate the robustness of our findings but also highlight the most effective modifications to missing-data methods to protect representativeness in PRO data with non-response.
Figure 1. Project overview. PRO, patient-reported outcome; PROMs, PRO measures.
Conceptual framework
To ensure thorough investigations into non-response and to facilitate the development of actionable mitigation strategies, this project draws from the CFIR, which has been used to guide the systematic evaluation of determinants in the design and implementation of PROMs collection in clinical care.23 The CFIR encompasses five major domains: intervention characteristics, outer setting (eg, patient needs, external policies), inner setting (eg, institute characteristics, implementation climate and accessible information and technology), characteristics of the individuals involved and implementation process.23–25 Our project focuses on the inner setting and characteristics of the individuals involved (ie, patients and providers) domains. We perceive the inner setting to include factors at the clinic level, such as facility characteristics and implementation climate, which are associated with leadership engagement, available resources and accessible knowledge essential for PROMs collection in diverse populations and problem-solving.20 21 Providers’ clinical workload and annual PROMs collection affect their willingness and ability to collect PROMs, address non-response and reach minority groups.19 21 Patients’ demographic characteristics, socioeconomic status and health status influence their motivation to complete PROMs.7–11 The multilevel framework will guide the design of quantitative analyses (aims 1 and 3) to uncover the landscape of non-response and aid in the development of qualitative interview guides (aim 2) to solicit knowledge on improving representativeness in data collection across clinic, provider and patient levels.
Study setting
We will use Global-10 data collected as part of routine care in MGB radiation oncology clinics. MGB is an integrated health system comprising two academic medical centres as well as eight community hospitals and a large network of community-based physician office practices. In 2012, MGB initiated a standardised PROMs collection programme for routine clinical specialty care, offering participating physician practices and clinics the necessary technical infrastructure, personnel training, continuous education and operational resources for successful implementation.20 26 27 To date, the MGB PROMs programme accrues more than 6 million PROMs per year across more than 475 clinics from more than 80 medical, behavioural health and surgical specialties, representing the largest repository of multispecialty PRO data in the USA.
The MGB radiation oncology practices uniformly implemented PROMs collection across all clinic sites and remain active today. Because radiation therapy is a fundamental component of cancer care across nearly all cancer types for curative or palliative purposes, radiation oncology serves as a prime specialty from which to conduct our analyses to yield generalisable results applicable across the cancer care continuum. Furthermore, MGB radiation oncology clinics are in different settings (eg, academic vs community) and geographical locations and serve a diverse patient population, allowing for better capture of variations in sex, race/ethnicity and socioeconomic status.
Data sources
Multiple data sources will be merged to conduct this study (table 1).28–30 Our primary data source is the MGB enterprise data warehouse (EDW), a system-wide data resource that contains harmonised medical, billing, claims and financial data, including PROMs data.
Table 1. Proposed multilevel factors for the domains of inner setting and individual characteristics
Aim 1: characterise the non-response of the Global-10 across clinic, provider and patient levels
Study design and data analysis
Preliminary data show Global-10 completions from a total of 34 516 patients with 68 933 submissions, representing 70% of the individuals who were assigned the Global-10 in the MGB radiation oncology clinics since the start of data collection in January 2015, with the lowest rate during the COVID-19 pandemic. Given that all patients are assigned the Global-10 as part of their initial visit to radiation oncology clinics, we will conduct a cross-sectional study focusing on the characterisation of Global-10 non-response. Additionally, due to the probable differences in non-response mechanisms between children and adults, this proposal will focus only on adult patients (those aged 18 years or older). In the MGB radiation oncology clinics, patients must complete all assigned items (ie, 10 items on Global-10) to finalise their submissions and obtain the summary scores (ie, global physical and mental health).31 Therefore, we will classify each assigned Global-10 as either a response or a non-response. A response is defined as completion of all 10 items on the Global-10; a non-response is defined as at least one of the 10 items being left unanswered. Within each clinic, we will calculate the overall frequencies and percentages of respondents and non-respondents. Furthermore, we will describe the distributions of respondents and non-respondents by factors at patient and provider levels for the overall sample and by each clinic, with a specific focus on underrepresented populations such as racial/ethnic minorities, older adults, less-educated individuals and non-English speakers (table 1).
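The classification rule and the per-clinic tabulation described above can be sketched as follows. This is a minimal illustration, not the study's analysis code: the record layout (a clinic identifier paired with a list of 10 item answers, with None marking an unanswered item) and all function names are assumptions made for the example.

```python
from collections import defaultdict

N_ITEMS = 10  # the Global-10 has 10 items


def is_response(item_answers):
    """Return True only if all 10 items were answered (None marks a skipped item)."""
    return len(item_answers) == N_ITEMS and all(a is not None for a in item_answers)


def response_summary_by_clinic(records):
    """Count respondents and non-respondents per clinic and report percentages.

    `records` is an iterable of (clinic_id, item_answers) pairs. Both field
    names are hypothetical stand-ins for fields extracted from the warehouse.
    """
    counts = defaultdict(lambda: {"response": 0, "non_response": 0})
    for clinic_id, item_answers in records:
        key = "response" if is_response(item_answers) else "non_response"
        counts[clinic_id][key] += 1

    summary = {}
    for clinic_id, c in counts.items():
        total = c["response"] + c["non_response"]
        summary[clinic_id] = {
            "n_assigned": total,
            "n_response": c["response"],
            "n_non_response": c["non_response"],
            "pct_response": 100.0 * c["response"] / total,
        }
    return summary


if __name__ == "__main__":
    # Toy data only; these are not study results.
    toy = [
        ("clinic_A", [3] * 10),
        ("clinic_A", [3] * 9 + [None]),
        ("clinic_B", [4] * 10),
        ("clinic_B", [None] * 10),
    ]
    for clinic, row in sorted(response_summary_by_clinic(toy).items()):
        print(clinic, row)
```

The same tabulation could be repeated within strata of patient-level or provider-level factors by extending the grouping key beyond the clinic identifier.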
We will identify the predictors of non-response. Our primary hypothesis is that there are significant relationships between non-response and patients’ demographic characteristics (eg, being a racial/ethnic minority, being of an older age, being a non-English speaker and having a low income), providers’ data collection proficiency and the history of the PROMs programme in the clinics.7–11 18–21 Using bivariate analyses, we will assess the differences in distributions among respondents and non-respondents across multilevel factors. Factors with p values less than 0.20 will enter the multivariable model. For factors outside this threshold, we will refer to existing literature to determine their appropriateness for inclusion in multivariable analyses. We will use multinomial logistic regression and stepwise selection to identify significant predictors of non-response, adjusting for the random effects of clinics and providers.
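As an illustration of the bivariate screening step, the sketch below applies a Pearson chi-square test to a binary factor cross-tabulated against response status and keeps factors with p values below 0.20. It is a simplified example under stated assumptions: the protocol does not name the bivariate tests it will use, the factors here are restricted to two categories so that the p value has a closed form, and the function names and toy counts are hypothetical. Factors that fail the screen would still be reviewed against the literature, as described above.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    a, b = non-respondents and respondents in group 1
    c, d = non-respondents and respondents in group 2
    Returns (chi-square statistic, p value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("each row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def screen_factors(tables, threshold=0.20):
    """Return the names of factors whose bivariate p value is below `threshold`.

    `tables` maps a factor name to its (a, b, c, d) cell counts.
    """
    selected = []
    for name, (a, b, c, d) in tables.items():
        _, p_value = chi_square_2x2(a, b, c, d)
        if p_value < threshold:
            selected.append(name)
    return selected


if __name__ == "__main__":
    # Toy counts only; these are not study results.
    toy_tables = {
        "non_english_speaker": (30, 70, 15, 85),
        "sex": (20, 80, 21, 79),
    }
    print(screen_factors(toy_tables))
```

Factors with more than two categories, or continuous factors, would need a test with more degrees of freedom or a different test altogether, which a statistics library would normally supply.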
Summary
By analysing Global-10 data from the MGB radiation oncology clinics, we will characterise the heterogeneity in non-response of routinely collected PRO data across clinic-, provider- and patient-level factors. This will not only facilitate our understanding of critical areas of concern but also guide our research in developing strategies to improve representativeness for both PROMs collection and PRO data analysis.
Aim 2: identify multilevel causes of non-response and potential strategies to improve representativeness in patient-reported outcome measures (PROMs) collection
Study design and semi-structured interview guides
We will conduct two separate focus groups with clinical professionals (ie, clinic leaders and providers) and approximately twenty-five 1:1 in-depth interviews with cancer patients in radiation oncology clinics. Clinic leaders and providers will be purposively recruited. Eligible clinic leaders include chief-level executives, administrative leaders, PROMs directors and clinician leaders.18 Provider participants must have been actively involved in PROMs collection in the past year. Regarding patients, we will recruit adults aged 18 years or older who are receiving care. We will employ quota sampling to ensure representation across sex, race, ethnicity and other minority groups. Patients will be recruited with the help of radiation oncology providers or through electronic communication. We will exclude patients with documented cognitive impairment or those unable to provide informed consent.
Building on the findings from aim 1 and in accordance with our multilevel framework (figure 2 and table 1), we will develop semistructured interview guides tailored for clinical professionals (ie, clinic leadership and providers) and patients. For clinical professionals, we will explore (1) their perceived issues and causes of non-response and (2) potential strategies to enhance representativeness in data collection. For patients, we will solicit their insights regarding (1) PROMs collection overall, (2) the main challenges they face when deciding whether to complete PROMs and (3) potential solutions from their perspectives (eg, technology, rationale, etc). Both interview guides will be pilot tested with three clinical professionals and three patients before their use.
Figure 2. Multilevel framework of PROMs non-response. PROMs, patient-reported outcome measures.
Data collection and analysis
For clinical professionals, we will collect demographic characteristics, such as age, sex, job titles, years of experience in their respective fields and history of PROMs collection. We will conduct focus groups separately to foster open discussion and gather a rich variety of perspectives. We plan to recruit seven clinic leaders and 10 providers, specifically from clinics exhibiting the lowest and highest rates of non-response. We believe that two focus group interviews, with at least five clinical professionals in each, will achieve information saturation.32
For cancer patients, we will collect demographic characteristics and medical information (eg, cancer stage, treatment type and comorbidities) through the MGB EHR and pre-interview surveys as needed. The research team will conduct 1:1 in-depth interviews with cancer patients to understand their experiences and insights. We estimate that each interview will last 45–60 min, and approximately 25 patient interviews will be sufficient to reach thematic saturation.33 A $50 gift card will be provided to each patient for remuneration.
Quantitative data will be summarised descriptively. All interviews will be professionally transcribed. The research team will analyse qualitative data using the content analysis approach, independently and in conference to facilitate rigour. Discrepancies in coding between team members will be discussed and arbitrated by a third party as necessary to reach consensus. Deductive coding will be used to map the themes to the components of our multilevel framework (figure 2 and table 1) and to strategies addressing the causes of non-response across clinic, provider and patient levels. Inductive coding will be used for new themes. We will use NVivo software for data analysis and management and follow the Consolidated Criteria for Reporting Qualitative Research (COREQ) when drafting results.
Summary
The qualitative study in aim 2 will deepen our understanding of the issues and causes of non-response from both clinical professionals’ and patients’ perspectives to uncover mitigation strategies. Importantly, these strategies can help to address the issue of ‘missing not at random (MNAR)’.34 The MNAR mechanism is the most difficult to address as it assumes that non-response is related to both observed (eg, patients’ characteristics) and unobserved factors.34 Unlike the other types of missing-data mechanisms, which will be evaluated in aim 3, non-response operating under MNAR cannot be fixed with post hoc statistical manoeuvres; uncovering its potential causes during data collection will therefore provide valuable information.
Aim 3: develop effective modifications to missing-data methods to enhance the representativeness of pre-existing patient-reported outcome (PRO) data
Aside from MNAR, the most probable mechanism for non-response of PRO data collected in routine cancer care is ‘missing at random (MAR)’, given the established relationships between non-response and the demographic and clinical characteristics recorded in patients’ medical records. MAR assumes that non-response is related to observed variables.34 Advanced missing-data methods, such as Hot-deck imputation, multiple imputation (MI) and inverse probability weighting (IPW), are recommended for addressing MAR-based non-response.34–37 However, these methods can sometimes result in extreme values or unbalanced weights, making it challenging to fully establish representative data for analysis.38–41 We will seek to improve on these approaches by incorporating multilevel factors based on our multilevel framework. A notable feature of Hot-deck imputation and MI is their ability to incorporate auxiliary variables—those related to outcomes or non-response but not part of the main analyses—to enhance their performance, making them well-suited to modification based on multilevel factors.35–37 42 IPW adjusts for non-response bias using complete cases without imputation.40 Another mechanism for non-response is ‘missing completely at random (MCAR)’, which assumes that non-response is unrelated to any variables.34 MCAR is an unrealistic assumption in routinely collected PRO data as we know non-response bias exists. Therefore, in aim 3, we will explore the solutions to MAR that incorporate the multilevel factors inherently influencing PROMs data collection.
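The three mechanisms can be stated compactly for the case considered here, where a patient either completes the Global-10 or does not. The notation is an assumption introduced for illustration and does not appear in the protocol: R is the response indicator (1 for a response), X denotes observed variables such as patient, provider and clinic factors, and Y denotes the Global-10 scores, which are unobserved for non-respondents.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R = 1 \mid X, Y) = P(R = 1) \\
\text{MAR:}\quad  & P(R = 1 \mid X, Y) = P(R = 1 \mid X) \\
\text{MNAR:}\quad & P(R = 1 \mid X, Y) \neq P(R = 1 \mid X)
\end{align*}
```

Under MAR, conditioning on a sufficiently rich X removes the dependence of response on Y, which is why adding multilevel factors to X is a natural way to modify imputation models and response-propensity weights.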
Study design and data analysis
We will integrate determinants of non-response, derived from (1) our multilevel framework and (2) findings from the first two aims, into the missing data analysis. The goal is to modify missing-data methods to improve their performance in preserving the representativeness of routinely collected PRO data. We will employ six missing-data methods to address the non-response of pre-existing Global-10 data (table 2). Under the MCAR assumption, listwise deletion will be used as a standard approach for comparison, while Hot-deck imputation, MI and IPW will be used for MAR. Specifically, when applying Hot-deck imputation, MI and IPW, multilevel factors of non-response will be integrated into the analyses. This integration enables us to investigate whether the performance of these methods in achieving representativeness can be enhanced. Notably, although Hot-deck imputation, MI and IPW are primarily recommended for non-response under the MAR assumption, they also effectively address non-response under the MCAR assumption, and thus we will examine both assumptions in this proposal for completeness.34 35
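One simple form of IPW is a weighting-class adjustment, in which each respondent is weighted by the inverse of the observed response rate in their cell. The sketch below illustrates how multilevel factors could define those cells. It is an example under stated assumptions rather than the study's planned estimator: the field names (`responded`, `language`, `score`), the choice of cells and the toy values are all hypothetical, and the protocol does not specify how response probabilities will be modelled.

```python
from collections import defaultdict


def weighting_class_ipw(records, class_keys):
    """Inverse probability weights from cell-level response rates.

    `records` is a list of dicts holding a boolean 'responded' flag plus the
    fields named in `class_keys` (for example clinic, provider or patient
    factors). Respondents get n_assigned / n_responded for their cell;
    non-respondents get 0.0. A cell with no respondents cannot be represented
    by weighting and would need to be merged with a neighbouring cell.
    """
    assigned = defaultdict(int)
    responded = defaultdict(int)
    for r in records:
        cell = tuple(r[k] for k in class_keys)
        assigned[cell] += 1
        if r["responded"]:
            responded[cell] += 1

    weights = []
    for r in records:
        cell = tuple(r[k] for k in class_keys)
        if r["responded"]:
            weights.append(assigned[cell] / responded[cell])
        else:
            weights.append(0.0)
    return weights


def ipw_mean(records, weights, value_key):
    """Weighted mean of `value_key` over respondents only."""
    pairs = [(r[value_key], w) for r, w in zip(records, weights) if w > 0]
    total_weight = sum(w for _, w in pairs)
    return sum(v * w for v, w in pairs) / total_weight


if __name__ == "__main__":
    # Toy data only: full response in one language group, 1 in 4 in the other.
    toy = (
        [{"language": "english", "responded": True, "score": 50.0}] * 4
        + [{"language": "other", "responded": True, "score": 30.0}]
        + [{"language": "other", "responded": False, "score": None}] * 3
    )
    w = weighting_class_ipw(toy, ["language"])
    print(ipw_mean(toy, w, "score"))
```

In the toy data the unweighted complete-case mean is 46.0, whereas the weighted mean is 40.0 because the single respondent in the low-response group stands in for four assigned patients. A weight that large is also an instance of the unbalanced-weights problem noted earlier, and finer cells make it more likely.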
Table 2. Missing-data mechanisms and proposed methods
Performance evaluation
We aim to rigorously assess the performance of various missing-data methods to enhance the representativeness of Global-10 data, in both scenarios where information from successful PROMs collection is available and where it is absent. Our assessment will comprise a comprehensive comparative analysis focusing on (1) demographic characteristics, (2) summary scores and (3) predictive validity between the complete cases and the overall sample after adjustment for each missing data technique.31 43–45 For demographic characteristics, we will compare the distributions of sex, age, race, ethnicity, language, education, employment status and financial insurance type. Our goal is to determine whether the sample post-adjustment reflects the demographic distributions observed in the overall population who have been assigned the Global-10.43 45 For physical and mental health summary scores, we will describe and compare their distributions (eg, mean, SD, minimum and maximum) between complete cases and the post-adjustment sample. Favourable missing-data methods should produce summary scores with fewer outliers, smaller SD and lower mean values, especially since we account for the non-response in minority groups and cancer patients with advanced diseases and poor health status.44 46 47 Regarding the predictive validity, we will use logistic regression to evaluate the ability of physical and mental health summary scores to predict healthcare utilisation (eg, urgent care visits, hospitalisations, etc) or death.44 Our hypothesis is that after accounting for the non-response, the adjusted summary scores will have stronger correlations with healthcare utilisation or death than the scores from the complete cases.
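The demographic comparison can be illustrated with a small sketch that computes category shares in the assigned population, in the complete cases and in a post-adjustment (weighted) sample, then reports the largest absolute gap in shares. The gap metric, the function names and the toy values are assumptions chosen for the example; the protocol does not prescribe a specific summary measure.

```python
from collections import defaultdict


def category_shares(categories, weights=None):
    """Return the (optionally weighted) share of each category."""
    if weights is None:
        weights = [1.0] * len(categories)
    totals = defaultdict(float)
    for category, weight in zip(categories, weights):
        totals[category] += weight
    grand_total = sum(totals.values())
    return {category: t / grand_total for category, t in totals.items()}


def max_share_gap(reference, candidate):
    """Largest absolute difference in category shares between two samples."""
    all_categories = set(reference) | set(candidate)
    return max(
        abs(reference.get(c, 0.0) - candidate.get(c, 0.0)) for c in all_categories
    )


if __name__ == "__main__":
    # Toy data only: language of all assigned patients vs complete cases.
    assigned = ["english"] * 4 + ["other"] * 4
    complete_cases = ["english"] * 4 + ["other"]
    hypothetical_weights = [1.0, 1.0, 1.0, 1.0, 4.0]

    target = category_shares(assigned)
    unadjusted = category_shares(complete_cases)
    adjusted = category_shares(complete_cases, hypothetical_weights)
    print(max_share_gap(target, unadjusted), max_share_gap(target, adjusted))
```

A smaller gap after adjustment indicates that the adjusted sample tracks the assigned population more closely on that characteristic; the same check could be repeated for each demographic variable listed above.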
Summary
By developing and examining modifications to statistical methods for handling missing data, aim 3 can identify effective strategies to address non-response and increase representativeness in the analysis of pre-existing, routinely collected PRO data. Given the large volume of such data and significant heterogeneity in its quality, the findings from aim 3 can guide the selection of appropriate statistical approaches and key determinants of non-response to improve representativeness to the extent possible when utilising pre-existing PRO data for quality improvement and patient-centred cancer care.
Patient and public involvement
None.
Ethics and dissemination
This study has been approved by the Dana-Farber/Harvard Cancer Center (DF/HCC) IRB (24-225). The research team will take all necessary steps to protect participants from the minimal risks potentially associated with the study. All patient identifiers will be removed prior to the analysis. Throughout all project activities, the study team members will adhere to all MGB policies, standards and procedures, as well as any data use agreements related to specific data sources. We will continue to protect confidentiality and prevent inappropriate access, use or disclosure of data.
In the qualitative studies, potential participants will be informed about the project, and the research staff will confirm their eligibility, provide additional study details and answer any questions they may have. If a potential participant is willing to enrol, the study staff will obtain their written informed consent. Focus group interviews with clinical professionals will be conducted via a secure video-conferencing platform. In-depth interviews with patients will be conducted in a private room or remotely using a secure video-conferencing platform or telephone, according to participant preference. All audio/video recordings, transcripts, surveys and demographic forms for qualitative studies will be stored on secure, encrypted servers on password-protected computers, accessible only by the study research staff. The qualitative data will be collected using password-protected digital recorders. During transcription, all identifying information that could be used to link the data with the participant will be removed. Audio files will be destroyed once transcribed, and no personal identifiers will be linked to the transcripts. Study participants will not be identified in any reports, presentations or publications resulting from this study. All quantitative and qualitative data will be stored on encrypted study computers, and all analyses will be conducted at Brigham and Women’s Hospital, where the study database will be hosted.
Study findings will be disseminated through peer-reviewed publications and presentations at national and international conferences.
This is a 2-year study funded by the National Cancer Institute of the National Institutes of Health. The study team is currently preparing for data extraction and curation from the MGB EDW and other sources. We plan to dedicate 6 months to aim 1, 9 months to aim 2 and 6 months to aim 3. The final 3 months of the project will focus on manuscript development. The anticipated date of completion is July 2026.
Discussion
Our project is innovative in at least four ways. First, we will dissect non-response using a multilevel framework which has been widely applied in dissemination and implementation science but not yet commonly applied to non-response research. Second, the wealth of data available to us through MGB radiation oncology clinics, encompassing clinic-, provider- and patient-level factors related to non-response, is novel in itself. Third, with no established guidelines for reporting and analysing routinely collected PRO data with non-response, our project pioneers the characterisation of PROMs completion by varied degrees of non-response across clinic-, provider- and patient-level factors. This approach can disentangle the intricate landscape of non-response in cancer clinics and facilitate the assessment of relevant causes. Such insights have the potential to significantly impact future non-response mitigation during PROMs collection as well as robustly address its statistical management in the analysis of pre-existing, routinely collected PRO data. Lastly, our project introduces methodological innovation by integrating implementation knowledge from PROMs collection experiences into missing-data methods. This integration aims to scrutinise the effectiveness of these methods in preserving the representativeness of routinely collected PRO data despite non-response, setting a precedent in methodological innovation.
There are several potential obstacles and limitations that warrant discussion. First, recruiting clinical professionals for qualitative interviews may pose a challenge due to their demanding schedules. To address this, we will schedule focus group interviews flexibly via secure teleconferencing platforms, including outside of normal business hours. Second, recruiting underrepresented cancer patients may also be challenging, with some minority groups potentially reluctant to participate. If this occurs, we will increase the incentives. Third, there might be a lack of direct data sources for some aggregate-level factors in our multilevel framework. In such cases, we will collect information through MGB administrative or human resources data or direct contact with clinical professionals. Fourth, our currently developed multilevel framework may not cover all potential factors. We will continue to add variables based on available databases and current evidence as the project progresses. Finally, as this is a cross-sectional study, causal relationships cannot be determined.
The completion of this project will elucidate the characteristics of non-response and its intricate associations with multilevel factors of successful, large-scale PROMs collection in diverse patient populations. Insights from this endeavour will guide the evolution and development of PROMs collection programmes to expand their reach to underrepresented cancer patients and improve data representativeness. This, in turn, supports the use of PRO data for quality improvement and high-quality, equitable patient-centred cancer care.
Ethics statements
Patient consent for publication
References
Footnotes
Contributors CZ is the guarantor. CZ and JBL are the principal investigators of this project and led the study design. CZ, JBL and MOE contributed to the conception and design of the study. CZ led the writing of this protocol manuscript. JBL, MOE, ALP and NEM contributed significantly to the editing of this manuscript. All authors contributed to the editing and final approval of the protocol.
Funding CZ, JBL and MOE were supported by the National Cancer Institute of the National Institutes of Health under Award Number R03CA292987. The funders had no role in the conceptualization, analysis or presentation of findings of this study.
Competing interests Outside of the submitted work, ALP is the co-developer of the Q-Portfolio measures and receives royalties when used for commercial purposes. ALP, CZ and MOE are supported in part by the National Cancer Institute for unrelated work. All other authors have nothing to disclose.
Patient and public involvement Patients and/or the public were not involved in the design, conduct, reporting or dissemination plans of this research.
Provenance and peer review Not commissioned; peer reviewed for ethical and funding approval prior to submission.