Overlapping research efforts in a global pandemic: a rapid systematic review of COVID-19-related individual participant data meta-analyses

Background: Individual participant data meta-analyses (IPD-MAs), which involve harmonising and analysing participant-level data from related studies, provide several advantages over aggregate data meta-analyses, which pool study-level findings. IPD-MAs are especially important for building and evaluating diagnostic and prognostic models, making them an important tool for informing the research and public health responses to COVID-19.
Methods: We conducted a rapid systematic review of protocols and publications from planned, ongoing, or completed COVID-19-related IPD-MAs to identify areas of overlap and maximise data request and harmonisation efforts. We searched four databases using a combination of text and MeSH terms. Two independent reviewers determined eligibility at the title-abstract and full-text stages. Data were extracted by one reviewer into a pretested data extraction form and subsequently reviewed by a second reviewer. Data were analysed using a narrative synthesis approach. A formal risk of bias assessment was not conducted.
Results: We identified 31 COVID-19-related IPD-MAs, including five living IPD-MAs and ten IPD-MAs that limited their inference to published data (e.g., case reports). We found overlap in study designs, populations, exposures, and outcomes of interest. For example, 26 IPD-MAs included RCTs; 17 IPD-MAs were limited to hospitalised patients. Sixteen IPD-MAs focused on evaluating medical treatments, including six on antivirals, four on antibodies, and two on convalescent plasma.
Conclusions: Collaboration across related IPD-MAs can leverage limited resources and expertise by expediting the creation of cross-study participant-level datasets, which can, in turn, fast-track evidence synthesis for the improved diagnosis and treatment of COVID-19.
Trial registration: 10.17605/OSF.IO/93GF2.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12913-023-09726-8.


Background
The harmonisation and analysis of participant-level data and metadata for cross-study analyses, including individual participant data meta-analyses (IPD-MAs), can inform the COVID-19 response through improved evaluation of diagnostic, preventative, and treatment measures. IPD-MAs have several analytic benefits over standard aggregate data meta-analyses when considering analyses of longitudinal data and the development and validation of clinical risk prediction tools [1-3]. IPD-MAs allow for joint consideration of study and subject-level heterogeneity to separate clinically relevant heterogeneity from heterogeneity related to study design or exposure and outcome ascertainment [1-3]. Separating clinically relevant from spurious heterogeneity is central to understanding whether observed differences in the risk of long COVID and COVID-19-related mortality are due to actual differences in exposure or immune response or to study-level differences in selection, ascertainment, or residual confounding.
The implementation and management of IPD-MAs are resource-intensive [1,2,4]. Collecting the well-characterised metadata needed to appropriately describe included studies and cleaning and harmonising participant-level data from related studies require a significant investment of time and expertise from the primary studies and the IPD-MA management team [2,5]. Additional barriers to sharing participant-level health-related data [1], including fears of lost opportunities for publication and legal or ethical considerations, can prevent or slow down data sharing [6-8]. IPD-MAs are essential for informing research design, risk communication, and clinical practice for COVID-19. Given the significant resources needed to undertake an IPD-MA, identifying areas of overlap in exposures and outcomes of interest and inclusion criteria can foster cross-IPD-MA coordination to avoid duplication and maximise the utility of existing data.
Our research aim was to identify areas of overlap in research aims and study populations, and to identify included studies across planned, ongoing, or completed COVID-19-related IPD-MAs. We conducted a rapid systematic review to identify and describe synergies across COVID-19 IPD-MAs with a focus on study inclusion and exclusion criteria, study populations and designs, and exposure and outcomes of interest. Our working hypothesis was that there would be several areas of overlap across planned, ongoing, or completed COVID-19-related IPD-MAs. When identified early in the IPD-MA process, we expected that researchers could then exploit these cross-IPD-MA synergies to rapidly and efficiently conduct IPD-MA studies during the ongoing COVID-19 pandemic.
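To make the contrast with aggregate data meta-analysis concrete, the following is a minimal sketch of one common way to analyse IPD, the two-stage approach: each study's participant-level rows are first reduced to a study-level log odds ratio, and those estimates are then pooled with inverse-variance weights. It is an illustration only and not the method of any reviewed IPD-MA. The toy records, the study labels `S1` to `S3`, and the helper names `make_records`, `stage_one`, and `stage_two` are all hypothetical.

```python
import math
from collections import defaultdict


def make_records(study, a, b, c, d):
    """Expand a 2x2 table into participant-level rows (study, treated, event).

    a: treated with event, b: treated without event,
    c: control with event, d: control without event.
    """
    return ([(study, 1, 1)] * a + [(study, 1, 0)] * b +
            [(study, 0, 1)] * c + [(study, 0, 0)] * d)


# Hypothetical toy IPD; these are NOT data from the review.
ipd = (make_records("S1", 10, 40, 20, 30) +
       make_records("S2", 5, 45, 10, 40) +
       make_records("S3", 8, 22, 12, 18))


def stage_one(records):
    """Stage 1: estimate a log odds ratio and its variance within each study."""
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for study, treated, event in records:
        idx = (0 if treated else 2) + (0 if event else 1)
        counts[study][idx] += 1
    estimates = {}
    for study, (a, b, c, d) in counts.items():
        if 0 in (a, b, c, d):  # continuity correction for empty cells
            a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
        log_or = math.log((a * d) / (b * c))
        variance = 1 / a + 1 / b + 1 / c + 1 / d
        estimates[study] = (log_or, variance)
    return estimates


def stage_two(estimates):
    """Stage 2: fixed-effect inverse-variance pooling of the study estimates."""
    weights = {s: 1 / var for s, (_, var) in estimates.items()}
    total = sum(weights.values())
    pooled = sum(weights[s] * estimates[s][0] for s in estimates) / total
    return pooled, math.sqrt(1 / total)


estimates = stage_one(ipd)
pooled_log_or, pooled_se = stage_two(estimates)
```

Because stage one works from participant-level rows, the same records could instead be fitted with covariates or in a single multilevel model, which is where IPD offers more than pooled study-level summaries.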

Methods
We conducted a systematic search of four databases and protocol repositories: Ovid Medline, the PROSPERO International Prospective Register of Systematic Reviews, the Open Science Framework (OSF), and the Cochrane Database of Systematic Reviews, using a combination of MeSH (where applicable) and text terms (Additional file 1). We ran the searches on 2 June 2021, 29 October 2021, and 7 February 2022. The protocol for this systematic review was developed per the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)-Protocol statement guidelines [9,10]. Before implementing the searches, we uploaded the systematic review protocol and search strategies to OSF (10.17605/OSF.IO/93GF2). We had first tried to register the protocol with PROSPERO, which informed our team that a systematic review of IPD-MAs did not qualify as a systematic review. This systematic review is reported per the 2020 PRISMA statement (Additional file 2) [11].

Study selection and data extraction
Eligible protocols or published studies were IPD-MAs that planned to include or included participant-level COVID-19-related health data. We excluded IPD-MAs that only included social or psychological measures and systematic reviews limited to aggregate measures rather than participant-level data from included studies. Two independent reviewers determined eligibility at the title-abstract and full-text screening stages. One reviewer extracted data into a pre-piloted data extraction Google sheet. Data were subsequently reviewed by a second reviewer. Differences of opinion and discrepancies in data extraction were resolved through consensus.

Analysis
We conducted a narrative synthesis of the results and summarised findings in a series of Sankey diagrams created in RStudio version 1.4.1103. We did not include a formal risk of bias assessment as part of this rapid systematic review, as most IPD-MAs only had a protocol available for review at the time of data extraction.
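A Sankey diagram is drawn from a list of links, each giving a source category, a target category, and the number of records flowing between them. The sketch below shows one way such links could be derived from an extraction table. It is written in Python for illustration and is not the authors' R code; the example rows, the column order (focus, study design, IPD-MA type), and the function name `sankey_links` are assumptions.

```python
from collections import Counter

# Hypothetical extraction rows: (focus, study design, IPD-MA type).
# These are illustrative values, not rows from Table 1.
rows = [
    ("Treatment", "RCTs only", "Static"),
    ("Treatment", "RCTs only", "Living"),
    ("Prognosis", "RCTs and obs", "Static"),
    ("Treatment", "RCTs only", "Static"),
]


def sankey_links(rows):
    """Count flows between adjacent columns.

    Each returned (source, target, value) tuple corresponds to one band
    in a Sankey diagram; value is the number of IPD-MAs on that path.
    """
    links = Counter()
    for row in rows:
        for source, target in zip(row, row[1:]):
            links[(source, target)] += 1
    return [(s, t, n) for (s, t), n in sorted(links.items())]


links = sankey_links(rows)
```

The resulting tuples can be passed to any plotting library that accepts source, target, and value lists.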

Patient and public involvement
Patients and the public were not directly involved in this systematic review; we used publicly available data for the analysis.

Results
We reviewed 116 full texts and identified 31 COVID-19-focused health-related IPD-MAs (see Additional file 3 for the PRISMA flow diagram). The majority of IPD-MAs were identified through PROSPERO (n = 21), followed by Ovid Medline (n = 8) and OSF (n = 2) [12,13]. No IPD-MAs were identified from the Cochrane Database of Systematic Reviews. The 31 ongoing or completed COVID-19 IPD-MAs are described in Table 1. As shown in the Sankey diagrams in Fig. 1A-D, there were several areas of overlap in included study populations, designs, interventions, and outcomes of interest between ongoing or completed and static or living COVID-19-related IPD-MAs. Figure 1C-D limit inference to the 21 IPD-MAs that requested data from authors, which requires more effort than IPD-MAs of data included in publications.

Study designs
Ten IPD-MAs included randomised controlled trials (RCTs), non-randomised intervention studies, or longitudinal observational studies; an additional 10 IPD-MAs were limited to RCTs only. Three IPD-MAs included RCTs and longitudinal or cross-sectional observational studies [22,24,32]. Two IPD-MAs included case reports and case series [27,33]. One IPD-MA each was limited to case reports [15], medical records [12], and case series and longitudinal studies [35]. One IPD-MA included any study design [17]; two others included any study design other than case reports [18,21].
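The category counts in this paragraph can be cross-checked against the totals reported elsewhere in the article with a short tally. The dictionary keys below are shorthand labels, and the set of categories treated as admitting RCTs is our reading of the text rather than a classification stated by the authors.

```python
# Counts transcribed from the study design paragraph; keys are shorthand labels.
design_counts = {
    "RCTs, non-randomised intervention, or longitudinal obs": 10,
    "RCTs only": 10,
    "RCTs and longitudinal or cross-sectional obs": 3,
    "case reports and case series": 2,
    "case reports only": 1,
    "medical records only": 1,
    "case series and longitudinal studies": 1,
    "any study design": 1,
    "any design other than case reports": 2,
}

# Assumption: these are the categories whose inclusion criteria admit RCTs.
rct_eligible = {
    "RCTs, non-randomised intervention, or longitudinal obs",
    "RCTs only",
    "RCTs and longitudinal or cross-sectional obs",
    "any study design",
    "any design other than case reports",
}

total_ipd_mas = sum(design_counts.values())
n_including_rcts = sum(design_counts[label] for label in rct_eligible)

print(f"Total IPD-MAs: {total_ipd_mas}")
print(f"IPD-MAs whose criteria admit RCTs: {n_including_rcts}")
```

Under this grouping the categories sum to the 31 identified IPD-MAs, and the RCT-eligible subset matches the 26 IPD-MAs reported in the abstract as including RCTs.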

Populations
More than half of the 31 IPD-MAs were conducted with data from hospitalised or intensive care unit (ICU) patients (n = 17). Ten IPD-MAs included data from the general population, and two IPD-MAs were limited to children or adolescents [14,21]. One IPD-MA was conducted with pregnant women [37] and one with older adults and health care workers [42]. Most IPD-MAs were not limited by geography (n = 28). One IPD-MA was limited to studies in the US and Canada [38], another to the US, Europe, and China [29], and one to China [19].

Availability of data from IPD-MAs
Fifteen IPD-MAs were published when we submitted the manuscript for publication. Three published IPD-MAs made their data available through GitHub (n = 1) [24] or the journal supplement (n = 2) [17,21]. Two published IPD-MAs stated that interested researchers could request the dataset from the study team [29,35], and five said that data would not be made available [15,26,27,31,34]. Five others did not include a statement related to data availability [33,36,39-41]. Three of the living IPD-MAs were published [29,31,41], although only one indicated that data could be requested from the study team [29].

Discussion
IPD-MAs are an essential tool for the rapid evidence generation needed to inform clinical practice, making them a vital part of the research response to emerging pathogens [43]. We conducted a rapid systematic review to identify ongoing or completed COVID-19-related IPD-MAs. There were many areas of overlap in the 31 COVID-19-related IPD-MAs, including in study design and population, exposure, and outcomes of interest. In particular, the 14 IPD-MAs that evaluated the same medical exposures (antivirals, antibodies, ACEIs and ARBs, and convalescent plasma) represent a missed opportunity to exploit synergies. Most IPD-MA protocols were registered on PROSPERO, which could flag these areas of overlap when researchers submit their protocol. IPD-MAs require a significant investment of time and expertise, both from the team conducting the IPD-MA and the groups contributing data to the IPD-MA. Rapidly identifying and exploiting shared inclusion criteria can help facilitate evidence generation and avoid unnecessary duplication of effort.
We identified at least 10 IPD-MAs that limited their analysis to data included in published reports. While IPD-MAs that are limited to published IPD have been conducted previously, the volume of the research response to COVID-19, coupled with the push for reproducibility and transparency, has likely facilitated the rise in IPD-MAs of data that were included in the study publications. Four of the ten IPD-MAs of published data (40%) included case report or case series data [15,27,33,35]. Given that the utility of the IPD-MA is limited by the quality of the studies that contribute data [2], findings from these rapidly produced IPD-MAs should be considered preliminary and updated when more detailed and less selective participant-level datasets become available. This finding is in keeping with a methodological review of published data that compared the methodological and reporting quality of COVID-19 and non-pandemic research and found a reduction in quality in the former [44].
While we reviewed the protocols for all IPD-MAs, we could only identify the restriction to published IPD for those IPD-MAs that had published their analyses, which suggests a need to clarify inclusion criteria in IPD-MA protocols to specify the intent to limit inference to published IPD. Some of the unpublished studies identified in our review may be misclassified as having the classical approach to conducting an IPD-MA, which includes the challenges associated with requesting the data from the data producers.
Living IPD-MAs are regularly updated as more evidence becomes available, representing substantial investments. There was overlap in study design, exposure, and outcome measurements in several of the five living IPD-MAs and between the living IPD-MAs and static IPD-MAs, which represents an opportunity to share limited resources and expedite findings.
Only a few IPD-MAs of data received from authors had been published when this manuscript was submitted for publication (n = 5/21; 24%), so we could not quantify the overlap in datasets across IPD-MAs that collected datasets from research teams, which would be an important measure of cross-IPD-MA redundancy in efforts. Only three of the ten published IPD-MAs had made data available through a repository or the publication of supplementary materials [17,21,24], which suggests a continued need to encourage data sharing.
Working collaboratively to harmonise and share data across related IPD-MAs would maximise limited resources and shorten the timeline to deliver results that best inform clinical and public health practice. Testing the same hypotheses, especially with the same study designs or populations, represents a missed opportunity to evaluate novel hypotheses. Our findings support similar calls from a living review of COVID-19-related clinical trials and a scoping review of COVID-19-related data sharing platforms, which urged coordination across initiatives to reduce redundancies [45]. We propose the creation of a task force to identify concrete steps to enable cross-initiative collaboration and ensure that the harmonised participant-level data and study-related metadata conform to the findable, accessible, interoperable, reusable (FAIR) principles for data resources [46]. These steps could include a cross-platform algorithm that uses natural language processing to alert researchers to similar initiatives during protocol deposition. The pandemic's global scope and rapidly evolving nature underscore the need for more meta-collaborations to bring together data-sharing efforts and cross-national analyses. The coordination of ongoing or planned IPD-MAs is a good starting place.
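As a rough indication of what such an alert could look like, the sketch below compares the title of a newly deposited protocol with previously registered titles using bag-of-words cosine similarity and returns those above a threshold. This is a deliberately simple stand-in for a real natural language processing pipeline; the stopword list, the 0.5 threshold, the example titles, and the function names are all assumptions made for illustration, and no registry currently exposes this interface as far as the source indicates.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def tokens(text):
    """Lowercase the text, split on non-alphanumerics, drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOPWORDS]


def cosine(text_a, text_b):
    """Cosine similarity between the term-count vectors of two texts."""
    va, vb = Counter(tokens(text_a)), Counter(tokens(text_b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values())) *
            math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


def similar_protocols(new_title, registered_titles, threshold=0.5):
    """Return (score, title) pairs at or above the threshold, best first."""
    scored = [(cosine(new_title, title), title) for title in registered_titles]
    return sorted([pair for pair in scored if pair[0] >= threshold],
                  reverse=True)


# Hypothetical registry entries and new submission.
registered = [
    "Convalescent plasma for hospitalised COVID-19 patients: "
    "an individual participant data meta-analysis",
    "Risk factors for long COVID in children: a living systematic review",
]
new_protocol = ("Individual participant data meta-analysis of convalescent "
                "plasma in hospitalised patients with COVID-19")

hits = similar_protocols(new_protocol, registered)
```

In practice, matching on structured fields such as population, exposure, and outcome would likely be more reliable than matching on titles alone.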

Conclusions
IPD-MAs are important for informed research and public health response to COVID-19. To identify areas of overlap, we conducted a rapid systematic review of completed or ongoing COVID-19 IPD-MAs. We identified 31 COVID-19-related IPD-MAs, including five living IPD-MAs, and found several areas of overlap in study designs, populations, exposures, and outcomes of interest. This review shows several potential areas of collaboration across related IPD-MAs, which can leverage limited resources and expertise by expediting the creation of cross-study participant-level datasets. This, in turn, can fast-track evidence synthesis for the improved diagnosis and treatment of COVID-19.

Fig. 1 Sankey diagrams showing overlap between ongoing or completed and static or living COVID-19 IPD-MAs. A Shows overlap between the focus, included study designs, and type of IPD-MA for all the ongoing or completed IPD-MAs. B Shows overlap between the included study population, interventions/exposures, and outcomes of all the ongoing or completed IPD-MAs. C Shows overlap between the focus, included study designs, and type of IPD-MA for only those IPD-MAs that requested data from authors. D Shows overlap between the included study population, interventions/exposures, and outcomes of only those that requested data from authors. ACEIs = angiotensin-converting-enzyme inhibitors. ARBs = angiotensin II receptor blockers. BCG = Bacillus Calmette-Guérin. ECMO = extracorporeal membrane oxygenation. GBS = Guillain-Barré syndrome. MIS-C = multisystem inflammatory syndrome in children. Obs = observational. RCTs = randomised controlled trials. RT-PCR = reverse transcription polymerase chain reaction.

Table 1 (continued) Column headings: First author, last name; Title; Type, status, and availability of data for IPD-MA; Focus; Population; Study design
ACEIs Angiotensin-converting-enzyme inhibitors, AD Aggregate data, AE Adverse event, ARBs Angiotensin II receptor blockers, ARDS Acute respiratory distress syndrome, BCG Bacillus Calmette-Guérin, ECMO Extracorporeal membrane oxygenation, EMRs Electronic medical records, GBS Guillain-Barré syndrome, ICU Intensive care unit, IL-6 Interleukin 6, IPD Individual participant data, IPD-MA Individual participant data meta-analysis, IVIG Intravenous immunoglobulins, MIS-C Multisystem inflammatory syndrome in children, N/A Not applicable, QoL Quality of life, RCT Randomised controlled trial, RT-PCR Reverse transcription polymerase chain reaction, SAE Serious adverse event, SARS-CoV-2 Severe acute respiratory syndrome coronavirus 2, VAP Ventilator-associated pneumonia