Pragmatic applications of implementation science frameworks to regulatory science: an assessment of FDA Risk Evaluation and Mitigation Strategies (REMS) (2014–2018)

Background A Risk Evaluation and Mitigation Strategy (REMS) is a drug safety program that the U.S. Food and Drug Administration (FDA) requires manufacturers to implement for certain medications with serious safety concerns, to help ensure that the benefits of the medication outweigh its risks. FDA is encouraging “the research community to develop novel methods for assessing REMS,” conveying the unmet need for a standardized evaluation method of these regulatory-mandated healthcare programs. The objective of this research is to evaluate FDA REMS assessment plans using established implementation science frameworks and identify opportunities for strengthening REMS evaluation.
Methods A content analysis was conducted of publicly available assessment plans for all REMS programs (N = 23) approved 1/1/2014–12/31/2018 for new drug applications (NDAs) and biologics license applications (BLAs) requiring FDA-mandated Elements to Assure Safe Use (ETASU). Blinded reviewers critically appraised REMS assessment measures (n = 674) using three established implementation science frameworks: RE-AIM (Reach, Effectiveness, Adoption, Implementation, Maintenance); PRECEDE-PROCEED (Predisposing, Reinforcing, and Enabling Constructs in Educational/Environmental Diagnosis and Evaluation – Policy, Regulatory, and Organizational Constructs in Educational and Environmental Development); and CFIR (Consolidated Framework for Implementation Research). Framework constructs were mapped to REMS Assessment categories as defined by FDA Guidance for Industry to evaluate congruence.
Results REMS assessment measures demonstrated strong congruence (> 90% mapping rate) with the evaluative constructs of RE-AIM, PRECEDE-PROCEED, and CFIR. Application of the frameworks revealed that REMS assessment measures heavily emphasize implementation and operations, focus less on health outcomes, and do not evaluate program context and design assumptions.
Conclusions Implementation science frameworks have utility for evaluating FDA-mandated drug safety programs including the selection of primary measures to determine whether REMS goals are being met and of secondary measures to evaluate contextual factors affecting REMS effectiveness in varying organizational settings. Supplementary Information The online version contains supplementary material available at 10.1186/s12913-021-06808-3.


Background
Risk Evaluation and Mitigation Strategies (REMS) are drug safety programs required by the US Food and Drug Administration (FDA) for certain medications with serious safety concerns to help ensure that the benefits of a medication outweigh its risks [1]. REMS are required when additional strategies beyond product labeling are needed to reduce the occurrence and/or severity of a specific risk to reinforce the medication's safe use conditions and behaviors [2]. Between 2014 and 2018, approximately 4% of all drugs and biologics were approved on the condition of having a REMS [3,4]. Table 1 presents a general REMS program overview. Each REMS program addresses a specific drug-safety situation and risk mitigation goal, such as preventing the serious risk, decreasing its frequency or severity, and/or screening for it [5]. Pharmaceutical manufacturers (i.e., sponsors) are required to implement REMS programs based on FDA requirements, and healthcare settings and providers need to adopt these requirements. REMS activities, as defined by FDA's regulatory authority, can include complex multi-level interventions known as elements to assure safe use (ETASU). Some ETASU target healthcare professionals and healthcare settings, requiring them to be certified to administer and dispense the medication or restricting the dispensing of a medication to a certain setting. Other ETASU may also include monitoring of patients, enrollment of patients in a registry, or certain packaging and safe disposal technologies. Approximately 80% of currently active REMS include at least one ETASU [6]. In addition to ETASU, other REMS strategies include dissemination of information such as materials in patient-friendly language delivered to patients (Medication Guide, Patient Package Insert) and communication materials to healthcare providers (Communication Plan) [7,8].
Program assessment is required for all REMS approved for drugs or biologics under New Drug Applications (NDA) and Biologics License Applications (BLA) and encompasses program participation, outcome and impact measures. Sponsors are required to conduct assessments of each REMS program and provide them to the Agency at defined timetables to determine whether the REMS is meeting its risk mitigation goal(s) [1,9]. These reports should provide information and data that are based on the REMS assessment plan, which is a list of metrics on outreach and communications, knowledge, safe use behaviors, implementation and operations, and health outcomes that drug sponsors need to address. While the statute requires a timetable for submission of an assessment of the REMS, it does not specify how those assessments should be conducted [10].
Since the REMS authorities went into effect in 2008, there has been increased focus on the standardization of REMS assessments. In 2013, the Office of Inspector General (OIG) report urged the FDA to "identify and implement reliable methods to assess the effectiveness of REMS." [11] In that same year, FDA hosted a Standardizing and Evaluating REMS Public Meeting under its Prescription Drug User Fee Act (PDUFA) V commitments to obtain input on working towards "an evidence-based approach to assessing the effectiveness and burden of REMS" and "to identify best practices to incorporate into future REMS design, as well as appropriate ways to standardize REMS tools and integrate REMS into the health care delivery system." [12] In January 2019, the draft guidance for industry on REMS Assessment: Planning and Reporting (henceforth referred to as the Assessment Guidance) called upon "applicants and the research community to develop novel methods for assessing REMS." [8].
Implementation science has been widely applied to health-related research to support large-scale evaluations of evidence-based practices and organizational interventions in real-world settings [13]. FDA-mandated drug safety programs similarly seek to integrate processes into the larger healthcare system by targeting individual behaviors and organizational activities; implementation science can lend its theories, frameworks, and research to the more narrowly focused field of pharmaceutical regulation. Viewed through a regulatory risk management lens, implementation is analogous to risk mitigation. Implementation frameworks provide a visual display of a program to help guide one's thinking and development of the program. The current landscape presents an opportune time to continue the progress made and borrow from the field of implementation science to advance the science of REMS assessment and the field of pharmaceutical risk management [14,15]. Since 2014, Smith and Morrato have argued that minimal attention to frameworks in the design of risk minimization programs has hampered the identification of factors contributing to their success [14,15].
Previous work in recent years has called for increased transparency and standardization in the evaluation of pharmaceutical risk minimization programs. Smith et al., members of the International Society for Pharmacoepidemiology, used principles from public health intervention design and evaluation to develop a quality reporting checklist informed by a framework from program theory and process evaluation [16]. The resulting RIMES statement is usable by regulators and the pharmaceutical industry and emphasizes the utility of standardized reporting for designing higher-quality risk minimization evaluation studies and ultimately improving the quality and effectiveness of risk minimization programs. More globally, the European Medicines Agency has continued publishing guidelines on risk minimization measures (RMMs) specific to evaluating the effectiveness of outcomes identified. These discuss qualitative and quantitative data sources and methodologies [17].
The purpose of this research is to evaluate three different implementation science frameworks and their applicability for REMS assessments. This work expands on previous research characterizing REMS assessment plans [18], by comparing and contrasting additional frameworks and by expanding the analysis to include Shared System REMS involving multiple application holders including generics (abbreviated new drug applications or ANDAs). The aim of this structured analysis is to advance the science of REMS assessment and promote the uptake of a scientific approach for program design and evaluation of global pharmaceutical risk management plans. Furthermore, this research has implications for the FDA to consider the application of implementation science frameworks when finalizing its REMS assessment guidance.

Selection of frameworks
Eligible frameworks were identified through a repository of dissemination and implementation frameworks found through the NIH's Office of Disease Prevention at dissemination-implementation.org [19]. One member of the research team (LH) defined criteria for selecting the frameworks most applicable to REMS. Frameworks had to be in the fields of health, public health, or health services. Because REMS programs encompass risk mitigation strategies, frameworks were selected if they focused on either implementation alone or dissemination and implementation equally. Because most REMS programs are multi-level interventions, frameworks also had to characterize the individual, organization, and/or community levels. To operationalize these frameworks in a REMS context, it was appropriate for them to score at least a '3' out of 5 on construct flexibility, with a '1' being broad or flexible in definition and a '5' being detailed with step-by-step actions. Finally, as REMS are FDA-required programs, it was also appropriate to select frameworks that were United States-based.
Five frameworks met the inclusion criteria, two of which were derivations of another two. To select the most applicable frameworks for our research, we narrowed these five down to the three that were most well established and supported in the literature, each representing a different school of thought. The resulting frameworks are as follows: RE-AIM (Reach, Effectiveness, Adoption, Implementation, Maintenance) from implementation science, PRECEDE-PROCEED (Predisposing, Reinforcing, and Enabling Constructs in Educational/Environmental Diagnosis and Evaluation – Policy, Regulatory, and Organizational Constructs in Educational and Environmental Development) from health program planning and evaluation, and CFIR (Consolidated Framework for Implementation Research) from clinical quality improvement [20][21][22].
To identify current initiatives at the intersection of implementation science and risk management, a literature review was conducted in January 2019 to search for articles relating REMS to dissemination and implementation science frameworks using three databases: PubMed, Web of Science, and EMBASE. The database searches were updated in August 2020. We entered search strings consisting of common implementation science and REMS terminology. Examples of these search strings include "risk evaluation and mitigation strategies" AND "implementation framework." A more complete list of these terms can be found in Additional file 1. In January 2019, these searches produced no articles and only one abstract relating RE-AIM to a specific REMS program [23]. The updated August 2020 search produced one article [18].

Application of frameworks to REMS assessment plans
A content analysis of REMS assessment plans was conducted for REMS programs approved by the FDA between 1/1/2014 and 12/31/2018 [24]. Because the first REMS was approved in 2008, this timeframe was selected to align with more current REMS approvals. REMS assessment plans were eligible if they (1) were for a new drug application (NDA) or a biologics license application (BLA) and (2) included ETASU (Table 2). Shared System REMS, which reflect multiple products, including generics, of the same class or molecular moiety under two or more sponsors [25], were included in the analysis. However, REMS consisting only of Communication Plans and/or Medication Guides were excluded from our analysis due to our focus on complex, multi-level, multi-system interventions. REMS assessment plans can be publicly accessed for individual programs through the REMS@FDA website, which links to the Drugs@FDA website, where each drug approval letter containing a list of metrics for the REMS can be found. The unit of analysis was the REMS assessment item, as these items are the metrics for evaluating the performance of a REMS towards meeting its risk mitigation goals.
Starting with the RE-AIM framework, three reviewers (LH, GT, EM) created construct definitions applicable to REMS assessments by adapting from RE-AIM dimensions defined by the framework (Table 3). After adjudicating these applications for three randomly selected plans (IRR = 75%), the review team then refined the definitions accordingly. Two reviewers (LH and GT) categorized each assessment item (n = 674) for the remaining 20 plans. This adaptation, adjudication, and refinement process was repeated for PRECEDE-PROCEED (IRR = 79%) and CFIR (IRR = 84%) (Tables 4, 5 and 6).
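The inter-rater reliability (IRR) figures above are reported without naming the statistic. As a minimal sketch, assuming IRR here means simple percent agreement between two reviewers' construct assignments (an assumption; a chance-corrected statistic such as Cohen's kappa would give different values), the calculation could look like the following. The function name and the example codes are hypothetical and are not data from the study.

```python
def percent_agreement(reviewer_a, reviewer_b):
    """Share of items (in percent) that two reviewers coded identically.

    Assumption: IRR is treated as simple percent agreement, which the
    text does not confirm.
    """
    if len(reviewer_a) != len(reviewer_b):
        raise ValueError("Both reviewers must code the same items")
    if not reviewer_a:
        raise ValueError("At least one coded item is required")
    matches = sum(a == b for a, b in zip(reviewer_a, reviewer_b))
    return 100.0 * matches / len(reviewer_a)


# Hypothetical RE-AIM codes for four assessment items (illustration only).
codes_a = ["Reach", "Adoption", "Implementation", "Implementation"]
codes_b = ["Reach", "Implementation", "Implementation", "Implementation"]
print(percent_agreement(codes_a, codes_b))  # 3 of 4 items agree -> 75.0
```

Items on which the reviewers disagree would then go to adjudication, after which the construct definitions are refined as described above.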
Shortly following the initiation of this research, the draft Assessment Guidance was published in January 2019 [8]. This draft Assessment Guidance outlined five categories that were intended to capture REMS program outcomes and processes. To evaluate the utility of the three frameworks on REMS assessments, these RE-AIM dimensions, PRECEDE-PROCEED phases, and CFIR constructs were then mapped to the Assessment Guidance categories: Program Outreach and Communication, Program Implementation and Operations, Knowledge, Safe Use Behaviors, and Health Outcomes and/or Surrogates of Health Outcomes (Additional file 2). For simplicity, the dimensions of RE-AIM, phases of PRECEDE-PROCEED, and constructs of CFIR will be collectively referred to as "constructs" hereafter. Because REMS programs are focused on achieving their goals through information, education, and/or reinforcement of actions, it was appropriate for us to combine knowledge and safe use behaviors into one assessment category to map to the constructs.
Results were reported as aggregate summary statistics for the frequency distribution across all programs and the number of assessment measures per Assessment Guidance category. Sensitivity analysis was performed to examine qualitative differences by type of application (e.g., drug vs. biologic), type of ETASU, and trends over time, and a separate subgroup analysis was done for Shared Systems. For each framework, descriptive statistics were calculated to determine the proportion of measures representing each construct per REMS program. Each construct was analyzed for the number of REMS programs addressing the construct, the median number and range of measures representing that construct per REMS program, and the number of measures representing the construct across all programs. Additionally, each assessment item was assessed for inclusion in the Assessment Guidance categories using the mapping of each framework's constructs.
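The per-construct descriptive statistics described above can be sketched as follows. This is a minimal illustration, assuming each assessment item has already been coded to one construct and counted per program; the function name, program labels, and counts are hypothetical and are not study data.

```python
from statistics import median


def summarize_construct(counts_by_program):
    """Summarize one construct from {program: number of measures coded to it}."""
    counts = list(counts_by_program.values())
    programs_addressing = sum(1 for c in counts if c > 0)
    return {
        # Number and share of programs with at least one measure for the construct
        "programs_addressing": programs_addressing,
        "percent_of_programs": round(100 * programs_addressing / len(counts), 1),
        # Median and range of measures per program
        "median_per_program": median(counts),
        "range": (min(counts), max(counts)),
        # Measures representing the construct across all programs
        "total_measures": sum(counts),
    }


# Hypothetical counts for five programs (illustration only, not study data).
reach_counts = {"Program A": 2, "Program B": 0, "Program C": 4,
                "Program D": 1, "Program E": 2}
summary = summarize_construct(reach_counts)
print(summary)
```

Running the same function on a subset of programs (for example, only Shared System REMS) gives the subgroup figures in the same form.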

Results
A total of 23 REMS programs consisting of nine BLAs, nine NDAs, and five Shared Systems were selected for analysis based on the eligibility criteria (Additional file 3). Characteristics of these programs at the time of their original REMS approval can be found in Table 2. Programs requiring a REMS varied by indication, including but not limited to B-cell lymphoma, acute pain requiring an opioid analgesic, and multiple sclerosis. Likewise, the risks intended to be mitigated by the REMS varied widely from neurological toxicities to respiratory depression and death. The number of REMS programs by year ranged from three in 2016 and 2018 to seven in 2017. Assessment measures per REMS program ranged from 8 to 71, for a total of 674 assessment measures across the 23 programs.
Subgroup analysis of Shared System REMS reveals they account for the minimum and maximum values of the number of assessment measures per REMS program, with emtricitabine tenofovir disoproxil fumarate containing 8 measures and sodium oxybate containing 71 measures. While the fewest assessment measures were associated with programs containing only ETASU A, no other associations could be identified between ETASU types and the number of assessment measures. REMS approval for Shared Systems mostly occurred in the latter years, with three programs approved in 2017 and one each in 2015 and 2016. This is reasonable because Shared System REMS reflect sustaining programs requiring multiple sponsors of the reference listed drug and ANDA to work together on the design and development of the REMS, a process requiring a substantial amount of time.

Insights from the application of RE-AIM to REMS
Because REMS stakeholders include patients, healthcare providers, pharmacies, wholesalers/distributors, and healthcare settings, defining RE-AIM constructs for application to REMS required identifying the agent and recipient of the program. Recipients refer to patients or other populations who are targeted to benefit from the program outcomes produced, while agents are defined as people who deliver the program. Others have applied effectiveness to providers and the healthcare setting as well as implementation to patients [23,26]. For our analysis, we defined patients as the recipients of the REMS program and providers, pharmacies, and wholesalers as the agents. Consequently, the Reach and Effectiveness constructs applied to patients, Adoption and Implementation to providers/pharmacies/wholesalers, and Maintenance to any participant.
REMS assessment measures demonstrated the strongest congruence with the RE-AIM framework. All five RE-AIM constructs were represented with REMS assessment measures. Of 674 assessment measures across the 23 programs, only 4 measures (0.6%) could not be mapped to a single RE-AIM construct because either the intent of the assessment item was unclear or there were multiple intents of the assessment item, making it categorizable into multiple constructs. For example, we defined Reach to refer to the number of patients eligible to receive the drug who participate in the program and Adoption to refer to the number of agents (e.g., prescribers) involved in adopting and implementing the program (Table 3). Therefore, items assessing the number of prescriptions in compliance with REMS requirements were too ambiguous to classify to a single REMS domain because prescriptions could either reflect prescriber-compliant participation in the program through prescriber certification (Adoption) or the number of patients who ultimately received the medication (Reach).
Of 23 total REMS assessment plans, 19 (82.6%) contained measures assessing Reach, with the median number of Reach assessment measures per assessment plan being 2. Similarly, the median number of Reach assessment measures per Shared System assessment plan was also 2 (range 0-4). Notably, the Shared System REMS for emtricitabine tenofovir disoproxil fumarate contained the highest proportion of measures assessing Effectiveness, with this construct accounting for 37.5% of all its assessment items. Adoption was similarly assessed among the Shared System REMS, with the median number of assessment measures being 1 (range 1-5). The assessment of Implementation measures among Shared System REMS was comparable to that of other REMS programs, with the median number being 14 (range 4-52). As in other programs, Maintenance was rarely assessed: only 1 Shared System REMS contained assessment measures of Maintenance, and the median number of Maintenance assessment measures per plan was 0 (range 0-1), the same as for other programs.
Categorizing the REMS assessment measures to the Assessment Guidance was also a feasible task (Fig. 1). Measures of Reach and Adoption were categorizable to Outreach and Communications, depending on whether the patient or provider was targeted. Those of Effectiveness were mapped to Health Outcomes if the item assessed adverse events or health risks and to Safe Use Behaviors and Knowledge if the item assessed patients' understanding of the risks. Similarly, REMS assessment measures falling into the RE-AIM Implementation construct were categorizable to the Assessment Guidance's Implementation and Operations category if they were process measures and into Safe Use Behaviors and Knowledge if they assessed providers/pharmacists' understanding of the risks. Finally, although the Assessment Guidance does not include Maintenance as a category, some programs included a measure of Maintenance on their assessment plans.

Insights from the application of PRECEDE-PROCEED to REMS
As a health program planning and evaluation model, PRECEDE-PROCEED provides structures for both specifying objectives and baselines before the intervention as well as for monitoring and continuous quality improvement after the intervention. However, as REMS include implementation and evaluation aspects, we did not apply PRECEDE and only defined the PROCEED constructs as applicable to REMS. Instead, we interpreted the "design intervention" step of the Implementation construct to assume that design assumptions from the PRECEDE constructs are met. This aligns with the nature of the PRECEDE-PROCEED framework, wherein change begins with the outcome, and the process moves backward logically to achieve the desired result in a formative process. Process Evaluation mapped to REMS assessment measures attributable to "the system," while Impact Evaluation mapped to measures relating to "the individual." For constructs consisting of multiple components in the original definition, we divided these into subconstructs when applying the definitions to REMS. PRECEDE-PROCEED showed strong congruence with REMS assessment plans, with only seven assessment measures not represented by the framework, because stakeholder engagement is not assessed in the PROCEED constructs. This 1% of REMS assessment measures that was not mapped to PRECEDE-PROCEED was related to antecedent outreach factors, such as sources of distribution lists. Not only were all PROCEED constructs represented by REMS assessment measures, but so were all [...] (Tables 4 and 5). Of 23 total REMS assessment plans, 18 (78.3%) contained measures assessing Implementation, with the median number of Implementation assessment measures per assessment plan being 4 (range 0-6). For Process Evaluation, all 23 (100%) REMS assessment plans included this construct, with the median number of assessment measures per plan being 12 (range 1-26). All 23 (100%) REMS assessment plans measured Impact Evaluation, with the median number of assessment measures per plan being 10 (range 3-34). Finally, 12 (52%) assessment plans measured Outcome Evaluation, with the median number of assessment measures per plan being 1 (range 0-6).
For Shared System REMS, the median number of Implementation assessment measures per plan was 1 (range 0-5). The median number of Process Evaluation measures per assessment plan was 10 (range 1-26). Notably, the Shared System REMS for emtricitabine tenofovir disoproxil fumarate contained the highest proportion of measures assessing Impact Evaluation, with this construct accounting for 87.5% of all its assessment measures and the median number of assessment measures per plan being 9 (range 4-34). Finally, only 2 Shared System REMS contained assessment measures of Outcome Evaluation, with a median of 0 Outcome Evaluation measures per plan (range 0-6), compared with a median of 1 (range 0-6) across all programs.

CFIR constructs as applied to REMS (entries list, in order, the construct, the CFIR definition, the definition applied to REMS, and the Assessment Guidance category with its definition; some entries are incomplete):

Costs of the REMS program and costs associated with implementing the program including investment, supply, and opportunity costs.

A. Patient Needs & Resources
The extent to which patient needs, as well as barriers and facilitators to meet those needs, are accurately known and addressed by the organization.
The extent to which the REMS program is patient-focused and that patient needs, as well as barriers and facilitators to meet those needs, are accurately known and addressed by the sponsor.
Not included
Not applicable

B. Cosmopolitanism
The degree to which an organization is networked with other external organizations.
The extent and quality to which stakeholders are networked within the broader healthcare system to more quickly implement practices.

The nature and quality of webs of social networks and the nature and quality of formal and informal communications between sponsors and their vendors.
The strength of formal and informal communications, networking, and relationships between sponsors, vendors, and points of care and their effects on the adoption of the REMS program and understanding of its goals.
Not included
Not applicable

C. Culture
Norms, values, and basic assumptions of a given organization.
The norms, values, and basic assumptions about risk management and the REMS program at the point of care and the extent of how relatively stable, subconscious, and socially constructed these are.

D. Implementation Climate
The absorptive capacity for change, shared receptivity of involved individuals to an intervention, and the extent to which use of that intervention will be rewarded, supported, and expected within their organization.
The absorptive capacity for change, shared receptivity of involved individuals to the program, and the extent to which use of the REMS will be rewarded, supported, and expected through policies, procedures, and systems within the points of care.
Not included
Not applicable

E. Readiness for Implementation
Tangible and immediate indicators of organizational commitment to its decision to implement an intervention.
Tangible and immediate indicators of stakeholders' readiness for adoption of the REMS in terms of setting, culture, leadership, and evaluation.

A. Knowledge & Beliefs about the Intervention
Individuals' attitudes toward and value placed on the intervention as well as familiarity with facts, truths, and principles related to the intervention.
Participants' attitudes toward and value placed on the REMS as well as familiarity with facts, truth, and principles related to the program, including sufficient knowledge of the necessity for and skill of executing the REMS program.
Knowledge
Measures of the extent of stakeholders' knowledge about the REMS-related risk or knowledge of any safe use conditions that are needed in order to mitigate the risk.

B. Self-efficacy
Individual belief in their own capabilities to execute courses of action to achieve implementation goals.
Participants' belief in their own capabilities to execute courses of action to achieve REMS goals.
Not included
Not applicable

C. Individual Stage of Change
Characterization of the phase an individual is in, as he or she progresses toward skilled, enthusiastic, and sustained use of the intervention.
Characterization of the phase a participant is in and additional strategies necessary for the skilled and enthusiastic maintenance of behavior.
Safe Use Behaviors
Measures of the extent to which safe use conditions are being adopted or followed.
Of the three frameworks, assessment items mapped to PRECEDE-PROCEED were most evenly distributed across the constructs (Fig. 1). PROCEED constructs mapped most directly to the categories of the Assessment Guidance: Implementation to Outreach and Communications, Process Evaluation to Implementation and Operations, Impact Evaluation to Safe Use Behaviors and Knowledge, and Outcome Evaluation to Health Outcomes.
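The correspondence between PROCEED constructs and Assessment Guidance categories described in this section can be written as a simple lookup table. The sketch below is illustrative only: the dictionary restates the pairings given in the text, while the function name, the "Unmapped" label, and the example items are hypothetical and are not study data.

```python
from collections import Counter

# Pairings between PROCEED constructs and Assessment Guidance categories,
# as stated in the text.
PROCEED_TO_GUIDANCE = {
    "Implementation": "Outreach and Communications",
    "Process Evaluation": "Implementation and Operations",
    "Impact Evaluation": "Safe Use Behaviors and Knowledge",
    "Outcome Evaluation": "Health Outcomes",
}


def tally_by_category(coded_items):
    """Count assessment items per Assessment Guidance category.

    Items whose construct has no entry in the lookup table (for example,
    items that could not be coded) are counted under "Unmapped".
    """
    return Counter(PROCEED_TO_GUIDANCE.get(c, "Unmapped") for c in coded_items)


# Hypothetical construct codes for five assessment items (illustration only).
items = ["Process Evaluation", "Process Evaluation",
         "Impact Evaluation", "Outcome Evaluation", None]
tally = tally_by_category(items)
print(tally)
```

A lookup of this kind keeps the coding step (item to construct) separate from the mapping step (construct to category), so either can be revised without redoing the other.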

Insights from the application of CFIR to REMS
Developed out of 20 sources including the Diffusion of Innovations Theory, CFIR has been used for quality improvement [21]. Although CFIR is designed to provide a menu of constructs adaptable to a variety of applications, only the Characteristics of Individuals and Process constructs were most appropriate to REMS. Specifically, only the Knowledge and Beliefs about the Intervention and Individual Stage of Change under the Characteristics of [...]

CFIR Characteristics of Individuals and Process constructs as applied to REMS (entries list, in order, the construct, the CFIR definition, the definition applied to REMS, and the Assessment Guidance category with its definition; the first entry is incomplete):

A broad construct to include other personal traits such as participants' locus of control and other psychological concepts related to REMS implementation.
Not included
Not applicable

A. Planning
The degree to which a scheme or method of behavior and tasks for implementing an intervention are developed in advance, and the quality of those schemes or methods.
The degree to which a scheme or method of behavior and tasks for implementing the REMS are developed in advance, and the quality of the evidence supporting those steps to promote effective implementation.
Not included
Not applicable

B. Engaging
Attracting and involving appropriate individuals in the implementation and use of the intervention through a combined strategy of social marketing, education, role modeling, training, and other similar activities.
Carefully and thoughtfully attracting and involving appropriate representatives from each stakeholder group in the implementation of the REMS program through a combined strategy of social marketing, education, role modeling, training, and other similar activities to meet participants' needs.
Program Outreach and Communication
Measures of the extent to which the REMS materials reached the intended stakeholders.

C. Executing
Carrying out or accomplishing the implementation according to plan.
Carrying out or accomplishing the REMS program according to plan (descriptive).
Program Implementation and Operations
Measures of the extent to which the intended stakeholders are participating in the program, how effectively the REMS program is being implemented and any unintended consequences such as patient access or burden to the healthcare system.

D. Reflecting & Evaluating
Quantitative and qualitative feedback about the progress and quality of implementation accompanied with regular personal and team debriefing about progress and experience.
Quantitative and qualitative feedback (evaluative) about the progress and quality of REMS implementation accompanied with regular personal and team debriefing about progress and experience.
Program Implementation and Operations
Measures of the extent to which the intended stakeholders are participating in the program, how effectively the REMS program is being implemented and any unintended consequences such as patient access or burden to the healthcare system.

It is critical to note that CFIR is evidence-based, assuming the intervention is effective, and therefore does not measure outcomes. Given this, 5% of REMS assessment measures (34 of 674) did not map to the framework because they were outcome measures. Due to CFIR's multilevel nature, it was important to define the "Inner Setting" to refer to individual sites of specific REMS programs, and "Outer Setting" to refer to the broader healthcare system inclusive of patients. A criticism of CFIR is the fact that its "combined breadth and depth is not always feasible for implementation," [27]; however, its design for clinical quality improvement meant that it was easily customizable to a variety of REMS situations across diverse settings. In this context, "Characteristics of Individuals" was determined to represent any stakeholder (e.g. patient or provider) within that setting.
Of 23 total REMS assessment plans, 1 (4.4%) contained measures assessing Intervention Characteristics [...] (Fig. 1). The Process construct of Engaging mapped to Outreach and Communications, while both Executing and Reflecting and Evaluating mapped to Program Implementation and Operations. Regarding the Program Implementation and Operations constructs, Executing is akin to the descriptive analysis, or descriptions of the aspects of the REMS implementation process, while Reflecting and Evaluating refers to the evaluative and causal analysis, or the corrective and preventive actions in REMS and the determination of the reasoning behind certain implementation processes.

Discussion
To our knowledge, this is the first systematic comparative content analysis of several leading implementation science frameworks and their relative applicability and feasibility for nationally-regulated pharmaceutical risk minimization program assessment. Our evaluation found three established implementation science frameworks to be pragmatic in utility, making application to REMS programs relatively intuitive.
For example, RE-AIM considerations for evaluating health promotion programs and policies fit well with the need to evaluate REMS program recipients (like prescribers) who are agents for delivering the program to others (like patients). Application of the PROCEED constructs was also straightforward, especially with Implementation and Outcomes Evaluation, which directly matched REMS assessment categories. Overall, all three frameworks had very strong congruence with REMS assessment items, with at least 95% of items mapping to a single construct for each of the frameworks.
Comparison of RE-AIM, PRECEDE-PROCEED, and CFIR for evaluating REMS programs

Our application of implementation science to the Assessment Guidance found that most REMS assessment measures for programs approved between 2014 and 2018 fell under Implementation and Operations, which is reasonable given FDA's authority to require REMS and its experiences reviewing REMS implementation by the sponsors. Additionally, process data are understood to be more sensitive measures of the quality of a program than outcome data, as a poor outcome does not always occur as a result of an error in the provision of care [28]. Our analysis also found less emphasis on REMS assessment measures related to Health Outcomes. This may be because the FDA relies on good pharmacovigilance practices and pharmacoepidemiologic assessment, rather than the REMS assessment itself, to collect information on safety events [29]. Interestingly, a Shared System REMS was more likely to include measures related to Health Outcomes or Safe Use Behaviors and Knowledge. This may be explained by the fact that Shared System REMS reflect sustaining programs, where processes had been implemented for some time, allowing for more opportunity to evaluate sustained effectiveness on population health outcomes. Table 7 compares and contrasts the key strengths and limitations of RE-AIM, PRECEDE-PROCEED, and CFIR as applied to REMS programs. Based on our assessment, no single unifying framework is likely applicable for all FDA-mandated programs. Rather, each framework has relative scientific strengths, and selection should be tailored to the needs of the specific program.
The evaluation also identified opportunities to further strengthen REMS program evaluation, including measures to assess institutional implementation climate (from CFIR); measures to assess the representativeness of program participation and sustainability (from RE-AIM); and measures to assess health care provider and patient values and preferences (from PRECEDE-PROCEED). The following describes the relative strengths and limitations in greater detail:

RE-AIM
As a program evaluation framework, RE-AIM aims to "improve sustainable adoption and implementation" [26]. One strength was how simple RE-AIM was to use and how easily it adapted to a spectrum of REMS assessment measures. Another was that it considers the representativeness of patients and providers [18]. This inclusion of participants' characteristics allows for the assessment of heterogeneity of impact, which subsequently permits evaluations of patient burden and access. Health outcomes can be better measured as suggested by RE-AIM through predetermination of objectives and their impact on the changes in knowledge and safe use behaviors. REMS assessment could be strengthened by more deliberate inclusion of the RE-AIM Maintenance construct. For example, assessment measures could evaluate the evidence for integration of REMS processes and procedures into institutional policies, attrition of healthcare providers over time, or the extent to which health outcomes are sustained as a result of patient educational training [30], generating evidence for the decision of potentially releasing a REMS. If FDA started with RE-AIM to develop its REMS assessments, there could be a more even distribution of items, with more on sustainability of health care setting processes and individual behaviors and less on fidelity to every single process measure.

PRECEDE-PROCEED
Although the framework was intuitive to apply, not every construct mapped perfectly to a REMS measure. Some measures mapped to multiple constructs, while others did not map to any construct. The framework does not specifically analyze outreach to providers, although it does focus on community and/or patient needs in general. A strength of the framework is that it thoroughly considers assessment of situational factors that may affect the outcome of an intervention. PRECEDE-PROCEED suggests assessment plans could make note of questions posed to the REMS call center that could shed light on the differences in the tolerance for burden between individuals and subgroups. However, it is important to note that the framework does not assess resources and therefore may not be a good model for evaluating program burden. One way to strengthen the application of PRECEDE-PROCEED for REMS is to create subconstructs within PROCEED constructs, as we did, and assign each by participant; for example, subconstruct 7B of the Process Evaluation construct would apply only to healthcare provider measures, while subconstruct 7C, also of the Process Evaluation construct, would apply to patients.

CFIR
Our application of CFIR to REMS assessment found the framework to "open[] the 'black box' of … implementation" and be fruitful in evaluating implementation progress in a clinical setting, as the authors intended [31]. We were able to select constructs most relevant for REMS assessments, namely those falling under Process and Characteristics of Individuals. Some of these measures involved Reflecting and Evaluating on program implementation to improve processes. CFIR can also be used as a guide to formative evaluations for FDA to improve REMS assessments by considering constructs under Intervention Characteristics, the Outer Setting, and the Inner Setting. Through its 39 different subconstructs, CFIR can also inform the design and conduct of REMS. For example, on the construct of Intervention Characteristics, Adaptability can be measured in the earlier REMS assessment plans to evaluate the degree to which REMS requirements must be tailored and refined to meet local needs and sponsor capacities. In the Outer Setting, Patient Needs and Resources can be assessed to the extent that the REMS program is patient-focused and addresses patient barriers. Measures of participants' knowledge about the REMS-related risk or of any safe use conditions that are needed to mitigate the risk can be aided by assessment of the participant's values and attitudes towards that knowledge or safe use behavior.
Participants' Readiness for Implementation, including physical resources, leadership engagement, and culture, can also be assessed under the Inner Setting. Validating design assumptions would then allow for more thoughtful modifications and ultimately improve program performance. If FDA developed REMS assessment measures starting from CFIR, the measures would consider the design aspects of the REMS, how the REMS would fit into the broader healthcare setting, and participants' receptiveness to the REMS.

Implications for regulated risk management plans
The use of frameworks offers several advantages to advance the science of pharmaceutical risk minimization program evaluation. First, grounding assessment plans in frameworks encourages consistency, standardization, and completeness in evaluating REMS to facilitate cross-program comparisons and foster generalizable knowledge. Second, frameworks are valuable for building the evidence base for synthesizing implementation strategies to improve outcomes. They can be used to drive data towards more meaningful information and increase the likelihood that REMS design, implementation, and outcomes are effective. Currently, setting, context, and implementation strategy selection are under-reported in published evaluations, making it difficult to compare effectiveness [32]. Conceptual models have been proposed for developing efficient strategies for the measurement of the effectiveness of European Medicines Agency RMMs [33]. Like REMS, RMMs can also benefit from the application of implementation science to the delivery context, attributes of the proposed intervention, and characteristics of the intended adopters [15].
Although the present evaluation focused on using implementation science frameworks for REMS assessment, these frameworks can also be used to inform REMS design. For example, the goal of the Addyi REMS program is to mitigate the increased risk of syncope and hypotension associated with drinking alcohol while taking the medication. Because patients' current and past drinking behavior must be assessed before Addyi is prescribed, a framework focused on changing health behavior might be most suitable [34]. PRECEDE-PROCEED considers the spectrum of behavior change from the predisposing, enabling, and reinforcing factors to the impact of the program on these factors and could therefore inform a REMS program like Addyi that is designed to change behavior. REMS programs with health education as their main intent, such as the Opioid Analgesic REMS, could benefit from using RE-AIM, as this framework poses questions not only about the individuals reached and the immediate and long-term effectiveness of the knowledge transfer but also about how the setting and operations could facilitate or impede the change in knowledge [35]. Finally, as CFIR focuses on implementing evidence-based interventions with demonstrated effectiveness, it is a suitable framework for a Shared System REMS such as that of bosentan, where the goal is to sustain implementation effectiveness as new drugs or generics enter the Shared System REMS [36].
Lastly, under FDA's PDUFA V commitments to modernize post-marketing drug safety evaluation, it has committed to efforts on implementing a structured benefit-risk assessment process [37]. The benefit-risk assessment framework has identified "Risk and Risk Management" as one explicit dimension, with an area of interest concerning how information on evidence and uncertainties can be communicated to the public. Implementation science frameworks can be used to ensure that risk management uncertainties are systematically evaluated and categorized based on factors known to affect real-world implementation of health programs. In turn, this level of rigor in a policy context facilitates the selection of interventions with clinical benefit for healthcare organizational settings to ultimately improve patient health.

Conclusions
As FDA considers feedback from stakeholders and public comments in its finalization of the Assessment Guidance, this research can serve as one source of input. This research demonstrates the feasibility of applying implementation science frameworks to REMS assessment plans. Frameworks such as RE-AIM, PRECEDE-PROCEED, and CFIR provide a logical, structured approach for determining what should be measured, when it should be measured, and the process and impact indicators for facilitating these measurements. Application of implementation science to REMS assessment measures reveals a need to consider the design and sustainability of REMS programs in the assessment plans. Future REMS assessment plans can consider an element adapted from each of the three frameworks, including patient values and preferences, representativeness, and the implementation climate. Using this information, sponsors can evaluate matters of stakeholder interest, such as patient burden and access, through assessment of heterogeneity. Burdens on patients and the healthcare system can be further reduced by determining core explanatory measures for every step of the evaluation continuum to prevent unnecessary interventions when one core primary measure would be sufficient for determining whether the REMS goal could be met.