Table 3 Strategies applied in research articles to counter issues of RHIS data

From: Using routine health information data for research in low- and middle-income countries: a systematic review

Type of strategy

Description of strategy

Missing data

 Exclusion

Exclude facility data if missing data reached a certain threshold (e.g. more than two-thirds of months in a year; more than a sixth of baseline data; facilities with any missing data)

Restrict analysis to a period with a low level of missing data

Sensitivity analysis to compare analysis of restricted period and full period

 Imputation

Assign missing observations the mean value for the year

Assign missing observations the average of the preceding and subsequent data points

Imputation using conditional autoregressive model

Missing values were coded as positive (in binary form) to avoid exaggerating the fade-out effect

Sensitivity analysis of imputation strategies: 1) single imputation using means, trimmed means, and median, 2) Poisson generalized linear modeling, 3) iterative singular value decomposition method

 Interpolation

Interpolation using space-time kriging

Adjust results by dividing each indicator by the percentage of reports submitted

Adjust the data by calibrating to the total population using proportion reported in a household survey to have occurred in health facilities

 Verification

Account in the modeling method

Manual verification of the missing data with register at the health facility

Missing data were assumed to be missing at random and were accounted for in the mixed-effects models using standard maximum likelihood estimation
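Three of the simpler missing-data strategies listed above (year-mean imputation, averaging the preceding and subsequent values, and dividing by the proportion of reports submitted) can be sketched as follows. This is a minimal illustration, not the method of any reviewed study: the function names are hypothetical, missing months are assumed to be represented as `None`, and the neighbour-average rule is assumed to apply only when both adjacent months are observed.

```python
from statistics import mean

def impute_year_mean(values):
    """Fill missing months (None) with the mean of the observed months."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing observed, so nothing to impute from
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def impute_neighbour_average(values):
    """Fill a missing month with the average of the months before and after it.

    Assumption: a gap is filled only when both neighbours are observed.
    """
    out = list(values)
    for i in range(1, len(values) - 1):
        prev, nxt = values[i - 1], values[i + 1]
        if values[i] is None and prev is not None and nxt is not None:
            out[i] = (prev + nxt) / 2
    return out

def adjust_for_reporting(indicator_total, reports_received, reports_expected):
    """Scale an indicator up by dividing by the proportion of reports submitted."""
    if reports_received <= 0 or reports_expected <= 0:
        raise ValueError("report counts must be positive")
    return indicator_total / (reports_received / reports_expected)

monthly = [10, None, 14, 12]
print(impute_year_mean(monthly))          # [10, 12, 14, 12]
print(impute_neighbour_average(monthly))  # [10, 12.0, 14, 12]
print(adjust_for_reporting(90, 9, 12))    # 120.0
```

The reporting-rate adjustment implicitly assumes that facilities that did not report resemble those that did, which may not hold in practice.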

Identifying extreme values

 Specific threshold

Establishing a lower and upper limit based on a proportion of the annual average or on a feasible value

Univariate regression at the individual facility level to identify deviations from the mean time trend (e.g. values exceeding 8 standard deviations)

 Visual

Visual inspection of outliers

 Analytic assessment

Jackknifing analysis to assess influence

Studentized residual with an absolute value greater than 2, and influence on the estimated coefficients as indicated by a high Cook’s distance
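The two threshold-based approaches above can be illustrated with a short sketch. The function names, the default limits (0.25 and 4 times the annual average), and the use of an ordinary least-squares line as the "mean time trend" are all assumptions made for illustration; the reviewed studies may have defined their limits and trends differently.

```python
from statistics import mean, pstdev

def flag_by_annual_average(values, lower=0.25, upper=4.0):
    """Flag values outside [lower, upper] multiples of the annual average.

    The default multipliers are illustrative, not taken from the review.
    """
    avg = mean(values)
    return [v < lower * avg or v > upper * avg for v in values]

def flag_by_trend_deviation(values, k=8.0):
    """Flag values whose residual from a linear time trend exceeds k SDs.

    The trend is fitted by ordinary least squares on the time index.
    """
    n = len(values)
    if n < 3:
        return [False] * n  # too few points to fit a meaningful trend
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(values)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, values)]
    sd = pstdev(residuals)
    if sd == 0:
        return [False] * n  # perfect fit, no deviations
    return [abs(r) > k * sd for r in residuals]

counts = [100, 102, 98, 101, 99, 1000, 100, 103, 97, 100, 102, 99]
print(flag_by_annual_average(counts).index(True))           # 5
print(flag_by_trend_deviation(counts, k=2.0).index(True))   # 5
```

Because the standard deviation here is computed from the same residuals being tested, no residual can exceed the square root of the series length in SD units. A threshold as high as 8 standard deviations therefore only becomes reachable with long series (more than 64 points in this sketch), or when the spread is estimated from other data.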

Handling of extreme values

 Exclusion

Extreme values were excluded from analyses

 Replacing extreme value with average

Extreme values were assigned the average value for the year, except where the average value was low

 Replacing extreme value with missing

Outliers set to missing

 Verification with data source

Any drastic changes in monthly data reported electronically were manually verified against the register at the health facility. Discrepancies were replaced with the data in the register

 Discount observation in estimation

Outliers were allocated a dummy coding to discount the observation in the calculation of coefficients
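Once extreme values have been flagged, two of the handling options above (setting them to missing, or replacing them with the yearly average) reduce to a few lines. This is a sketch under stated assumptions: the function names are hypothetical, the flags are assumed to come from a separate detection step, and the yearly average is assumed to be computed from the non-flagged values only, which the source does not specify.

```python
from statistics import mean

def set_outliers_to_missing(values, flags):
    """Replace flagged values with None so later steps treat them as missing."""
    return [None if flagged else v for v, flagged in zip(values, flags)]

def replace_outliers_with_average(values, flags):
    """Replace flagged values with the average of the non-flagged values.

    Assumption: the average excludes the flagged values themselves.
    """
    kept = [v for v, flagged in zip(values, flags) if not flagged]
    if not kept:
        return list(values)  # everything flagged, so no basis for an average
    avg = mean(kept)
    return [avg if flagged else v for v, flagged in zip(values, flags)]

values = [10, 12, 500, 11]
flags = [False, False, True, False]
print(set_outliers_to_missing(values, flags))        # [10, 12, None, 11]
print(replace_outliers_with_average(values, flags))  # [10, 12, 11, 11]
```

Setting outliers to missing defers the decision to whichever missing-data strategy is applied afterwards, whereas replacing them with an average fixes the value immediately.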

Assess reliability

 Data validation process

Randomly selected 10% of the total sample to check the accuracy and reliability of the data against reports and registers

Verify data with another source (e.g. payroll)

Established a routine data validation process run by health information and records officers (e.g. monthly data review meetings)
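The random 10% validation check described above could be set up along these lines. The function names, the fixed seed, and the exact-match definition of agreement are illustrative assumptions rather than details reported in the review.

```python
import random

def draw_validation_sample(record_ids, fraction=0.10, seed=0):
    """Draw a reproducible simple random sample of records to verify."""
    ids = list(record_ids)
    k = max(1, round(len(ids) * fraction))
    return sorted(random.Random(seed).sample(ids, k))

def agreement_rate(reported, register):
    """Share of records whose reported value exactly matches the register.

    Both arguments map record id -> value; only ids present in both are compared.
    """
    shared = reported.keys() & register.keys()
    if not shared:
        raise ValueError("no records in common to compare")
    matches = sum(reported[i] == register[i] for i in shared)
    return matches / len(shared)

sample = draw_validation_sample(range(1, 51))
print(len(sample))  # 5

reported = {1: 10, 2: 20, 3: 30, 4: 40}
register = {1: 10, 2: 21, 3: 30, 4: 40}
print(agreement_rate(reported, register))  # 0.75
```

Fixing the seed makes the sample reproducible for audit purposes; an exact-match rule is the strictest choice, and a tolerance band may be more appropriate for large counts.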