In this study, the feasibility and reliability of the TiC-P were evaluated. Additionally, the construct validity of two items of the TiC-P was assessed. The small amount of missing data indicated that the questionnaire was generally well understood. The response rate (72%) and the average completion time (below 10 minutes) were acceptable. The test-retest analyses showed good agreement on the items of medical resource use that were frequently reported by the respondents. Absolute agreement between self-reported data and registration data on contacts with psychotherapists and long-term absence from work was around 70-75%, so the construct validity of these two items was satisfactory.
Thus, the TiC-P, including the SF-HLQ, seems a feasible and reliable instrument for measuring healthcare utilization and productivity loss. Additionally, the findings regarding the construct validity of the items ‘contacts with psychotherapist’ and ‘long term absence from work’ are promising.
Qualitative feedback on the questions was evaluated on the draft version of the TiC-P. For the current study, feasibility was operationalized as the response rate, the time needed to fill out the questionnaire, and data completeness. Generally, patients in the Monitoring study filled out the questionnaire at the healthcare setting. This may have positively influenced the response rate and data completeness in comparison to filling out the TiC-P at home. Proportions of missing values on the items were small, with the exception of the items measuring the use of medication. The findings on incompleteness of self-reported data on medication are in line with findings in another study among elderly patients. Cost calculation of medication requires relatively comprehensive information from the respondents. It may be feasible to reduce the number of questions on medication by using the defined daily dose for the calculation of the costs. If costs of medication are expected to contribute substantially to total costs, we recommend considering alternative sources for collecting these data, e.g. patient records.
Because filling out questionnaires online is relatively easy, the completion time observed in this study may underestimate the time needed for the paper version of the questionnaire. The relatively young population participating in the Monitoring study may also have led to an underestimate of the time needed to fill out the TiC-P in general. Additionally, in the online version respondents were alerted when they answered ‘yes’ on an item without reporting the corresponding quantity. Although respondents were able to ignore this automatic signal and continue the questionnaire, it may have decreased the number of missing values.
Consistency between test and retest measurement was relatively high. Generally, reliable test-retest analyses require similar circumstances for the successive measurements and an appropriate time interval between the measurement moments. For this study we chose to send the retest questionnaire after t=1 (the second measurement), since we assumed that changes in medical consumption and productivity losses are limited at the start of therapy. A total of 180 respondents were invited to fill out a retest questionnaire. The response rate was relatively low. Additionally, 10% of the retest questionnaires were filled out after a relatively long period. Due to the relatively small retest sample, the generalizability and the interpretation of the results warrant some caution. Despite this, the ICCs for the healthcare providers that were contacted most frequently (i.e. by > 12% of the respondents) seem satisfactory and have relatively narrow confidence intervals. A number of healthcare providers who were contacted less frequently (i.e. by < 12% of the respondents) had lower agreement scores. Consequently, it may be questioned whether the frequencies of these contacts are relatively constant over time. More research is necessary to further assess the reliability of these items.
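To make the test-retest statistic concrete, the following is a minimal sketch of how an intraclass correlation coefficient could be computed from paired test and retest counts. The ICC form used in this study is not specified in this section; the sketch assumes a two-way random-effects, absolute-agreement, single-measure ICC, often written ICC(2,1). The function name `icc_2_1` and the example data are hypothetical and are not taken from the study.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `scores` is a list of rows, one row per respondent, each row holding
    that respondent's values at the successive measurements
    (here: [test, retest]). The choice of ICC(2,1) is an assumption.
    """
    n = len(scores)          # number of respondents
    k = len(scores[0])       # number of measurements per respondent
    grand = sum(sum(row) for row in scores) / (n * k)

    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-respondent mean square
    msc = ss_cols / (k - 1)               # between-measurement mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical example: number of contacts reported at test and retest.
identical = [[0, 0], [2, 2], [5, 5], [1, 1]]
shifted = [[0, 1], [2, 3], [5, 6], [1, 2]]   # retest is always one higher

print(icc_2_1(identical))
print(icc_2_1(shifted))
```

Because this form measures absolute agreement, a systematic shift between test and retest lowers the coefficient even when respondents keep the same rank order. That property matters for count items such as the number of contacts.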
The reliability of the item on short-term absence from work was satisfactory. Test-retest agreement on the number of days at work while impeded by health problems was moderate. It can be assumed that it is more complicated to remember the exact number of days worked with an impediment. However, alternative methods for measuring reduced efficiency are currently not available. More research on this topic is warranted.
Agreement between reported and registered data on the number of contacts with psychotherapists was satisfactory, indicating an acceptable construct validity of this item. This finding is in line with other studies indicating that self-reports provide accurate data for more important and for less frequent events [9, 22]. It was not possible to study resource utilization other than contacts with psychotherapists who participated in the clinical study. Consequently, further research is necessary for assessing the construct validity of the other items of healthcare utilization. Agreement between the number of days of absence from work based on data derived from the occupational health service and the self-report of the patient was satisfactory. Our results are in line with previous studies of patient-reported absence from work [23–25]. A limitation of our study is that we were only able to compare reported data with registered data on long-term absence from work. However, Severens et al. found that 95% of patient-reported data matched the registered data on absence from work perfectly when recall periods of 2 and 4 weeks were applied.
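As an illustration of the agreement figures discussed above, the following is a minimal sketch of how absolute agreement between self-reported and registered data could be computed. It assumes that absolute agreement means the proportion of respondents whose reported value exactly equals the registered value, which this section does not define explicitly. The function name `absolute_agreement` and the example values are hypothetical.

```python
def absolute_agreement(self_reported, registered):
    """Proportion of respondents whose self-report exactly matches the registry.

    Both arguments are equally long sequences, one entry per respondent.
    Defining agreement as an exact match is an assumption.
    """
    if len(self_reported) != len(registered):
        raise ValueError("both sequences must have one entry per respondent")
    if not self_reported:
        raise ValueError("at least one respondent is required")
    matches = sum(1 for s, r in zip(self_reported, registered) if s == r)
    return matches / len(self_reported)


# Hypothetical example: number of psychotherapist contacts per respondent.
reported = [4, 2, 0, 6, 3, 1, 5, 2]
registry = [4, 2, 1, 6, 3, 1, 4, 2]

print(absolute_agreement(reported, registry))  # 6 of 8 match -> 0.75
```

An exact-match criterion is strict for count data. A tolerance band (for example, a difference of at most one contact) would be a more lenient alternative, but it would not be comparable to the figures reported here.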
Another limitation is that our study was performed in patients with mental disorders treated in ambulatory settings, i.e. among patients with less severe mental disorders. We expect that the findings of our study can be generalized to other groups of patients. However, future research on this is desired. Currently, the default version of the TiC-P uses a recall period of three months for measuring medical consumption and one month for measuring productivity losses. These intervals are in line with commonly applied measurement intervals in clinical studies. In the Monitoring study, recall periods of four and two weeks, respectively, were applied. This may limit the generalizability of the results of this study to the current version of the TiC-P. Further research should indicate the impact of a longer recall period on the different items of the TiC-P.
Finally, the current version of the TiC-P has been translated by a professional language institution into a patient-friendly version using simpler language. We assume that this will enhance the feasibility and validity of the TiC-P.