Evidence Needed to Support Measurement Equivalence between Electronic and Paper-Based PRO Measures Comments

ePRO Reviewer Group Comments on Task Force Manuscript

Thank you for the opportunity to review this draft manuscript. I think that it is very thorough and provides a great resource for people interested in the migration of PROs to ePROs.

One suggestion that I would like to put forward is to increase the number of tables or visual items. For example, it may be helpful to provide a table with the breakdown of the types of ePRO collection devices (e.g., page 4).  Also, on page 5, second paragraph, I imagine that the working group may have considered this but would it be worthwhile to mention that web-based systems frequently offer an advantage with respect to visibility of the survey, due to larger-sized screens?

Thanks and congratulations on a great effort!


The overall draft looks great. On page 16 (second-to-last paragraph), the manuscript states: "The study subjects should be representative of the target population or intended patient group in which the ePRO will be used, particularly in regard to age, gender, and disease severity." I am wondering whether the "race/ethnicity" of the target population should also be considered.

Also, on page 19 (last paragraph) - the reference indicated here is "Feldt LS, Woodruff KJ, Cicchetti DV. High agreement but low kappa: I. The problems of two paradoxes. 1990;43:543-549".

I see that the authors of the above article are Feinstein AR and Cicchetti DV. I am wondering whether the following is the article you wished to cite.

1: J Clin Epidemiol. 1990;43(6):543-9.

Comment in:

J Clin Epidemiol. 1992 Dec;45(12):1452.

High agreement but low kappa: I. The problems of two paradoxes.

Feinstein AR, Cicchetti DV.

Yale University School of Medicine, New Haven, CT 06510.

In a fourfold table showing binary agreement of two observers, the observed proportion of agreement, p0, can be paradoxically altered by the chance-corrected ratio that creates kappa as an index of concordance. In one paradox, a high value of p0 can be drastically lowered by a substantial imbalance in the table's marginal totals either vertically or horizontally. In the second paradox, kappa will be higher with an asymmetrical rather than symmetrical imbalance in marginal totals, and with imperfect rather than perfect symmetry in the imbalance. An adjustment that substitutes kappa max for kappa does not repair either problem, and seems to make the second one worse.
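To illustrate the first paradox described in the abstract above, here is a minimal sketch in Python. The function name and the two sets of cell counts are illustrative assumptions chosen for this note, not taken from the manuscript. Both tables have the same observed agreement (p0 = 0.85), but the table with heavily imbalanced marginal totals yields a much lower kappa.

```python
def agreement_and_kappa(a, b, c, d):
    """Return (p0, kappa) for a fourfold table of two observers.

    Cell layout (hypothetical naming):
        a = both observers say "yes"
        b = observer 1 "yes", observer 2 "no"
        c = observer 1 "no",  observer 2 "yes"
        d = both observers say "no"
    """
    n = a + b + c + d
    p0 = (a + d) / n  # observed proportion of agreement
    # Agreement expected by chance, from the marginal totals
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    kappa = (p0 - pe) / (1 - pe)
    return p0, kappa


# Illustrative counts only: roughly balanced marginal totals
p0_balanced, kappa_balanced = agreement_and_kappa(40, 9, 6, 45)

# Illustrative counts only: heavily imbalanced marginal totals
p0_imbalanced, kappa_imbalanced = agreement_and_kappa(80, 10, 5, 5)

print(f"balanced:   p0={p0_balanced:.2f}, kappa={kappa_balanced:.2f}")
print(f"imbalanced: p0={p0_imbalanced:.2f}, kappa={kappa_imbalanced:.2f}")
```

With these counts, observed agreement is identical in both tables, while kappa falls from about 0.70 to about 0.32, because the imbalanced marginals raise the agreement expected by chance.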


I think the text is very good, and I would be very interested if you discussed the translation and adaptations between countries a little further.

What I miss is a dedicated section stating the objective of the study, even though you state its purpose in the introduction. Since you have an introduction, methods and discussion/conclusion, you could have the objective as a separate section as well.

I am primarily interested in the UAT section.  UAT is a software development process and I've worked in healthcare software development for 30 years, half of which have been on the "e" side of ePRO, so I know UAT very well.  It appears there are misunderstandings in that section so I'd like to suggest some changes/improvements to help with the overall credibility and thoughtfulness of the paper, especially to the "e" people in ePRO.

I have reviewed the draft manuscript and have no comments on the content or format, except to say that I found it easy to read, practical and sensible.

I liked the specific advice (even when, I am sure, others may wish to quibble over exact cut points for numeric values). 

My training and experience are in statistics, so I appreciated the clear discussion of study designs and suitable analyses.  I concur with the study designs and the descriptions of when different levels of validation should be undertaken.


It was a pleasure to read the very well-written draft manuscript. This manuscript will be very useful in giving guidance on the level of evidence needed to support modifications that are made to PROs when they are migrated from paper to ePROs.


I have 2 minor comments on the ePRO document you sent earlier:

1) On page 10, where you discuss allowing respondents to "choose not to respond", consider mentioning that paper questionnaires typically allow the flexibility to go back and change earlier responses, whether an ePRO can or cannot accommodate this, and what the task force suggests.

2) On pages 16 and 17, where the study designs for equivalence testing are discussed, some guidance on how to calculate sample size for these studies would be helpful (for example, n per item or per domain). In my experience, calculating sample size for any PRO component of a study is challenging, as there is no proper guideline for this compared to clinical outcomes.


I have now had a chance to review the manuscript, and my comments are attached. I have tried to make them as self-explanatory as possible.

In general, I think the paper is well written, but there are several places which are not as clear as they should be. This includes Table 1. This is really the centerpiece for the paper and it seems like there are many holes still to be filled in.

In addition, I am not sure whether you would make the same recommendations regardless of the intended use of the data from the ePRO. For instance, if a researcher in clinical practice were undertaking a study in her practice to use the data for individual clinical decision making, would the same recommendations apply?

Please let me know if you have any questions on my comments. And thanks for the opportunity to provide my feedback.


Overall, a most well-structured document that is both comprehensive in its coverage of the subject and clear in its writing style…a valuable addition to the literature. The team has done a great job!!!

Some minor suggestions/comments

Page 3, Line 7: Consider inserting ‘at least’ before ‘comparable’

Page 10: Paragraph relating to satisfaction and ease of use: The paragraph generalises on the comparison of patient preference for ePRO over paper PRO. Are there any data relating to preference for different types of ePRO that could be reported here?

Page 11, Line 11: Consider replacing ‘expected’ with ‘potential’

Page 16, Line 3: Is there a reference to support this statement?

Page 16, Line 7: Consider replacing ‘substantially’ with ‘significantly’

Page 17, Last line: Delete ‘intraclass correlation coefficient’

Page 18, Line 20: Replace ‘as’ with ‘are’
