
5 Methods for Evaluating Your Instructions for Use: Benefits and Drawbacks

An Emergo by UL human factors specialist and user interface designer provide five methods to evaluate instructions for use.


July 7, 2025

By Amrutha Kumaran and Kimberly Nieves

Well-designed instructions for use (IFU) are critical to mitigating potential risks while confirming that the intended users can effectively use a medical device out of the box. Usability testing remains the standard for reliably evaluating the effectiveness of an IFU. However, a variety of other evaluation methods may assist manufacturers in refining their product labeling and IFU. In this article, we discuss five approaches and considerations for selecting them based on their relative advantages: expert reviews, desktop readability tests, comprehension studies, knowledge tasks and mandatory IFU use scenarios.

Expert review

Before undertaking any usability testing, a manufacturer might engage a group of experts (e.g., consisting of designer(s), researcher(s) and clinical expert(s)) to comment on the information organization, text formatting, instructional text and figures within an IFU. In other words, they will conduct an expert review. The experts leverage domain experience and relevant standards to detail the potential strengths, weaknesses and recommendations for improvement in a comprehensive report. Expert reviews are effective at proactively and efficiently identifying aspects of the IFU design that could confuse users, enabling manufacturers to revise the IFU before investing significant time and resources in usability testing. Although expert reviews can generate a wide range of findings and uncover usability issues, the method cannot reliably predict which challenges representative users will encounter most frequently, or find most difficult, when using the IFU.

Desktop readability test

A major consideration when evaluating the language of the labeling and the IFU is the intended users’ level of education and medical literacy. The IFU’s text should be written at a level and in a manner that is appropriate for the differing education and literacy levels of intended users. For example, manufacturers should not assume that lay/patient users know medical terminology. The U.S. Food and Drug Administration (FDA) recommends that the language and readability of an IFU consider users with low literacy skills, suggesting the document be written at or below the average national adult reading level. For the United States, this is currently around 7th to 8th grade. Desktop readability tests, such as Fry, FOG, Flesch-Kincaid and SMOG, assess the readability and/or grade level of written material, each based on its own formula. For example, the SMOG (Simplified Measure of Gobbledygook) test targets words that contain three or more syllables. The results reveal key opportunities to edit and simplify instructional language, such as minimizing the number of words per sentence, multisyllabic words (e.g., removing technical language, providing explanations of medical jargon) and the total number of sentences. Although conciseness is preferable in the written content of an IFU, readability tests should be used to identify trends in writing and opportunities to simplify wording, rather than being the sole marker for readability. A true test of readability must come from the user reading the material and correctly comprehending and using the information.
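To illustrate how such formulas work mechanically, the sketch below computes a SMOG grade level in Python. It uses the published SMOG formula, but the syllable counter is a naive vowel-group heuristic of our own (an assumption for illustration only; dedicated readability tools rely on pronunciation dictionaries, and the formula itself is intended for samples of 30 or more sentences):

```python
import math
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count runs of consecutive vowels.
    # Real readability tools use pronunciation dictionaries instead.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def smog_grade(text: str) -> float:
    """SMOG grade = 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291.

    A "polysyllable" is any word of three or more syllables; the formula
    assumes a sample of at least 30 sentences.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291
```

Running this over a draft IFU paragraph highlights the same levers the formula rewards: shorter sentences and fewer multisyllabic terms lower the computed grade level, which is the behavior manufacturers exploit when simplifying instructional text.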

Comprehension study

A more holistic way to assess readability is via a comprehension study. This is a component of a usability study in which representative participants provide their initial impressions of content within the IFU before answering a series of questions designed to test their understanding and perception of the instructions. The study moderator asks participants non-leading questions to determine the root causes of their feedback and incorrect responses. For example, a manufacturer may recruit representative participants to provide feedback on an IFU for a pen injector. After reading and providing their initial thoughts on each instructional step (e.g., pointing out a confusing term, vague instruction, etc.), participants respond to thought exercises regarding the step (e.g., “Jane experiences dizziness and nausea after correctly administering her injection dose. What should Jane do?”). Open-ended, closed-ended (yes or no) and multiple-choice questions may be used to assess whether particular aspects of the instructions are difficult to understand or potentially misleading for users. Comprehension studies are useful for evaluating whether the content and language of the instructions are appropriate for the cognitive, educational or cultural characteristics of the intended user populations. Manufacturers might incorporate a comprehension study during formative usability testing to gain more focused feedback, specifically on the writing.

Knowledge tasks

Administering knowledge tasks during pre-validation or validation testing of the device is a way to test participants’ ability to understand and recall knowledge of the instructions related to critical tasks identified in the use-related risk analysis. The knowledge task is not intended to test the memory of the participants, but rather the clarity of instructions in the IFU that cannot be evaluated through hands-on simulated use scenarios. The study moderator asks participants non-leading questions to understand their interpretation of the instructions and eventually determine the root causes of their incorrect responses (e.g., past experiences with a different injection device, the participant’s memory, the IFU). If the participant responds without consulting the IFU, the moderator will ask follow-up questions to assess their ability to correctly locate and interpret the instructions relating to each critical step (e.g., ideal storage temperature, intact packaging seal, colorless liquid in the drug window). While this technique is similar to the comprehension study, knowledge tasks refer to a method specifically intended to evaluate those instructions (e.g., warnings) designed to mitigate the potential for critical harm during product use.

Mandatory IFU use scenario

Finally, comparing participants’ performance during optional and mandatory IFU use scenarios during pre-validation usability studies is helpful for directly assessing the IFU’s ability to mitigate use errors and for understanding how users will utilize the IFU during actual product use. For example, a manufacturer may ask test participants to complete two simulated use scenarios:

  • In the first use scenario, the moderator instructs participants to administer a prescribed dose using the pen injector.
  • If the participant did not reference the IFU in the first scenario, the moderator instructs them to administer the prescribed dose a second time while following the IFU step by step.

The moderator then collects participants’ feedback on the causes of any use errors observed across both scenarios, conducts any knowledge tasks and finally returns to the IFU for a deeper discussion of any mistakes participants made during the IFU-mandatory use scenario.

In conclusion, each evaluation method has its strengths. Expert reviews and desktop readability tests are valuable for proactively improving the information organization, text formatting and readability of instructional text and figures within an IFU before usability testing. Administering comprehension studies and knowledge tasks to participants (during formative and validation stage tests, respectively) is effective for observing how users navigate and interpret the contents of the IFU. Repeating use scenarios with mandatory IFU use is particularly useful for a realistic understanding of how users will work with the product when their attention is divided between the instructions and the tasks at hand.

Applying multiple evaluation methods for product labeling and instructional materials across the phases of product development provides manufacturers with many opportunities to improve an IFU’s design. Making these improvements early and iteratively helps encourage safer use, reduces users’ cognitive burden while performing tasks and reveals additional opportunities for risk mitigations in the labeling and/or device design.

Contact our team to learn more about evaluating IFUs. Or, sign up for a complimentary account with OPUS, our team’s software platform that provides human factors engineering (HFE) training, a design recommendation tool, and human factors (HF) templates.

Amrutha Kumaran is a Human Factors Specialist and Kimberly Nieves is a Managing User Interface Designer at Emergo by UL.
