
How Can We Define Successful Task Completion Within a Usability Test?

Suggestions for defining task completion and measuring completion rates in medical device usability testing


March 16, 2023

By Charlotte Wickham

As part of a usability engineering process, the IEC 62366 standard on medical device usability states that a summative (i.e., validation) usability test should evaluate participants’ ability to successfully complete tasks (amongst other things). However, the definition of task completion is not necessarily clear-cut. This is especially true when considering participants who have encountered difficulties during the task, who made mistakes but then corrected them (i.e., a close call) or who required moderator assistance in order to complete a task.

So, how can we define successful task completion and measure task completion rates? There are a few things to consider. We might record the number of participants who completed the overall task without any findings at all (e.g., use errors, assists, close calls, difficulties). However, it might be more suitable to define successful task completion as an instance in which a participant completes a task according to the intended use and without encountering a critical finding.  

Complete use of a device may not equal correct use of a device. In some cases, a user can complete a task if you consider the overall use scenario or overall task, but they may not complete the task in a “correct” manner (i.e., according to the intended use). If we consider an injection pen as an example, the overall task (typically called a use scenario) might be to administer an injection. A user might succeed in “completing” the overall task if they administer the required dose. However, they might not perform the steps according to the intended use (e.g., they might not disinfect the injection site, perform a medication flow check, or dispose of the needle in a sharps container). In this sense, while the user has technically completed the overall task, by omitting certain steps they have introduced risks of harm (e.g., infection, underdose, needlestick injury), and their process was not aligned with the intended use.

To measure successful task completion, it is necessary to break down broader use scenarios into individual tasks. Example tasks might include “disinfect injection site,” “press confirm,” and “insert needle.” It is then necessary to define an acceptance criterion for each task (which is, in a sense, a “completeness” criterion for that step).

In order to come up with a measure for successful task completion, we recommend looking at the number of participants who were able to complete the overall use scenario by successfully completing all of the individual tasks. In particular, we recommend focusing on the critical tasks.

The distinction between critical and non-critical tasks is important here.

Some non-critical tasks are recommended for the user’s comfort, and the omission of a step would not affect the user’s ability to safely complete the task (e.g., remove refrigerated medication from the refrigerator 30 minutes prior to the injection in order to reduce injection discomfort).

A critical task is one considered imperative to safe use of a device. The tasks that are considered critical versus non-critical will depend on the device’s use-related risk analysis (typically this includes items with a severity of harm of “serious” or higher). In general, if a task is imperative for task completion or for the mitigation of significant risk, it will be considered critical.
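The idea of judging a use scenario by its critical tasks can be illustrated with a minimal sketch. The function name, the task names, and the choice of which tasks count as critical are hypothetical and serve only as an example; in practice, the critical tasks come from the device's use-related risk analysis.

```python
from typing import Dict, Set


def scenario_succeeded(task_results: Dict[str, bool], critical_tasks: Set[str]) -> bool:
    """Return True when every critical task met its acceptance criterion.

    task_results maps a task name to whether the participant met that
    task's acceptance criterion. A critical task with no recorded result
    is treated as not met (an assumption made for this sketch).
    """
    return all(task_results.get(task, False) for task in critical_tasks)


# Hypothetical example: which tasks are critical is assumed here for illustration.
critical = {"disinfect injection site", "insert needle"}

participant = {
    "remove pen from refrigerator 30 minutes early": False,  # non-critical comfort step
    "disinfect injection site": True,
    "insert needle": True,
}

print(scenario_succeeded(participant, critical))
```

In this sketch, skipping the non-critical comfort step does not prevent the scenario from counting as successful, whereas missing any critical task would.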

It is important to consider finding classification. Let’s consider two types of findings in particular: an instance in which a moderator assists a participant, and an instance in which a user makes a mistake but then corrects it.

First, if the moderator is required to intervene and assist the participant with a task, you should assume that the participant would have been unable to complete the task independently. Because of this, all assists are considered critical findings, regardless of whether the assist occurred on a critical or non-critical task. For example, opening the packaging for a pen injector may not be considered a critical task because it does not carry significant risks. However, if the participant requires assistance to complete this step, it would be considered a critical finding. This is because the participant might have been unable to complete the rest of the tasks without that assistance.

Second, let’s consider close calls, i.e., instances in which a participant makes a mistake and then corrects it. A participant can encounter a close call on a particular step and still go on to complete the task according to the intended use and without introducing significant risks. As such, a close call might not be considered a critical finding. The same kind of approach can be applied to operational difficulties. A participant might experience difficulty with a certain step, resolve their difficulty, and still go on to complete the task. As such, the difficulty might not be considered a critical finding.
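One way to make this classification concrete is sketched below. The `Finding` type, its field names, and the finding labels are hypothetical. The sketch also simplifies: it treats every close call and every resolved difficulty as non-critical, whereas in a real study that judgment would be made case by case against the risk analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    kind: str               # "use_error", "assist", "close_call", or "difficulty"
    task_is_critical: bool  # whether the finding occurred on a critical task


def is_critical_finding(finding: Finding) -> bool:
    """Classify a finding as critical or non-critical (illustrative rules only)."""
    # Any moderator assist is critical, even on a non-critical task.
    if finding.kind == "assist":
        return True
    # Simplifying assumption: a corrected mistake or a resolved difficulty
    # is treated as non-critical.
    if finding.kind in ("close_call", "difficulty"):
        return False
    # Remaining findings (e.g., use errors) are critical on critical tasks.
    return finding.task_is_critical


# An assist while opening the packaging (a non-critical task) is still critical.
print(is_critical_finding(Finding(kind="assist", task_is_critical=False)))
```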

In order to record success rates for a particular task, you can record the number of participants who completed the task without critical findings. As discussed, a critical finding might be:

  • A finding related to a critical task (i.e., a task that is imperative for safety or task completion)
  • An instance of moderator assistance on a non-critical task (because full task completion could not be guaranteed)

If a participant only encountered non-critical findings (such as a close call, or a difficulty that did not impact overall use), you would count them as someone who successfully completed the task.
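The counting rule described above can be sketched as follows. This is a minimal illustration, not a prescribed method: the function name, the participant IDs, and the data are hypothetical, and each finding is assumed to have already been classified as critical (`True`) or non-critical (`False`).

```python
from typing import Dict, List


def completion_rate(findings_by_participant: Dict[str, List[bool]]) -> float:
    """Share of participants who completed the task with no critical findings.

    Each participant maps to a list of flags, one per finding, where True
    marks a critical finding. An empty list means no findings at all.
    """
    if not findings_by_participant:
        raise ValueError("at least one participant is required")
    successes = sum(
        1 for flags in findings_by_participant.values() if not any(flags)
    )
    return successes / len(findings_by_participant)


# Hypothetical results for one task.
session = {
    "P01": [],              # no findings
    "P02": [False],         # close call only (non-critical)
    "P03": [True],          # moderator assist (critical)
    "P04": [False, False],  # two non-critical difficulties
}

print(completion_rate(session))  # 3 of 4 participants succeed: 0.75
```

Participants P02 and P04 still count as successful because their findings were non-critical, while P03 does not because of the assist.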

Charlotte Wickham is a Human Factors Specialist at Emergo by UL's Human Factors Research & Design division.