
What our design team learned running remote usability tests

Design has been integrated into the product teams at our company for a couple of years now. Still, it took us a while to reach a significant milestone in our practice: testing our features with real users and learning from that to build better experiences.

Since then, usability tests have become a crucial tool in our design process. In this article we go through the seemingly daunting questions that our team had about the entire UX testing process, and what we've been learning from our recent experiences.

Why is testing interfaces so important?

Testing a design solution with actual, real-life users can confirm or dismantle the product team's ideas about the value we are delivering, and also show clearly where that value lies. A well-built test can shift the entire product strategy, saving the team from developing something that people don't care about, or that doesn't help them reach the desired outcome. And this is the kind of discovery that every product team should make sooner rather than later.

Additionally, running a UX test can reveal unexpected aspects of the users' context and the product itself. Every time we ran a usability test, we walked out of it with a lot more insights than we had expected, sometimes even related to other parts of our products.

Another significant benefit of testing our UIs is that it's more cost-effective to reshape the design solution before implementing it than it is to fix the product when it's up and running. Development costs are the obvious reason, but there's also the cognitive cost of teaching our users a better way to reach an outcome because we didn't get it right the first time.

After testing a feature, the team gets on the same page about it. Sometimes it's hard to defend the value of a particular design decision to our business and engineering counterparts because the team also needs to consider technical viability and product strategy. But when we all see a test participant struggling with an interaction, it's easier to rally the entire team to solve their problem.

Is remote testing really better than face to face?

As a software studio based in South America that caters to clients distributed across North America, testing interfaces remotely was not much of a choice for the Vinta design team. At first, setting everything up and having to rely on various technological moving parts to run our tests seemed like a big challenge. But after our first experience, the benefits of remote testing started to become clear to us:

  • Testing remotely is more agile-friendly. In other words, the whole process tends to go faster, as the lead time to schedule participants is reduced dramatically. Our team can run a usability test from scratch (build the prototype, source and schedule participants, run the test, and analyze results) in just two 2-week sprints.
  • The lab setup is simpler. We don't need a one-way mirror room, or multiple cameras filming the participant's face and the screen they're using. Their device, paired with Lookback, our testing tool of choice, already does the job. Lookback also allows multiple people from the team to watch and comment on the test as it's happening.
  • Participants tend to act more naturally. A UX test with a prototype always has a component of role-playing, because the users' actions have no real-life consequences, and they might be dealing with static information and placeholder data that wouldn't exist in the real product. In our experience, having users in their regular environment, with their own device, can make the experience feel more natural and relaxed than if we invite them over to an in-person test in a "research lab" setting.
  • It's easier to source participants, not only because the geographic barrier is gone, but also because we ask for less of their time. It's far easier for a person to find time in their schedule for a 20-minute call (and spend a few minutes installing an app on their phone) than to physically go to a software studio or a research lab.
  • It allows for a more geographically diverse subset of participants, which can be great if we want to cross-examine cultural differences. For the particular context of web products with an international audience, a test with a diverse participant pool can help the team make sure that no parts of the interface get lost in translation.
  • It's easier to follow a protocol. Seasoned researchers might disagree with me on this. Still, I have always found it easier to distance myself emotionally from participants, follow a script, and act more professionally when I'm doing everything remotely. Having a screen barrier has its downsides (we get limited feedback on our participants' body language), but for our team, the pros outweigh the cons by a landslide.

There are some great articles that go deeper into the benefits and challenges of running remote UX tests. If your team has the choice between remote and in-person, and you want to get a broader perspective, I highly recommend further reading.

When is it worth running a UX test?

The answer most designers are probably expecting to read here is "always". But at Vinta, we are die-hard pragmatists, and the fact is that running a UX test requires full dedication from the designer, and some work and available time from other stakeholders. It's a tool that should be used strategically when we need to figure out key aspects of the product.

So what we do is analyze the feature we are working on, answering the following questions. If the answer is YES for at least two of them, we know that our design solution will benefit significantly from testing.

  • Is this feature crucial for the core business?
  • Is this feature a great technical challenge?
  • Are we proposing a dramatic change in flow and functionality?
  • Are we proposing something completely new, that our users might not be familiar with?
  • Are we proposing changes that might be a high risk for the business?

After that, we have one final question that must be answered with full honesty:

  • Does the product designer have at least 2 sprints to build and run the test without interruptions?

If we already know that testing is essential, but the timing is not perfect, the other designers in our team make an effort to shield whoever is in charge of running the test. We hold the fort for this 2-sprint period, knowing that all of us will learn a lot from that experience.
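
For teams that like to keep this kind of checklist in a script or a spreadsheet formula, the screening rule above can be sketched roughly as follows. This is only an illustration, not a tool we actually use: the class, field, and function names are all made up for this example, and the returned labels are just one possible way to phrase the three outcomes.

```python
from dataclasses import dataclass, fields


@dataclass
class FeatureScreening:
    """Answers to the five screening questions (hypothetical field names)."""
    crucial_for_core_business: bool
    great_technical_challenge: bool
    dramatic_change_in_flow: bool
    completely_new_to_users: bool
    high_risk_for_business: bool


def yes_count(screening: FeatureScreening) -> int:
    """Count how many of the five questions were answered YES."""
    return sum(bool(getattr(screening, f.name)) for f in fields(screening))


def recommend(screening: FeatureScreening, designer_has_two_sprints: bool) -> str:
    """Apply the rule: at least two YES answers, then the timing question."""
    if yes_count(screening) < 2:
        return "skip the test for now"
    if designer_has_two_sprints:
        return "run the test"
    # Testing matters but timing is tight: the rest of the team covers
    # for the designer during the two sprints.
    return "run the test, with the team covering for the designer"


if __name__ == "__main__":
    feature = FeatureScreening(
        crucial_for_core_business=True,
        great_technical_challenge=False,
        dramatic_change_in_flow=True,
        completely_new_to_users=False,
        high_risk_for_business=False,
    )
    print(recommend(feature, designer_has_two_sprints=False))
```

The point of writing it down this way is that the timing question never cancels a test that the first five questions say is worth running; it only changes how the team organizes itself around it.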

What will the test achieve for the product?

One big mistake our team made on our very first usability test was to try to validate too many aspects of the experience at once. The client required a significant change in user flow, which would have a considerable technical impact, and that high risk is what ultimately "sold" them the idea of running a test, to make sure we were on the right track. It was also our first chance to run a remote UX test, and there were lots of things we wanted to find out (both about the product and about how to run usability tests).

The problem was that without a clear focus on what the team needed to discover or validate, it became tough to narrow down the test scope. Our first test was bloated with too many tasks, a complex prototype that took too long to build, and so many different user journeys that it was nearly impossible to normalize our participants' results.

We saw these issues halfway into building the test but decided to go through with it because we needed to have this experience under our belt, and the product owner wouldn't reduce the feature's scope. And even then, we learned a lot about the product and got valuable insights from our users that we wouldn't have caught otherwise. In the end, a poorly built test is better than no test at all. But the most valuable lessons we got from this first experience were to start with a solid hypothesis, reduce the prototype scope to the strictly necessary, and maintain a laser focus throughout the experiment.

How to formulate a test hypothesis

The test hypothesis can vary a lot, depending on the feature you're working on. It will inform what kind of test you should do (moderated vs. unmoderated), what profile of participants you should search for (current users vs. prospects), and which tasks should be built in your prototype.

Hypotheses should relate to the primary purpose of the feature, while questions should answer design aspects that you have no other way to validate. Try to work on two opposing hypotheses, keeping your mind open to the fact that the change you're proposing might end up being bad for the product.

Here's an example: our team was working on a feature to change the first-purchase experience for a healthy meal subscription plan (a B2C product), with the goals of making it clearer which products were being purchased and reducing requests for product replacements after the purchase. It was a big change in a core experience, and represented a high risk for the business. The solution we were exploring involved moving from a "choose between plans 1, 2, and 3" flow into a "build-your-box" experience.

These were the hypotheses we came up with:

  • Hypothesis A: changing the signup flow from a choice between three plans into a box that you can fill with products brings users closer to the product, increasing engagement in the purchasing process.
  • Hypothesis B: changing the signup flow from a choice between three plans into a box that you can fill with products generates decision fatigue and causes users to drop off before completing their purchase.

And these were the questions:

  • Is the UI easy to understand, enabling users to go through the task without a lot of back-and-forth?
  • How much time do users spend in the process?
  • Are we clearly communicating the different actions/options available?
  • Which flow are users most likely to go through?
  • Does this part of the UI skew users' decision towards one option over the other?

Keep in mind that other insights will arise during the test sessions that might not be related to your original questions. Take note of everything (ideas that were not on your radar can prove to be very useful in the future), but remember the purpose of the experiment when you're consolidating the results.

What kind of test should we run?

There are two possible ways to conduct a remote UX test: moderated and unmoderated. As NN/g defines them:

Moderated sessions allow for back and forth between the participant and facilitator, because both are online simultaneously. Facilitators can ask questions for clarification or dive into issues through additional questions after tasks are completed.
Unmoderated usability sessions are completed alone by the participant. Although there is no real-time interaction with the participant, some tools for remote testing allow predefined follow-up questions to be built into the study, to be shown after each task, or at the end of the session.

This article goes into detail on how moderated and unmoderated tests work. I recommend that any designer who is in the process of choosing a test technique read their analysis. We've had the experience of running both kinds of tests at Vinta, and these are the pros and cons that we found most significant:

  • Moderated tests give us more control over the process and the ability to encourage participants to speak their minds while performing a task. The team feels closer to the users' pain points when we're chatting with them, so this type of test is more likely to effect real change in the product. On the flip side, moderated tests require a lot more time and effort from the designer, and scheduling all the participants in a narrow timeframe can prove to be a challenging logistical puzzle.
  • Unmoderated tests are highly scalable because once the designer has set them up, they're entirely self-serve on the participants' side. Data from these tests tends to be more objective and easier to go through, and the team can digest results more quickly. However, participants rarely think aloud during this type of test, because there's no one to interact with, only a set of instructions. And if technical difficulties happen, the person is likely to drop off rather than ask for help.

Knowing these "built-in features" of each type of test, we look at the hypothesis, how advanced our design solution is, the timeframe that we have to run the test, and the goals of the experiment. Our decision-making process works like this:

We run a moderated test if:

  • We are exploring a new concept and need to understand how users react to it.
  • We want to test a user flow still in its early stages (wireframe or lo-fi prototype).
  • We want to impact other stakeholders with opinions and insights from real users.

We run an unmoderated test if:

  • We want to measure efficiency in a repetitive task.
  • We want to know if users can understand UI components quickly and use the expected paths.
  • We want to sort content in the UI.
  • We want to know which part of the interface is more critical to key users, and some numbers might help us make an informed decision.

How do we start?

After our team knows which type of test we want to run, there's a lot of work to do: build the prototype, source and schedule participants, run the sessions (in the case of moderated tests), and analyze results together with the team.

Sometimes it's hard to know where and how to start such a big assignment, so our team decided to handle UX testing the way we handle any of our larger design challenges: breaking up tasks into checklists. We've built one for moderated and one for unmoderated tests. These checklists help us stay grounded and keep our focus on what we're trying to achieve with the usability test.

I hope that reading about our experience with remote UX testing can motivate other design teams to give it a try. Good luck and happy testing!

Huge thanks to Pedro Bacelar, Lais Varejão, and Laura Lemos for their contributions to this post.

Aline Silveira

Designer of delightful experiences, typography nerd, die-hard feminist and crazy cat lady.