Make the experience as real as possible
As a general rule, the more detailed and functional your prototype is, the more detailed the feedback you’ll get. But you don't always need detailed feedback.
If you’re looking for high-level feedback on a concept or idea, for example — something like, “This is great” or “This seems unnecessary” — then you can get away with a minimally functional prototype, like a paper prototype or clickable wireframes made in Sketch or Figma (referred to as a low- or mid-fidelity prototype).
On the other hand, if you need to test a complex feature or a bunch of connected screens that would be really difficult to simulate with simple sketches or wireframes, you'll have to increase the functionality and detail. In other words, you’d need to create a prototype with a higher level of fidelity.
What is design fidelity?
Design fidelity refers to the level of functionality and detail of a prototype, which can be categorized as either low, mid, or high-fidelity.
- Low-fidelity prototypes may include only sticky notes, hand-drawn sketches, or very simple black-and-white wireframes.
- Mid-fidelity prototypes consist of more detailed wireframes with basic functionality, such as the ability to click on links or buttons.
- High-fidelity prototypes are closest to the finished product in appearance and functionality. They may include coded prototypes, if you feel it’s the only way to simulate reality and get the answers you need.
Achieving the right level of fidelity is a balancing act, and depends heavily on how much detail you need in your feedback to keep moving forward with confidence. The trick, then, is faking reality just enough that the user buys it, while putting as little effort into the prototype as possible. As tempting as it may be to present your testers with something beautiful and polished, building something super detailed too early would force you to make a bunch of assumptions and waste a bunch of time, which is — ultimately — exactly what user testing serves to avoid.
Fortunately, we’ve found users are pretty great at playing pretend. Of course, if your prototype — no matter the level of fidelity — is full of bugs or dead ends, it will pull the user out of the experience and ruin the test.
An easy way to avoid this is to run through the test yourself from start to finish, then recruit a colleague to test the test ahead of time so any rough spots get smoothed out.
Set the stage with a realistic scenario
Giving users a realistic scenario provides the context they need to successfully perform a task. This is when you’ll want to consider introducing specific constraints that will help frame their thinking.
For example, let’s say you’re doing a test for Airbnb, and you’d like to improve the user experience for guests booking last-minute vacations.
In practice that looks like this:
“Imagine you’re trying to book a trip under a tight timeline. How might you go about getting a house booked in New York for next weekend?”
In the scenario above, the constraints include: booking type (house), location (New York), timeframe (next weekend).
Write your script
By this point you’ve already defined what you want to understand and created a realistic scenario. Use these details to guide your script.
For example, if you want to find out “if users are able to create an account successfully,” make sure there’s a question in your script for that. Remember: Each statement should have an accompanying question in the script. If we use the example above, we’d want to make sure there was a question like, “How do you think you would create an account?”
As you create the rest of your script, try to imagine what your user will be experiencing and craft your questions accordingly. A good exercise for this is to walk through your prototype step-by-step. The point here is to anticipate what your participants might find difficult so you can ensure you have questions prepared to tease out why they’re finding it difficult.
Thinking about this in advance will give you confidence going into the session, which puts your participants at ease. Not only that, you’ll ask better follow-up questions, keep things on track, and recover more easily if things go way off the rails.
You don’t need to be a journalist to come up with great follow-up questions; you just need to be curious. For example, when a participant says they like or don’t like something, ask them:
- What do (or don’t) you like about it?
- How might you improve it?
If a participant is surprised by something, ask them:
- Why were you surprised?
- What did you expect to happen?
At the start of a new flow or feature, ask participants:
- What do you think this is?
- What are your initial impressions, thoughts, and feelings?
If a participant breezes through something important, stop them and ask:
- Did you notice it? Why not?
- Do you care?
Once a user has completed a flow, ask them:
- What were your overall impressions?
- Did it match your impression of it before you started?
- Was it valuable?
- What did you like?
- What did you dislike?
Don’t forget the ‘System Usability Scale’
Developed by John Brooke in 1986, the System Usability Scale (SUS) is a reliable tool for measuring the usability of your product.
When the SUS is used, participants are asked to score the following 10 items on a scale of one to five, with ‘Strongly Disagree’ (1) on one end and ‘Strongly Agree’ (5) on the other.
- I think that I would like to use this system frequently.
- I found the system unnecessarily complex.
- I thought the system was easy to use.
- I think that I would need the support of a technical person to be able to use this system.
- I found the various functions in this system were well integrated.
- I thought there was too much inconsistency in this system.
- I would imagine that most people would learn to use this system very quickly.
- I found the system very cumbersome to use.
- I felt very confident using the system.
- I needed to learn a lot of things before I could get going with this system.
Including the SUS in every usability test has the following benefits:
- Provides your team with a baseline usability score. When you’ve completed all your user tests, convert each participant’s ten responses into a single 0–100 score using the standard SUS formula, then average those scores to get your baseline.
- Allows you to compare scores between tests. Did one user score something higher than another user? This may provide additional insights regarding the usability of your product with respect to different personas.
- Gives you leverage when convincing stakeholders where you invest time and resources. It also lets you easily demonstrate to stakeholders how your product has improved as you do more and more tests down the line.
In terms of when to ask these questions, we suggest at the end of every script.
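If you’re tallying results in a script rather than by hand, the standard SUS calculation is straightforward. A minimal sketch (the function names here are our own; the formula itself is the published SUS scoring procedure):

```python
def sus_score(responses):
    """Convert ten SUS responses (each 1-5) into a single 0-100 score.

    Odd-numbered items are positive statements: each contributes (response - 1).
    Even-numbered items are negative statements: each contributes (5 - response).
    The total is multiplied by 2.5 to scale it to the 0-100 range.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # 0-based index: even index = odd-numbered item
        for i, r in enumerate(responses)
    )
    return total * 2.5


def average_sus(participants):
    """Average several participants' SUS scores into one baseline number."""
    return sum(sus_score(p) for p in participants) / len(participants)
```

A participant who strongly agrees with every positive item and strongly disagrees with every negative one (responses 5, 1, 5, 1, …) scores 100; note that a raw sum or average of the 1–5 responses would not give you this number, which is why the conversion matters.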
Test the test
One of the best pieces of advice we can offer is to do a couple of dry runs of the test on anyone you can find, whether a coworker or a friend. Don’t worry about getting someone who matches your perfect user — dry runs are for making sure your script makes sense and nothing feels awkward or confusing.
If your real participants will be remote, a dry run also gives you the opportunity to make sure the test works without you sitting right next to them.
Finally, use the dry run to time your test. We find between 1 and 1.5 hours to be the sweet spot for getting value while keeping your participant engaged — any longer and everyone in the room starts to get tired. Long tests also create a mountain of recordings to parse after the fact, which can be time-consuming and repetitive.
Modify as you go
You may discover a problem very early on in your tests and wonder if it’s necessary to watch all five of your testers struggle with the same thing. It’s absolutely not, and here’s why: These aren't scientific tests, but rather opportunities to observe real users interacting with a product you're intimately familiar with.
Whereas a true experiment would require a scientific hypothesis which you either prove or disprove with clean data, user tests are a structured way to discover if something sucks — and sometimes you only need to see it once to know it's worth changing.
Plus, making tweaks on the fly allows you to test things faster and uncover deeper insights. In fact, some of the best insights we’ve gotten were the result of subtle changes we made over the course of several tests.
Here’s how that might look in practice:
Let’s say your first two testers really struggle with finding the "My Account” page. Rather than watching three more people struggle for the sake of data, you could instead fix the problem between tests — by changing the colour of the button, for example — and then ask the next user to perform the same task using the new flow. Should the problem be resolved, you’re free to address the next problem, and then the next.
We've often done 3-4 changes between tests, resolving issues from previous tests and testing the new iterations. This often gets us much further than painstakingly adhering to the original test.
Some things you might consider tweaking in between tests include:
- changes to the test itself
- asking a different question
- posing a scenario differently
- making changes to the UI to fix obvious issues from early tests that hung people up
Of course, there are situations in which you might not want to change the test. Like, for example, if you’re trying to persuade a stakeholder with data (e.g., “We need to allocate resources to fixing XYZ because testing shows five out of five people struggled with it.”).
Also, if you identify a problem, but the solution doesn’t seem obvious or it’s obvious but it would require big changes — don’t change the test. Instead, let your testers run the test until you have enough data and you’re confident you know what the problem is and why it exists. Only then will the obvious solution be revealed. (We know, it’s cheesy — but it’s also true.)
We've tested entire cohorts where we change nothing, because we don't have a deep enough understanding of the problem, or we need to demonstrate to a stakeholder just how deep the problem runs. We've also done tests where we change lots of stuff, and it proved the solution in one cohort. In short, use your judgement, and treat user testing as an adaptable tool versus a set of rules.
Tools of the trade
- Screen recording software. If you have a preferred screen recording tool, great. If not, try out QuickTime screen recording — it’s our favourite.
- Mobile app recording setup. If you’re testing a mobile app, we love using Mr.Tappy. Using a tool like this also allows users to pick up the device, making the test more realistic.
Pre-test checklist
- Ensure the thing you’re testing is in good working order
- Write your script
- Create a scenario for your participants
- Do at least one (but ideally two or more) dry runs, including a remote dry run if necessary
- Tweak your test based on feedback from the dry run
- Test your tech (screen recording software, audio, mobile recording setup, etc.)