Measuring Usability-Jun 28/11
The inability of users to find products, services and information is one of the biggest problems and opportunities for website designers.
Knowing users’ goals and what top tasks they attempt on your website is an essential first step in any (re)design.
Testing and improving these task experiences is the next step. On most websites, good navigation and a clear site structure lead to higher task-completion rates.
It’s so easy to get navigation wrong that there is literally a hall of shame that makes fun of poor websites.
Here are three simple metrics that can be collected from card-sorting or tree-testing studies at any stage of design.
Percent of users that found the content location: Ask users where in the navigation they would find a specific piece of information, such as a type of product. This process is typically called tree-testing, and software like Treejack helps make the testing easier (Note: OptimalSort, maker of TreeJack, is a sponsor of MeasuringUsability.com).
You can use low-fidelity wireframes, a working prototype or the existing website. Count the number of users who attempted the task and the number who successfully landed on the correct page. The number successful divided by the number attempting is the success rate. Along with this percentage, report a confidence interval computed from the number successful and the total attempting.
Percent of users that placed a card in a group: With closed-card sorting (where the categories are defined in advance), count the number of times a card appears in a category and divide this by the number of users who were asked to sort. The confidence interval around this percentage tells you the likely range for the percent of all users who would place the card in that group, and who would therefore look for the content under that heading in the higher-level navigation. For example, if 49 out of 59 users placed “publications” under the “About Us” label, the percentage is 83% (95% CI between 71% and 91%).
You can also make sense of 0 counts, which can be helpful for settling internal debates. For example, perhaps a Vice President really thinks the Customer Support information should go under the Product tab. If, after testing 50 users, 0 people placed Customer Support in the Product group, the 95% confidence interval is 0% to 9%. This means that if another several hundred users were tested, it is very unlikely that more than 9% of them would think to look under Product for this information.
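The article does not say which formula produces these intervals. As a minimal sketch, assuming the adjusted-Wald (Agresti-Coull) interval, the two examples above (49 of 59 and 0 of 50) can be reproduced to the nearest percent. The function name `adjusted_wald_ci` is illustrative, not something from the article.

```python
import math


def adjusted_wald_ci(successes, n, z=1.96):
    """Approximate confidence interval for a proportion.

    Assumption: adjusted-Wald (Agresti-Coull) method, which adds z^2/2
    successes and z^2 trials before applying the usual Wald formula.
    z=1.96 corresponds to a 95% confidence level.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to [0, 1] so a 0 count reports a lower bound of 0%.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# 49 of 59 users placed "publications" under "About Us".
low, high = adjusted_wald_ci(49, 59)
print(f"{49 / 59:.0%} (95% CI {low:.0%} to {high:.0%})")

# 0 of 50 users placed Customer Support in the Product group.
zero_low, zero_high = adjusted_wald_ci(0, 50)
print(f"0% (95% CI {zero_low:.0%} to {zero_high:.0%})")
```

Other interval methods (for example an exact binomial interval) give slightly different bounds, especially for 0 counts, so the choice of method should be reported alongside the numbers.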
One of the biggest mistakes organizations make when designing web navigation is projecting some internal view of the world onto the website. The wrong words are used, the categories mirror departments and the hierarchy resembles an organizational chart. Adding these simple metrics and confidence intervals to your next card-sort or tree test will lead to better decisions at any sample size.
For a tree-testing example, if 19 out of 32 users successfully found the product description, the success rate is 59% and the 95% confidence interval runs from 42% to 76%. If you were able to test thousands of users, you would expect the success rate to fall within this range, and closer to the middle than the edges. The interval tells us that it’s very unlikely that more than 76% of users would find this page.
This method can be used with all sample sizes. Smaller sample sizes will generate wider intervals and larger sample sizes will generate narrower intervals. The 20/20 rule of precision applies here: at a sample size of approximately 20, you’ll have a margin of error of roughly 20%. Because the margin of error shrinks with the square root of the sample size, cutting it in half requires quadrupling the sample size to 80, which would generate a margin of error of roughly 10%.
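The 20/20 rule can be checked with a few lines of arithmetic. This sketch assumes the simple worst-case margin of error for a proportion (observed rate of 50%, 95% confidence); the function name `worst_case_margin` is made up for illustration, and the rule's "20%" and "10%" are rounded values.

```python
import math


def worst_case_margin(n, z=1.96):
    """Margin of error for a proportion at p = 0.5, where it is largest.

    Assumption: plain Wald margin z * sqrt(p * (1 - p) / n) with p = 0.5.
    """
    return z * math.sqrt(0.25 / n)


# Each step quadruples the sample size, which halves the margin.
for n in (20, 80, 320):
    print(f"n = {n:>3}: margin of error about {worst_case_margin(n):.1%}")
```

The margins come out near 22% and 11% for 20 and 80 users, close to the rounded figures in the rule, and each quadrupling of the sample cuts the margin exactly in half.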
Percent of users that suggested a category label: In open-card sorting studies, users are asked both to sort cards into piles and to come up with labels for these piles. You can also have a separate set of users come up with names for an existing set of cards or page names.
Divide the number of users who suggested the same label by the total number of users who attempted the exercise. This represents the proportion of users who find this name the best match for the category. For many categories you’ll likely have small percentages (e.g. 8%, 20%). Low percentages could mean that the categories are a bit nebulous in the minds of users or that there are really a few different ways of naming them. Once you’ve done this a few times, you can get some idea of what counts as a low or high category percentage and use that as a benchmark.
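Tallying suggested labels is mostly a counting exercise. The sketch below uses a hypothetical list of labels, one per user, for a single category; the data, the function name `label_proportions`, and the choice to treat labels that differ only in case or spacing as the same are all assumptions made for illustration.

```python
from collections import Counter


def label_proportions(labels):
    """Share of users who suggested each label for one category.

    Assumption: labels differing only in case or surrounding spaces
    count as the same suggestion.
    """
    normalized = [label.strip().lower() for label in labels]
    total = len(normalized)
    counts = Counter(normalized)
    return {label: count / total for label, count in counts.most_common()}


# Hypothetical responses from six users (not data from the article).
suggested = ["Company", "About Us", "company", "Who We Are", "About us", "Company "]
proportions = label_proportions(suggested)
for label, share in proportions.items():
    print(f"{label}: {share:.0%}")
```

Deciding how aggressively to merge near-duplicate labels (for example "About" and "About Us") is a judgment call that changes the percentages, so it is worth recording the rule that was used.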
I often find that the confidence interval around each percentage shows where there is a real difference rather than just statistical noise. For example, if 10 out of 40 users assign the label “Company” and 8 out of 40 come up with “About Us,” you have two proportions of 25% and 20% respectively. While 25% is certainly higher than 20%, the large overlap in the confidence intervals tells us that most of this difference can be explained by sampling error. If there are good reasons to use “About Us” instead of “Company,” the data suggest that either label would generate similar success rates.
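The overlap in this example can be checked numerically. As before, this is a sketch that assumes the adjusted-Wald interval, which the article does not name; `adjusted_wald_ci` and `intervals_overlap` are illustrative helper names.

```python
import math


def adjusted_wald_ci(successes, n, z=1.96):
    """Approximate 95% CI for a proportion (assumed adjusted-Wald method)."""
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


def intervals_overlap(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


company = adjusted_wald_ci(10, 40)   # 25% suggested "Company"
about_us = adjusted_wald_ci(8, 40)   # 20% suggested "About Us"

print(f"Company:  {company[0]:.0%} to {company[1]:.0%}")
print(f"About Us: {about_us[0]:.0%} to {about_us[1]:.0%}")
print("Intervals overlap:", intervals_overlap(company, about_us))
```

Under this assumption the two intervals share most of their range, which is the pattern the paragraph describes. Overlap is a quick visual check; a formal comparison of the two proportions would need a dedicated test.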