Design & UX December 18, 2025 13 min read

Accessibility Testing Beyond Automation: The Manual Testing Playbook

Automated tools catch roughly 30-40% of accessibility issues. The rest require human judgment — screen readers, keyboard flows, cognitive load assessment. Here is the manual testing protocol we use on every component.

The Automation Ceiling

Every design system team we work with has automated accessibility testing in their CI pipeline. They run axe-core on every pull request. Lighthouse scores are tracked in dashboards. Pa11y crawls staging environments nightly. And they still ship components that are unusable for people with disabilities.

This is not a tooling failure. It is a category limitation. Automated accessibility testing tools are excellent at detecting structural violations — missing alt text, insufficient color contrast, absent form labels, invalid ARIA attributes. These are pattern-matching problems, and computers are very good at pattern matching. The issue is that a large portion of real-world accessibility barriers are not structural violations. They are interaction failures, comprehension failures, and context failures.

Consider a modal dialog that traps focus correctly (automated tools will confirm the focus trap exists) but returns focus to the page body instead of the triggering element when dismissed. Or a live region that announces updates (automated tools will confirm the aria-live attribute is present) but fires so frequently that it drowns out everything else a screen reader user is trying to do. Or a form that provides error messages (automated tools will confirm they exist) but places them in a location that makes no spatial or logical sense.

The 30/70 reality: Research from the UK Government Digital Service found that automated tools detect roughly 30-40% of WCAG 2.1 Level AA issues. The W3C's own analysis suggests a similar figure. The remaining 60-70% require human judgment — understanding context, evaluating interaction sequences, and assessing whether the experience actually makes sense for the person using it.

What Automated Tools Cannot Evaluate

Logical reading order

Tools can check DOM order but cannot determine whether that order makes semantic sense for the content.

Meaningful alternative text

Tools detect missing alt attributes but cannot evaluate whether "image.png" or "photo" is actually useful.

Keyboard interaction patterns

Tools verify focusability but cannot assess whether a custom widget follows expected keyboard conventions.

Screen reader experience quality

Tools check ARIA attributes exist but cannot judge whether the announced content is coherent or overwhelming.

Cognitive load and clarity

No tool can measure whether instructions are clear, error messages are helpful, or processes are understandable.

Automated testing is necessary but not sufficient. It is the foundation, not the structure. What follows is the manual testing protocol we layer on top of automation for every component we build or audit.

Screen Reader Testing Protocol

Screen reader testing is where most teams struggle, because it requires learning an entirely different interaction paradigm. Sighted users navigate visually — scanning layouts, recognizing patterns, clicking targets. Screen reader users navigate structurally — moving through headings, landmarks, form controls, and links. If your component does not make sense when consumed as a linear stream of announcements, it does not work.

We test with three screen readers because they behave differently and have different user bases. Skipping any one of them means skipping a significant portion of your disabled user population.

VoiceOver (macOS / iOS)

VoiceOver is the default screen reader on Apple platforms and the most common screen reader for mobile testing. It pairs with Safari on macOS and is the only option on iOS. Key behaviors to understand: VoiceOver uses a virtual cursor (the "VoiceOver cursor") that is independent of the keyboard focus. This means elements can be reachable by VoiceOver but not by keyboard, or vice versa — both are bugs.

Essential VoiceOver commands (VO = Control + Option):

Cmd + F5          → Toggle VoiceOver on/off
VO + Right Arrow  → Move to next element
VO + Left Arrow   → Move to previous element
VO + U            → Open rotor (headings, links, landmarks)
VO + Space        → Activate current element
VO + Shift + Down → Enter a group/web area
VO + Shift + Up   → Exit a group/web area
Ctrl              → Stop speaking

NVDA (Windows)

NVDA is free, open-source, and the second most popular screen reader on Windows after JAWS. It uses a "browse mode" for reading web content and a "focus mode" for interacting with form controls. Understanding when NVDA switches between these modes — and when it fails to — is critical for testing interactive components.

TalkBack (Android)

TalkBack is the screen reader for Android devices. It uses a gesture-based navigation model — swipe right to move forward, swipe left to move backward, double-tap to activate. Mobile screen reader testing is essential because touch interaction introduces entirely different accessibility challenges than desktop keyboard navigation.

Screen Reader Testing Checklist

  • Can you navigate to every interactive element using the screen reader's navigation commands?
  • Is every element announced with its role, name, and state (e.g., "Submit button", "Email, edit text, required")?
  • Do headings provide a logical document outline when navigated via the heading list?
  • Are landmarks (nav, main, aside, footer) present and labeled when there are multiples?
  • Do dynamic content changes (toasts, inline validation, loading states) get announced?
  • Is the announcement concise? No duplicate or extraneous information?
  • Can you complete the entire user flow without sighted assistance?
  • Do images have alt text that conveys meaning, not just a file name?
  • Are decorative images hidden from the accessibility tree (aria-hidden or empty alt)?
  • Do grouped controls (radio buttons, tab panels) announce their group context?

Key Flows to Test

Do not test screen reader compatibility on isolated components. Test complete user flows. A button that announces correctly in isolation may produce a confusing experience when embedded in a form wizard with inline validation, error summaries, and progress indicators. We test these flows on every project:

Navigation flow

Land on the page, orient using landmarks and headings, find the primary action. Can the user build a mental model of the page structure?

Form completion flow

Fill out every field, trigger validation errors, correct them, and submit. Are errors linked to their fields? Is the error summary navigable?

Dynamic content flow

Trigger a toast notification, open a modal, load content asynchronously. Does the screen reader announce changes without losing the user's place?

Data table flow

Navigate a table with headers, sort columns, paginate results. Does the screen reader announce column and row headers in context?

Keyboard Navigation Audit

Keyboard accessibility is the most fundamental layer of interactive accessibility. If a component cannot be operated with a keyboard alone, it is broken for screen reader users, switch device users, voice control users, and power users who prefer keyboard workflows. Every interactive element must be reachable, operable, and have a visible focus indicator.

Tab Order Validation

Put your mouse away. Starting from the browser address bar, press Tab repeatedly through the entire page. The focus should move through interactive elements in a logical, predictable order that matches the visual layout. Common failures we find:

  • Focus jumps to a footer link before reaching the main content
  • Visually hidden elements receive focus (off-screen menus, collapsed accordions)
  • Custom components built with div or span are not focusable at all
  • Positive tabindex values create a chaotic navigation order
  • Focus order does not match visual order after content reflows at different viewport sizes

Rule of thumb: If you ever use tabindex with a value greater than 0, you are almost certainly doing it wrong. Use tabindex="0" to add elements to the natural tab order and tabindex="-1" to make elements programmatically focusable without adding them to the tab order. Never use tabindex="1" or higher.

Focus Trap Testing

Modal dialogs, dropdown menus, and other overlay components must trap focus — keeping Tab and Shift+Tab cycling within the component until it is dismissed. But focus traps are tricky to implement correctly. Here is what to verify:

01

Focus moves to the first focusable element when the modal opens

Not to the modal container itself, not to the close button necessarily — to the first logical focusable element.

02

Tab from the last focusable element wraps to the first

The user should never be able to Tab out of the modal into the background content.

03

Shift+Tab from the first element wraps to the last

Backward navigation must also be trapped.

04

Escape closes the modal

This is expected behavior for dialogs per the ARIA Authoring Practices Guide.

05

Background content is inert

Screen reader users should not be able to navigate to content behind the modal using screen reader commands (not just Tab).
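The wrap behavior in steps 02 and 03 reduces to a small piece of logic. A sketch as a pure function (the name nextTrapTarget is illustrative, not from any library); a keydown handler would call it on Tab, call preventDefault when it returns an element, and focus that element:

```javascript
// Decide where Tab should land inside a trapped container.
// Returns the element to focus explicitly, or null to let the
// browser follow the normal tab order inside the trap.
function nextTrapTarget(active, focusables, shiftKey) {
  if (focusables.length === 0) return null;
  const first = focusables[0];
  const last = focusables[focusables.length - 1];
  if (shiftKey && active === first) return last;  // Shift+Tab from first wraps to last
  if (!shiftKey && active === last) return first; // Tab from last wraps to first
  return null;
}
```

Keeping the decision separate from the DOM event handling makes the wrap logic trivially unit-testable, which matters because off-by-one bugs here are exactly what manual testing keeps finding.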

Skip Links

Skip navigation links allow keyboard users to bypass repeated content blocks (headers, navigation menus) and jump directly to main content. They should be the first focusable element on the page, become visible on focus, and target the main content area. Test that they actually work — many implementations have broken anchor targets or fail to move focus after activation.
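The "fail to move focus" failure is worth showing. Anchor links scroll the page but do not reliably move keyboard focus, so the next Tab press can land back in the navigation. A minimal sketch of an explicit fix (the .skip-link and #main selectors are illustrative markup, not from this article):

```javascript
// Move focus into the skip link's target, making it focusable if needed.
function handleSkipLink(target) {
  if (!target.hasAttribute('tabindex')) {
    // tabindex="-1" makes the container programmatically focusable
    // without adding it to the Tab order
    target.setAttribute('tabindex', '-1');
  }
  target.focus();
}

// Browser wiring (illustrative):
// document.querySelector('.skip-link').addEventListener('click', () => {
//   handleSkipLink(document.querySelector('#main'));
// });
```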

Roving Tabindex Patterns

Composite widgets like tab lists, toolbars, and menu bars should use the roving tabindex pattern. The group has a single tab stop. Arrow keys move focus between items within the group. This matches the interaction pattern users expect from native OS controls and prevents a toolbar with 20 buttons from requiring 20 Tab presses to pass through.

Expected keyboard behavior for common patterns:

Tab list:     Tab into → Arrow Left/Right → Tab out
Menu bar:     Tab into → Arrow Left/Right between menus
              Arrow Down to open → Arrow Up/Down within
Toolbar:      Tab into → Arrow Left/Right → Tab out
Radio group:  Tab into → Arrow Up/Down → Tab out
Tree view:    Tab into → Arrow Up/Down to move
              Arrow Right to expand → Arrow Left to collapse
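The roving part of the pattern reduces to an index calculation. A minimal sketch, assuming wrap-at-the-ends behavior (the ARIA Authoring Practices Guide convention for tab lists and toolbars; some widgets stop at the ends instead):

```javascript
// Given the currently focused item's index, compute the index the
// roving tab stop should move to for a given key press.
function nextRovingIndex(current, count, key) {
  if (key === 'ArrowRight' || key === 'ArrowDown') return (current + 1) % count;
  if (key === 'ArrowLeft' || key === 'ArrowUp') return (current - 1 + count) % count;
  if (key === 'Home') return 0;
  if (key === 'End') return count - 1;
  return current; // any other key: no movement
}
```

After computing the new index, set tabindex="0" on the new item, tabindex="-1" on the previous one, and call focus() on the new item — that is what makes the group a single tab stop.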

Focus Management in SPAs

Single-page applications break the browser's built-in accessibility model. In a traditional multi-page website, a link click triggers a full page load. The browser announces the new page title, resets focus to the top of the document, and the screen reader user knows they have arrived somewhere new. SPAs do none of this by default. The URL changes, the content swaps, but the browser treats it as the same page — because technically it is.

This is the single most common accessibility failure we encounter in modern web applications, and it affects every SPA framework: React, Vue, Angular, Svelte, and their meta-frameworks.

Route Change Announcements

When the route changes in a SPA, you must explicitly inform assistive technology that navigation has occurred. There are two primary approaches:

Approach 1: Document title update + focus management

Update document.title on every route change and move focus to the new page's h1 or main content area. This is the most reliable approach across screen readers.

Approach 2: Live region announcement

Use an aria-live="polite" region to announce "Navigated to [page name]". Works well as a supplement but should not replace focus management.
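A minimal announcer for Approach 2 might look like the sketch below. It assumes a persistent, visually hidden container with aria-live="polite" already in the DOM; the clear-then-set pattern and the 50ms delay are common workarounds (not from any spec) that help some screen readers re-announce repeated messages:

```javascript
// Announce a message via an existing aria-live region element.
function announce(region, message) {
  region.textContent = ''; // clear first so identical messages re-announce
  setTimeout(() => {
    region.textContent = message; // the mutation triggers the announcement
  }, 50);
}
```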

Route change focus management (framework-agnostic):

function handleRouteChange(pageTitle) {
  // Update document title
  document.title = `${pageTitle} | My App`;

  // Find the main heading or content area
  const target = document.querySelector('h1')
    || document.querySelector('main');

  if (target) {
    // Make it focusable if it isn't already
    if (!target.hasAttribute('tabindex')) {
      target.setAttribute('tabindex', '-1');
    }
    // Move focus
    target.focus();
  }
}

Focus Restoration After Modal Close

When a modal, dialog, or popover closes, focus must return to the element that triggered it. This sounds simple, but edge cases abound. What if the trigger was inside a list item that has since been removed? What if the trigger was a delete button and the item it belonged to no longer exists? You must have a fallback strategy:

  • Store a reference to the trigger element before opening the modal
  • On close, check if the trigger still exists in the DOM
  • If it does, focus it
  • If it does not, focus the nearest logical parent or the next element in the list
  • As a last resort, focus the main content area
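That fallback chain can be sketched as one function that walks a list of candidates, collected in the order above, and focuses the first one still attached to the document (restoreFocus is an illustrative name):

```javascript
// Focus the first candidate element still present in the DOM.
// Candidates: stored trigger, nearest logical parent, next list item,
// main content area — in that order.
function restoreFocus(candidates) {
  for (const el of candidates) {
    if (el && el.isConnected) { // skip removed or null references
      el.focus();
      return el;
    }
  }
  return null; // nothing focusable survived
}
```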

Managing Focus in Dynamic Content

Toast notifications, inline alerts, loading indicators, and live data updates all present focus management challenges. The key principle is: notify without disrupting. Moving focus to a toast notification is almost always wrong — it rips the user away from what they were doing. Instead, use aria-live regions with appropriate politeness levels:

Content Type                        aria-live Value              Move Focus?
Toast / snackbar                    polite                       No
Inline form error                   assertive                    After submit only
Alert dialog (destructive action)   assertive                    Yes — to the dialog
Loading spinner                     polite                       No (use aria-busy)
Chat message received               polite                       No
Countdown / timer                   off (update periodically)    No

Common mistake: Using aria-live="assertive" for non-critical notifications. Assertive announcements interrupt whatever the screen reader is currently saying. Reserve this for genuine errors and urgent alerts. Overusing it trains users to ignore all announcements from your application.
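One way to keep these decisions consistent across a codebase is to encode the table above as a single policy object that every notification component reads from, so no one reaches for assertive ad hoc (the type names are hypothetical identifiers):

```javascript
// Politeness and focus policy per content type, mirroring the table above.
const LIVE_REGION_POLICY = {
  toast:       { ariaLive: 'polite',    moveFocus: false },
  inlineError: { ariaLive: 'assertive', moveFocus: 'after-submit' },
  alertDialog: { ariaLive: 'assertive', moveFocus: true },
  loading:     { ariaLive: 'polite',    moveFocus: false }, // plus aria-busy
  chatMessage: { ariaLive: 'polite',    moveFocus: false },
  countdown:   { ariaLive: 'off',       moveFocus: false }, // update periodically
};
```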

Cognitive Accessibility

Cognitive accessibility is the least tested and least understood dimension of accessibility work. WCAG addresses it through success criteria around reading level, consistent navigation, error identification, and input purpose — but cognitive accessibility extends far beyond what any specification can capture. It is about whether real people with varying cognitive abilities can actually understand and use your interface.

Reading Level Assessment

WCAG 2.1 Success Criterion 3.1.5 recommends that content be understandable at a lower secondary education reading level (roughly age 12-14). For UI text — labels, instructions, error messages, help text — we aim even simpler. Test your component text by asking: could a stressed, distracted, non-native English speaker understand this on the first reading?

Instead of: "Authentication credentials are invalid"

Write: "The email or password you entered is incorrect"

Instead of: "A transient network error has occurred"

Write: "We could not connect to the server. Please check your internet connection and try again."

Instead of: "This field is required"

Write: "Please enter your email address"

Information Density

Every additional element on screen competes for cognitive resources. Users with ADHD, anxiety disorders, or traumatic brain injuries are particularly affected by information-dense interfaces. Evaluate each component for cognitive load:

  • How many distinct pieces of information does the user need to process simultaneously?
  • Can any information be deferred to a secondary view or progressive disclosure pattern?
  • Are related items visually grouped to reduce parsing effort?
  • Is there sufficient whitespace to prevent elements from blending together?
  • Can the user predict what will happen before they take action?

Motion Sensitivity

Animations and transitions that seem subtle to most users can cause nausea, dizziness, or seizures in users with vestibular disorders or photosensitive epilepsy. This is not an edge case — approximately 35% of adults over 40 have experienced vestibular dysfunction.

Respect user motion preferences:

/* Default: animations enabled */
.component {
  transition: transform 300ms ease;
}

/* Reduced motion: remove or minimize animations */
@media (prefers-reduced-motion: reduce) {
  .component {
    transition: none;
  }
}

/* Test checklist for motion: */
/* 1. Does prefers-reduced-motion disable all non-essential animation? */
/* 2. Are parallax effects removed? */
/* 3. Are auto-playing videos/carousels paused? */
/* 4. Do page transitions use crossfade instead of slide? */
/* 5. Are loading spinners replaced with static indicators? */
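CSS handles declarative transitions, but JavaScript-driven animation (for example, the Web Animations API) needs the same check. A testable sketch that takes a matchMedia-style object rather than touching window directly:

```javascript
// Returns true when non-essential animation should run.
// mediaQueryList.matches === true means the user asked for reduced motion.
function shouldAnimate(mediaQueryList) {
  return !mediaQueryList.matches;
}

// Browser usage (illustrative):
// const reduce = window.matchMedia('(prefers-reduced-motion: reduce)');
// if (shouldAnimate(reduce)) element.animate(keyframes, options);
// Also listen for 'change' on the query — users can toggle it mid-session.
```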

Error Message Clarity

Error messages must answer three questions: What happened? Why did it happen? What should the user do now? We review every error message in the design system against these criteria. Vague messages like "An error occurred" or "Invalid input" fail all three.

Progressive Disclosure

Show users only what they need at each step. Collapse advanced options. Break long forms into multiple steps with clear progress indicators. Let users expand sections on demand rather than presenting everything at once. This reduces cognitive load for everyone and is essential for users with cognitive disabilities.

The Component Testing Matrix

Not every component needs the same testing depth. A static heading does not need keyboard testing. A complex date picker needs everything. This matrix maps component types to testing requirements, distinguishing what automation handles from what requires human evaluation.

Button
  Automated: role, accessible name, contrast
  Manual: focus visible; click, Enter, and Space all work; loading state announced; disabled state clear

Modal / Dialog
  Automated: role=dialog, aria-labelledby, aria-modal
  Manual: focus trap works; Escape closes; focus returns to trigger; background inert; SR announces dialog title

Form Input
  Automated: label association, required attribute, error linked via aria-describedby
  Manual: error message clarity; validation timing; SR announces label, hint, and error together; autocomplete works

Toast / Snackbar
  Automated: aria-live present, role=status or role=alert
  Manual: announced without stealing focus; auto-dismiss timing adequate; action button reachable; not overwhelming in frequency

Dropdown Menu
  Automated: role=menu, aria-expanded, menuitem roles
  Manual: arrow key navigation; Escape closes; type-ahead works; focus management on open and close

Tabs
  Automated: role=tablist/tab/tabpanel, aria-selected, aria-controls
  Manual: arrow keys switch tabs; Tab moves to panel content; active tab visually and programmatically indicated; panel content announced

Carousel
  Automated: aria-roledescription, aria-label on slides
  Manual: pause button works; auto-advance respects reduced-motion; keyboard navigable; slide count announced; focus never lost between slides

Prioritization guidance: If your team has limited testing capacity, prioritize manual testing for modals, forms, and any component with dynamic state changes. These are where automation misses the most and where user impact is highest.

Testing with Real Users

WCAG compliance is necessary but not sufficient for actual accessibility. You can build a component that passes every WCAG success criterion and still creates a frustrating, unusable experience for disabled users. Specifications define minimum technical requirements; usability testing reveals whether those requirements translate into a workable experience.

We have seen date pickers that are technically keyboard-accessible but so cumbersome to operate that screen reader users abandon the task. We have seen forms that meet every ARIA specification but present information in an order that makes no sense when consumed linearly. Compliance testing tells you whether you followed the rules. User testing tells you whether the rules were enough.

Recruiting Participants

Recruiting disabled participants requires intentionality and respect. Some practical approaches:

  • Partner with disability organizations (NFB, ACB, RNIB) who maintain tester networks
  • Use accessibility-focused research platforms like Fable or Access Works
  • Include disability in your general user research recruitment criteria rather than treating it as a separate study
  • Compensate participants fairly — rates should reflect the specialized expertise they bring
  • Allow participants to use their own devices and assistive technology, not lab equipment they are unfamiliar with

Session Structure

Accessibility user testing sessions require some adjustments from standard usability testing:

Allow more time

Sessions should be 75-90 minutes instead of the standard 60. Users may need time to configure their assistive technology or take breaks.

Share materials in advance

Send task descriptions and any relevant context in accessible formats before the session. This lets participants prepare and reduces anxiety.

Record screen reader audio

Capture what the screen reader announces alongside screen recording. This is essential for analysis — you cannot debug screen reader issues from visual recordings alone.

Focus on tasks, not features

Ask participants to "book an appointment" not "use the date picker." You are testing outcomes, not components.

Common Findings That Specs Miss

Across dozens of accessibility user testing sessions, these are the issues that consistently surface that no automated tool or manual WCAG audit would catch:

Screen reader verbosity

Components that announce too much information at once, overwhelming the user. A card component that announces "link, heading level 3, article title, image decorative, published January 15, read more link" all in one stream.

Mental model mismatches

A tab panel that visually suggests tabbed content but is coded as an accordion — or vice versa. The screen reader announces one pattern while the visual design suggests another.

Workaround strategies

Users develop workarounds for poorly accessible components — refreshing the page to reset focus, using the browser's find function instead of the site's search, copying content to a text editor to read it. These signal design failures.

Building a Testing Culture

Manual accessibility testing is only sustainable when it is embedded in your development process, not bolted on as a final gate. Teams that treat accessibility as a pre-release checklist consistently fail to maintain it. The findings come too late, the fixes are too expensive, and the testing itself becomes associated with delay and friction.

Integrating into Sprint Workflows

Accessibility testing should happen at three points in the development cycle:

01

Design review

Before implementation begins, review designs for keyboard interaction patterns, focus order, ARIA landmark structure, and content hierarchy. Catching issues here costs minutes, not days.

02

PR review

Every pull request that touches UI should include a keyboard walkthrough and a quick screen reader check. Build this into code review expectations alongside test coverage and code style.

03

Sprint demo

During sprint demos, occasionally demonstrate features using only a keyboard or screen reader. This normalizes accessibility testing and helps non-engineering stakeholders understand what accessibility means in practice.

Accessibility Champions Program

Centralized accessibility experts create bottlenecks. Instead, train accessibility champions within each team. These are developers, designers, or QA engineers who receive deeper accessibility training and serve as the first line of defense. They are not solely responsible for accessibility — that responsibility belongs to the entire team — but they have the expertise to catch issues early and mentor others.

  • One champion per squad or feature team
  • Monthly knowledge-sharing sessions across champions
  • Champions participate in design reviews and provide accessibility guidance
  • Champions have allocated time (10-15% of sprint) for accessibility work
  • Champions maintain the team's accessibility testing documentation and checklists

Definition of Done

If accessibility is not in your definition of done, it will be treated as optional. Add explicit criteria:

Example Definition of Done (accessibility criteria):

[ ] Automated tests pass (axe-core, no new violations)
[ ] Keyboard walkthrough completed (all interactions work)
[ ] Screen reader tested with VoiceOver or NVDA
[ ] Focus management verified (modals, route changes)
[ ] Color contrast verified for all states (hover, focus, active, disabled)
[ ] Content reviewed for reading level and clarity
[ ] Zoom tested at 200% (no horizontal scroll, no overlap)
[ ] Touch targets meet 44x44px minimum

Quarterly Audit Cadence

Even with strong sprint-level testing, conduct a quarterly comprehensive audit. This catches issues that emerge from component interactions, tests complete user flows rather than individual components, and provides a baseline for tracking accessibility progress over time. Each quarterly audit should include both automated scanning and manual testing of the top 10 user flows, with findings tracked in the same backlog as feature work — not in a separate accessibility backlog that nobody prioritizes.

Our Per-Component Checklist

This is the exact checklist we run against every component in a design system. It is not exhaustive for every possible context, but it covers the critical dimensions that we have found catch the vast majority of accessibility issues. We evaluate each item as Pass, Fail, or Not Applicable.

01

Semantic HTML

Is the component built with the most appropriate HTML element? Buttons are <button>, not <div onClick>. Links are <a> with href. Lists use <ul>/<ol>. Tables use <table> with <th> headers. Headings follow a logical hierarchy without skipping levels.

02

ARIA Usage

Are ARIA attributes used correctly and only when necessary? First rule of ARIA: do not use ARIA if a native HTML element provides the semantics you need. When ARIA is required, verify roles, states, and properties match the ARIA Authoring Practices Guide pattern. No ARIA is better than bad ARIA.

03

Keyboard Operability

Can every function of the component be operated using only a keyboard? Tab/Shift+Tab for navigation, Enter/Space for activation, Arrow keys for composite widgets, Escape for dismissal. No keyboard traps — the user can always Tab away.

04

Screen Reader Compatibility

Does the component announce its role, name, and state correctly? Test with at least VoiceOver+Safari and NVDA+Chrome. Verify the component is usable — not just technically announced but practically navigable in the context of a real workflow.

05

High Contrast Mode

Does the component remain usable in Windows High Contrast Mode (now called Contrast Themes)? Borders, focus indicators, and icons that rely on color may disappear. Test with the "forced-colors: active" media query. Ensure critical visual indicators use borders or outlines, not just background colors.

06

Zoom at 200%

Does the component function correctly when the browser is zoomed to 200%? No content should be cut off, overlap, or require horizontal scrolling on a 1280px viewport. Text must reflow. Fixed-size containers that clip content are a common failure point.

07

Motion and Animation

Does the component respect prefers-reduced-motion? All non-essential animations should be removed or reduced when this preference is active. Auto-playing content (carousels, videos, animated illustrations) must be pausable. No content flashes more than three times per second.

08

Touch Target Size

Do all interactive elements meet a minimum touch target size of 44x44 CSS pixels (WCAG 2.1 SC 2.5.5, Level AAA) or 48x48dp (Material Design recommendation)? This applies to buttons, links, checkboxes, radio buttons, and any other tappable element. Inline links in body text are exempt but should still have generous padding.

09

Error Handling

Are error states clearly communicated through text (not just color), programmatically associated with the relevant input via aria-describedby, announced by screen readers, and actionable? Does the error message explain what went wrong and how to fix it? Is there an error summary for forms with multiple errors?

How we use this checklist: Every component in the design system has a completed checklist stored alongside its documentation. When a component is updated, the relevant checklist items are re-evaluated. This creates a living record of accessibility status and makes it immediately clear when a change introduces a regression.

Accessibility testing is a skill that improves with practice. The first time you test with a screen reader, it will feel foreign and slow. Within a few weeks of regular practice, you will develop fluency — and you will start noticing accessibility issues instinctively, before you even begin formal testing. That instinct, built through repeated manual testing, is something no automated tool can replicate.


Need an accessibility audit for your design system?

We run the full manual testing protocol — screen readers, keyboard flows, cognitive assessment, and user testing — against every component in your system. Let's find the issues automation misses.
