What You’ll Learn

  • How to test WebAuthn (passkey) authentication in CI environments
  • Automating OTP email retrieval with Mailpit API
  • Preventing email race conditions in parallel E2E tests
  • Locale-specific testing for multilingual UIs

Introduction

In Part 1, I introduced the overall architecture and automation strategy for “Saru,” a multi-tenant SaaS platform. This article dives deeper into the E2E testing implementation that forms the core of that automation.

The most challenging aspect is testing authentication flows. Saru uses two authentication methods:

| Portal | Auth Method | Challenge |
|---|---|---|
| System / Provider | OTP + Passkey | Email retrieval, WebAuthn |
| Reseller / Consumer | Keycloak OAuth | External IdP integration |

This article explains how to automate testing all of these in CI.

1. WebAuthn Virtual Authenticator: Testing Passkeys in CI

The Challenge with Passkey Authentication

WebAuthn (passkeys) typically requires physical security keys or biometric authentication. At first glance, testing this in CI seems impossible.

Solution: Chrome DevTools Protocol (CDP) Virtual Authenticator

Playwright allows you to create virtual authenticators through CDP. This enables testing the full WebAuthn flow without physical devices.

Note: CDP virtual authenticators are Chromium-only. They don’t work with Safari (WebKit) or Firefox. For cross-browser testing, run WebAuthn tests only on Chromium and mock authenticated state for other browsers.
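One way to apply this note is to gate passkey tests on the browser name. The sketch below is a minimal illustration, not Saru's actual code: `supportsVirtualAuthenticator` and `BrowserName` are hypothetical names introduced here, and the commented usage assumes `@playwright/test` is installed.

```typescript
// Hypothetical helper (not part of Playwright or Saru): reports whether a
// Playwright browser can use the CDP WebAuthn virtual authenticator.
type BrowserName = 'chromium' | 'firefox' | 'webkit';

function supportsVirtualAuthenticator(browserName: BrowserName): boolean {
  // CDP sessions, and therefore the WebAuthn domain, are Chromium-only.
  return browserName === 'chromium';
}

// Possible usage at the top of a passkey spec file:
//
//   test.skip(
//     ({ browserName }) => !supportsVirtualAuthenticator(browserName),
//     'CDP virtual authenticators are Chromium-only',
//   );
```

Keeping the decision in one function means the skip reason lives in a single place if browser support changes later.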

Implementation Code

import { test, expect } from '@playwright/test';

test('should complete signup with Passkey registration', async ({ page, context }) => {
  // Enable virtual authenticator
  const cdpSession = await context.newCDPSession(page);
  await cdpSession.send('WebAuthn.enable');

  // Add virtual authenticator
  await cdpSession.send('WebAuthn.addVirtualAuthenticator', {
    options: {
      protocol: 'ctap2',           // CTAP2 protocol
      transport: 'usb',            // Emulate USB connection
      hasResidentKey: true,        // Passkey capable
      hasUserVerification: true,   // Emulate biometric auth
      isUserVerified: true,        // Always succeed verification
      automaticPresenceSimulation: true, // Auto-respond
    },
  });

  // ... Execute signup flow ...

  // Click Passkey registration button
  await page.getByRole('button', { name: 'Passkey' }).click();

  // Virtual authenticator responds automatically
  await expect(page.getByText('Passkey registered')).toBeVisible();

  // Cleanup
  await cdpSession.send('WebAuthn.disable');
});
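In the test above, the cleanup call is skipped if an assertion fails before it runs. A possible refinement is to wrap setup and teardown in a helper that always cleans up. This is a sketch under assumptions: `withVirtualAuthenticator` and `CdpLike` are hypothetical names, and `CdpLike` is a local stand-in intended to be structurally compatible with Playwright's `CDPSession` (I have not verified this against Playwright's type definitions, so a cast may be needed).

```typescript
// Local stand-in for the part of a CDP session used here (assumption:
// Playwright's CDPSession can be passed where this type is expected).
interface CdpLike {
  send(method: string, params?: Record<string, unknown>): Promise<any>;
}

// Hypothetical helper: registers a virtual authenticator, runs `body`,
// and always removes the authenticator afterwards, even if `body` throws.
async function withVirtualAuthenticator<T>(
  session: CdpLike,
  body: (authenticatorId: string) => Promise<T>,
): Promise<T> {
  await session.send('WebAuthn.enable');
  const { authenticatorId } = await session.send('WebAuthn.addVirtualAuthenticator', {
    options: {
      protocol: 'ctap2',
      transport: 'usb',
      hasResidentKey: true,
      hasUserVerification: true,
      isUserVerified: true,
      automaticPresenceSimulation: true,
    },
  });
  try {
    return await body(authenticatorId);
  } finally {
    // Runs on success and on failure, so no authenticator is left behind.
    await session.send('WebAuthn.removeVirtualAuthenticator', { authenticatorId });
    await session.send('WebAuthn.disable');
  }
}
```

In a test, this would be called with the session from `context.newCDPSession(page)`, with the signup flow and assertions placed inside `body`.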

Alignment Between Transport Settings and Server Configuration

When setting up the WebAuthn virtual authenticator, alignment with server-side settings is crucial.

In Saru’s case, the backend generates WebAuthn registration options with AuthenticatorAttachment: CrossPlatform. This setting “prefers roaming authenticators (USB keys, etc.).”

Initially, I used transport: 'internal' (platform authenticator), which caused registration to fail.

// When the server prefers CrossPlatform, alignment matters
// transport: 'internal',  // Platform authenticator → may fail
transport: 'usb',          // Roaming authenticator → aligns with server

Key Point: The virtual authenticator’s transport setting needs to align with the server’s AuthenticatorAttachment setting. If registration fails, check the server configuration first. Although the WebAuthn spec doesn’t require an exact 1:1 correspondence, misalignment is a common cause of failures.
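The alignment rule can be written down as a small mapping so the test configuration follows the server setting instead of being hard-coded. This is a minimal sketch: `transportForAttachment`, `Attachment`, and `VirtualTransport` are hypothetical names, and the choice of `'usb'` for roaming authenticators mirrors what worked for Saru rather than a requirement of the spec. The string values are the standard WebAuthn attachment values (`CrossPlatform` on the server corresponds to `'cross-platform'` on the wire).

```typescript
// Standard WebAuthn authenticatorAttachment values.
type Attachment = 'platform' | 'cross-platform';
// Transports accepted by CDP's WebAuthn.addVirtualAuthenticator.
type VirtualTransport = 'usb' | 'nfc' | 'ble' | 'cable' | 'internal';

// Hypothetical helper: picks a virtual authenticator transport that
// aligns with the attachment preference sent by the server.
function transportForAttachment(attachment: Attachment | undefined): VirtualTransport {
  // 'platform' means a built-in authenticator, which CDP models as 'internal'.
  if (attachment === 'platform') return 'internal';
  // 'cross-platform' (or no preference) is served here by an emulated USB key.
  return 'usb';
}
```

With this in place, the `transport` field in the virtual authenticator options could be set from `transportForAttachment('cross-platform')`.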

2. Automating OTP Email Retrieval: Mailpit API Integration

Problems with Traditional Approaches

Many E2E tests retrieve the OTP from a test-only endpoint:

// Get OTP via test mode (not recommended)
const response = await request.get(`/signup/${sessionId}/test/otp`);
const { otp } = await response.json();

Problems:

  • Adds TEST_MODE branches to production code
  • Doesn’t test actual email sending
  • Diverges from real user flows

Solution: Mailpit API

Saru uses the API of Mailpit, a development mail server, to extract the OTP from emails that were actually sent.

const MAILPIT_API_URL = 'http://localhost:8025/api/v1';

export async function waitForOtpEmail(
  email: string,
  type: 'login' | 'signup',
  maxAttempts = 30,
  sentAfter?: string
): Promise<string | null> {
  const subjectPatterns = {
    login: ['ログインコード', 'Login Code'],
    signup: ['認証コード', 'Verification Code'],
  };

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await new Promise(resolve => setTimeout(resolve, 1000));

    const response = await fetch(`${MAILPIT_API_URL}/messages`);
    // Mailpit returned an error status; retry on the next attempt
    if (!response.ok) continue;
    const data = await response.json();

    // Search for email
    const otpEmail = data.messages.find(msg => {
      // Check recipient
      if (!msg.To.some(to => to.Address === email)) return false;
      // Check subject pattern
      if (!subjectPatterns[type].some(p => msg.Subject.includes(p))) return false;
      // Timestamp filter (explained below)
      if (sentAfter && new Date(msg.Created) < new Date(sentAfter)) return false;
      return true;
    });

    if (otpEmail) {
      // Extract 6-digit OTP
      const match = otpEmail.Snippet.match(/(\d{6})/);
      if (match) return match[1];
    }
  }
  return null;
}

Key Points:

  • Tests actual email sending flow
  • Supports both Japanese/English subject patterns
  • Polls for up to 30 seconds (handles SMTP cold start delays)
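The extraction step in `waitForOtpEmail` matches the first six consecutive digits, which would also match the start of a longer number such as an order ID. A slightly stricter variant is sketched below; `extractOtp` is a hypothetical helper name, and it assumes the OTP is always exactly six ASCII digits surrounded by non-word characters.

```typescript
// Hypothetical helper: returns the first standalone 6-digit code in `text`,
// or null if there is none. The \b anchors reject longer digit runs.
function extractOtp(text: string): string | null {
  const match = text.match(/\b\d{6}\b/);
  return match ? match[0] : null;
}
```

It could replace the inline regex by calling `extractOtp(otpEmail.Snippet)`. Japanese text next to the code is not a problem, because `\b` treats non-ASCII characters as non-word characters.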

3. Preventing Email Race Conditions in Parallel Tests

The Problem: OTP Mix-ups in Parallel Execution

When running multiple tests in parallel in CI, tests may accidentally retrieve another test’s OTP.

For example:

  1. Test A: Sends OTP to user-a@example.com
  2. Test B: Sends OTP to user-b@example.com
  3. Test A: Searches Mailpit → Gets Test B’s OTP

Solution: Timestamp + Unique Address Filtering

Saru combines two methods to prevent race conditions:

  1. Unique email addresses: Each test uses a different email address
  2. Timestamp filtering: Record time before OTP request, search only emails after that time

// Record timestamp before login
const sentAfter = new Date().toISOString();

// Submit email address (unique per test)
await page.fill('input[type="email"]', email);
await page.getByRole('button', { name: 'Login' }).click();

// Get OTP with timestamp filtering
const otp = await waitForOtpEmail(email, 'login', 30, sentAfter);

// Filtering inside waitForOtpEmail
const otpEmail = data.messages.find(msg => {
  // Recipient check (narrow down by unique address)
  if (!msg.To.some(to => to.Address === email)) return false;

  // Timestamp filter (exclude old emails)
  if (sentAfter) {
    const emailTime = new Date(msg.Created).getTime();
    const filterTime = new Date(sentAfter).getTime();
    if (emailTime < filterTime) return false;
  }
  return true;
});
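For the unique-address half of this approach, the address only needs to differ between tests and between parallel workers. The sketch below is one way to generate such addresses; `uniqueTestEmail` is a hypothetical helper, and the `example.com` default domain is an assumption that works because Mailpit accepts mail for any recipient.

```typescript
// Per-process counter so two calls in the same millisecond still differ.
let emailCounter = 0;

// Hypothetical helper: builds an address that is unique per call. The
// timestamp and random suffix keep parallel workers from colliding.
function uniqueTestEmail(prefix: string, domain = 'example.com'): string {
  emailCounter += 1;
  const random = Math.random().toString(36).slice(2, 10);
  return `${prefix}-${Date.now()}-${emailCounter}-${random}@${domain}`;
}
```

A test would call `uniqueTestEmail('login')` once, then use the result both in the form and in `waitForOtpEmail`.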

Alternative Approaches for Parallel Testing

More robust methods to consider:

| Method | Pros | Cons |
|---|---|---|
| Unique address + timestamp (Saru’s approach) | Simple, no backend changes | Vulnerable to clock skew |
| Embed X-Request-ID in email | Uniquely identifies email | Requires backend changes |
| Mailpit Search API | Direct filtering by conditions | Depends on API features |

Saru’s approach prioritizes “simple and works well enough.”
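If clock skew between the test runner and Mailpit does become a problem, one low-cost mitigation is to allow a small tolerance in the timestamp comparison. This is a sketch, not part of Saru: `isNewEnough` is a hypothetical helper and the 2-second default is an arbitrary assumption. It is only safe in combination with unique addresses, because the tolerance widens the window in which an older email could match.

```typescript
// Hypothetical helper: true when the email was created no earlier than
// `sentAfter` minus `toleranceMs`. Invalid dates compare as false.
function isNewEnough(created: string, sentAfter: string, toleranceMs = 2000): boolean {
  const createdMs = new Date(created).getTime();
  const thresholdMs = new Date(sentAfter).getTime() - toleranceMs;
  return createdMs >= thresholdMs;
}
```

Inside the filter, `if (sentAfter && !isNewEnough(msg.Created, sentAfter)) return false;` would take the place of the strict comparison.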

Deprecated: clearMailpit()

Previously, clearMailpit() deleted all emails before each test, but in parallel execution this also deletes other tests’ emails. Timestamp filtering made the function unnecessary, so it is now deprecated.

/**
 * @deprecated Use timestamp-based filtering instead.
 * This function causes race conditions in parallel tests.
 */
export async function clearMailpit(): Promise<void> {
  await fetch(`${MAILPIT_API_URL}/messages`, { method: 'DELETE' });
}

4. Appendix: Locale-Specific Testing

Not directly related to authentication testing, but a useful technique for E2E testing multilingual apps.

The Challenge with Multilingual E2E

Common approach:

// Regex to support multiple languages (old approach)
await expect(page.getByRole('button', {
  name: /(ログイン|Login|登录)/
})).toBeVisible();

Problem: Regex must be updated every time a language is added.

Locale-Specific Testing Pattern

In Saru, we fix the language at test time and directly verify that language’s text.

// e2e/utils/locale.ts
import type { BrowserContext } from '@playwright/test';

export async function setLocale(
  context: BrowserContext,
  locale: 'ja' | 'en'
): Promise<void> {
  // Note: domain: 'localhost' may behave differently across browsers
  // Consider using url option if issues arise
  await context.addCookies([{
    name: 'locale',
    value: locale,
    domain: 'localhost',
    path: '/',
  }]);
}

// Test file
const TEXT = {
  LOGIN: 'ログイン',
  PRODUCT_NAME: '商品名',
  CREATE: '作成',
} as const;

test.beforeEach(async ({ context }) => {
  await setLocale(context, 'ja');
});

test('should create a product', async ({ page }) => {
  await page.getByLabel(TEXT.PRODUCT_NAME).fill('テスト商品');
  await page.getByRole('button', { name: TEXT.CREATE }).click();
});

Benefits: Text is explicit and readable; impact scope is clear when adding languages.
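To make the "impact scope is clear" benefit concrete, the per-test `TEXT` constant can be generalized into a typed map keyed by locale, so that adding a language without all of its strings becomes a compile error. This is a sketch under assumptions: `TEXT_BY_LOCALE`, `TextKey`, and `textFor` are hypothetical names, and the English labels are guesses at what the UI shows.

```typescript
type Locale = 'ja' | 'en';
type TextKey = 'LOGIN' | 'PRODUCT_NAME' | 'CREATE';

// Record<Locale, ...> forces every locale to define every key, so adding
// a locale to the union fails type-checking until its texts are filled in.
const TEXT_BY_LOCALE: Record<Locale, Record<TextKey, string>> = {
  ja: { LOGIN: 'ログイン', PRODUCT_NAME: '商品名', CREATE: '作成' },
  // Assumed English labels; replace with the app's actual strings.
  en: { LOGIN: 'Login', PRODUCT_NAME: 'Product name', CREATE: 'Create' },
};

// Hypothetical helper: returns the UI texts for the locale under test.
function textFor(locale: Locale): Record<TextKey, string> {
  return TEXT_BY_LOCALE[locale];
}
```

A spec would then call `setLocale(context, 'ja')` and read its expectations from `textFor('ja')`, keeping the cookie and the expected strings in sync.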

5. CI Configuration: Parallel Execution on Self-hosted Runners

Matrix Strategy for Parallelization

The GitHub Actions workflow uses a matrix strategy to run each portal’s tests in parallel.

# .github/workflows/e2e-tests.yml
jobs:
  e2e:
    runs-on: [self-hosted, linux, x64]
    strategy:
      fail-fast: false
      matrix:
        portal:
          - name: system
            tests: "e2e/system-*.spec.ts e2e/system-portal/*.spec.ts"
            api_port: 8080
          - name: provider
            tests: "e2e/provider-portal/*.spec.ts"
            api_port: 8081
          # ... other portals

Separating Cross-Portal Tests

Tests spanning multiple portals (e.g., Provider→Reseller integration) run in a separate job.

Reasons:

  • Tests that log in as the same user compete with each other
  • OTP retrieval timing overlaps

e2e-cross-portal:
  needs: [db-setup, e2e]  # Run after other E2E tests complete
  runs-on: [self-hosted, linux, x64]
  steps:
    - name: Run cross-portal tests
      run: |
        pnpm exec playwright test \
          e2e/auth.spec.ts \
          e2e/dashboard.spec.ts \
          e2e/search-filters.spec.ts

6. Running Cross-Portal Tests Locally

Since it takes 15-20 minutes to reach the cross-portal tests in CI, we have scripts to verify them locally first.

# Run all cross-portal tests
./scripts/run-e2e-cross-portal.sh

# Smoke tests only
./scripts/run-e2e-cross-portal.sh smoke

# Run with visible browser
./scripts/run-e2e-cross-portal.sh --headed

# Playwright UI mode
./scripts/run-e2e-cross-portal.sh --ui

Summary

| Challenge | Solution | Constraints/Notes |
|---|---|---|
| Testing WebAuthn authentication | CDP virtual authenticator | Chromium only |
| OTP email retrieval | Mailpit API integration | Requires polling |
| Email race conditions in parallel tests | Unique address + timestamp | Watch for clock skew |
| Multilingual UI testing | Locale-specific testing | Cookie setup dependent |
| CI execution time | Matrix parallelization + cross-portal separation | Complex job design |

With these mechanisms, Saru’s main authentication flows are automated in CI. Production-specific issues (external IdP outages, browser update behavior changes, etc.) still require manual verification, but manual testing in the daily development cycle has been significantly reduced.


Series Articles