Thirteen frameworks. One verdict: ship or don't.

DevTools
Quality Telemetry Platform

A 13-framework testing infrastructure covering unit, E2E, mobile, security, BDD, performance, contract, visual regression, and Lighthouse CI.

  • Frameworks: 13
  • CI run time: < 8 min
  • Prod regressions missed: 0
  • Lighthouse score: 90+

Problem

The challenge

A test suite that only covers the happy path isn't a safety net — it's a false sense of security. Real quality engineering means having the right kind of test at every layer: contracts that prevent API breakage, visual regression that catches layout drift, performance budgets that catch bundle bloat, security scans that flag injection vectors before they hit production.

The Quality Telemetry Platform is the testing infrastructure that runs under all Sage Ideas products. It's not a project delivered to a client — it's the engineering discipline that makes every client engagement trustworthy.

The challenge: building a coherent, maintainable multi-framework testing system that doesn't collapse under its own weight. The risk with "13 frameworks" is that it becomes an unmaintained museum. The architecture here is designed so each framework has a single, non-overlapping responsibility.

Approach

How we built it

Framework responsibility map:

  • Jest: unit tests for pure functions and utilities
  • Vitest: fast unit tests for Next.js components
  • Playwright: E2E browser tests (user flows, auth)
  • Testing Library: component integration
  • Supertest: API endpoint contract
  • Pact: consumer/provider contract tests
  • Cypress: supplemental E2E for visual-heavy flows
  • k6: performance/load (response time under traffic)
  • Lighthouse CI: performance budgets (CWV, accessibility, SEO)
  • OWASP ZAP: DAST security scan (injection, XSS, misconfiguration)
  • Axe: WCAG 2.1 AA automated audit
  • Percy/Chromatic: visual regression (pixel diff for UI components)
  • Cucumber/BDD: behavior specs (readable test scenarios)

The architecture principle: each framework owns a layer. Tests don't duplicate each other's coverage. If a bug can be caught by a unit test, it never reaches the E2E layer. This makes the suite fast, focused, and maintainable.
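
One way to keep "each framework owns a layer" honest is to declare ownership as data and fail when two frameworks claim the same layer. The sketch below is illustrative only: the `Claim` shape, the layer labels, and `findOverlaps` are assumptions, not code from the repo.

```typescript
// Hypothetical ownership registry. Layer labels are illustrative.
interface Claim {
  framework: string;
  layer: string;
}

// Returns every layer that more than one framework claims.
function findOverlaps(claims: Claim[]): Record<string, string[]> {
  const byLayer: Record<string, string[]> = {};
  for (const claim of claims) {
    if (!byLayer[claim.layer]) byLayer[claim.layer] = [];
    byLayer[claim.layer].push(claim.framework);
  }
  const overlaps: Record<string, string[]> = {};
  for (const layer of Object.keys(byLayer)) {
    if (byLayer[layer].length > 1) overlaps[layer] = byLayer[layer];
  }
  return overlaps;
}

// A few entries from the responsibility map, one owner per layer.
const sampleClaims: Claim[] = [
  { framework: "Jest", layer: "unit:pure-functions" },
  { framework: "Vitest", layer: "unit:components" },
  { framework: "Pact", layer: "contract:consumer-provider" },
  { framework: "k6", layer: "performance:load" },
];

const sampleOverlaps = findOverlaps(sampleClaims);
console.log(Object.keys(sampleOverlaps).length === 0 ? "no overlap" : sampleOverlaps);
```

A check like this could run as a lint step, so adding a fourteenth framework forces a decision about which layer it owns.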

Architecture

System map

How the pieces talk to each other.

Diagram: Quality Telemetry CI Matrix. A pull request (merge candidate) triggers a GitHub Actions CI matrix of 13 frameworks, which feeds a status gate that controls merge (13/13 must pass, green only).

Frameworks in the matrix: Unit · Vitest, E2E · Playwright, Mobile · Detox, Security · OWASP ZAP, BDD · Cucumber, Perf · k6, Contract · Pact, Visual · Chromatic, Lighthouse, A11y · axe, Mutation · Stryker, Type · tsc, Lint · ESLint.

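
The status gate in the diagram reduces to a single rule: every required job must report green. A minimal sketch of that rule follows, assuming a hypothetical `JobStatus` type and job names. It is not the GitHub branch-protection API, only the decision it encodes.

```typescript
// Hypothetical job statuses. Anything other than "success" blocks the merge.
type JobStatus = "success" | "failure" | "skipped" | "pending";

interface GateVerdict {
  canMerge: boolean;
  blocking: string[]; // required jobs that are missing or not green
}

function evaluateGate(
  required: string[],
  results: Record<string, JobStatus>
): GateVerdict {
  const blocking = required.filter((job) => results[job] !== "success");
  return { canMerge: blocking.length === 0, blocking };
}

// Example: one red job is enough to hold the merge.
const verdict = evaluateGate(["unit", "e2e", "contract"], {
  unit: "success",
  e2e: "failure",
  contract: "success",
});
console.log(verdict.canMerge, verdict.blocking);
```

A required job with no reported result counts as blocking, so a job that never started cannot slip through as an implicit pass.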
Built UI

Selected screens

Real product surfaces from the engagement — not stock illustrations.

SLO board — p95 latency 124ms, error rate at 0.04%, weekly burn-rate alerts wired.

Evidence

What it actually looks like

Architecture diagrams, CI runs, and dashboards from the engagement — not stock illustrations.

Report · Playwright
End-to-end coverage across critical journeys. The report is the deliverable — stakeholders see exactly what was tested, what passed, and what got skipped.
Dashboard · Allure
Aggregated test results across thirteen frameworks in one place. Trends over time, flake detection, and a single pane of glass for ship-or-don’t.
Dashboard · Lighthouse CI
Performance + accessibility budgets enforced per PR. If a change drops the score below threshold, the build fails. The website never silently gets slower.
Build

What shipped

13 configured, actively maintained framework integrations. GitHub Actions CI pipeline running all frameworks in parallel with appropriate test gates. Lighthouse CI budget configuration (LCP < 2.5s, CLS < 0.1, TBT < 200ms).
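
The budget numbers above translate directly into a pass/fail check. The sketch below shows only the threshold logic, under the assumption that metrics arrive as plain numbers. The `Metrics` shape and `budgetViolations` are hypothetical names, and this is not Lighthouse CI's own assertion format.

```typescript
// Hypothetical metric shape. Units: milliseconds for LCP and TBT, unitless for CLS.
interface Metrics {
  lcpMs: number;
  cls: number;
  tbtMs: number;
}

// Budgets from the text: LCP < 2.5s, CLS < 0.1, TBT < 200ms.
const BUDGET: Metrics = { lcpMs: 2500, cls: 0.1, tbtMs: 200 };

// Returns one message per metric that is not strictly under budget.
function budgetViolations(measured: Metrics, budget: Metrics = BUDGET): string[] {
  const violations: string[] = [];
  if (measured.lcpMs >= budget.lcpMs) {
    violations.push(`LCP ${measured.lcpMs}ms is not under ${budget.lcpMs}ms`);
  }
  if (measured.cls >= budget.cls) {
    violations.push(`CLS ${measured.cls} is not under ${budget.cls}`);
  }
  if (measured.tbtMs >= budget.tbtMs) {
    violations.push(`TBT ${measured.tbtMs}ms is not under ${budget.tbtMs}ms`);
  }
  return violations;
}

// Example run: only TBT is over budget here.
const report = budgetViolations({ lcpMs: 2100, cls: 0.05, tbtMs: 340 });
console.log(report.length === 0 ? "within budget" : report.join("; "));
```

The comparisons are strict, matching the "<" in the budget: a page measuring exactly 2500ms LCP fails.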

Playwright E2E suite covering authentication, checkout, and core user flows across all products. OWASP ZAP automated security scan on every production deployment. Pact contract tests for all cross-service API boundaries. Percy visual regression baseline for all critical UI components.

Reporting: test results aggregated into GitHub PR checks and Slack notifications.

Outcome

Results

Across 12 months of active use, post-deploy monitoring surfaced no production regression that CI had not already caught. Lighthouse CI budgets held: all Sage Ideas products score 90+ on Performance and Accessibility.

Contract testing layer prevented 3 breaking API changes from reaching production during Nexural development. Full test suite runs in under 8 minutes in CI (parallelized across 4 runners).
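
The text does not say how work is split across the 4 runners. One common approach, shown here purely as an assumption, is greedy balancing by historical duration: sort test files longest first and hand each one to the least-loaded runner. `TestFile`, `shard`, and the timings are hypothetical.

```typescript
// Hypothetical input: test files with a historical duration in seconds.
interface TestFile {
  name: string;
  seconds: number;
}

// Greedy longest-first assignment to the currently least-loaded runner.
function shard(files: TestFile[], runners: number): TestFile[][] {
  if (runners < 1) throw new Error("need at least one runner");
  const shards: TestFile[][] = [];
  const loads: number[] = [];
  for (let i = 0; i < runners; i++) {
    shards.push([]);
    loads.push(0);
  }
  const sorted = files.slice().sort((a, b) => b.seconds - a.seconds);
  for (const file of sorted) {
    let target = 0;
    for (let i = 1; i < runners; i++) {
      if (loads[i] < loads[target]) target = i;
    }
    shards[target].push(file);
    loads[target] += file.seconds;
  }
  return shards;
}

const plan = shard(
  [
    { name: "checkout.spec", seconds: 8 },
    { name: "auth.spec", seconds: 7 },
    { name: "search.spec", seconds: 6 },
    { name: "profile.spec", seconds: 5 },
    { name: "cart.spec", seconds: 4 },
    { name: "nav.spec", seconds: 3 },
    { name: "footer.spec", seconds: 2 },
    { name: "about.spec", seconds: 1 },
  ],
  4
);
console.log(plan.map((s) => s.map((f) => f.name)));
```

Wall time is set by the slowest shard, so balancing by duration matters more than balancing by file count.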

Testing infrastructure is a product decision, not a technical nicety. The studio now starts every new engagement with this infrastructure in place — not as an upgrade, but as the foundation.

Artifacts

Available

  • GitHub: Testing framework configuration templates
  • CI pipeline configuration
  • Lighthouse CI budget documentation
  • Test coverage policy documentation
References

Talk to people about this work.

No fabricated quotes. Reference contacts are shared during discovery, with both parties' consent.

Reference available

Engineering lead

Fintech · 5 years

Worked alongside on production trading systems for 5+ years. Available for technical reference calls — code quality, on-call discipline, incident behavior.

Reference call shared during discovery, both consenting.
Reference available

Founder

Studio engagement

Engaged Sage Ideas for a Ship + Operate combination. Willing to talk about scope discipline, timeline accuracy, and what handoff actually looked like.

Reference call shared during discovery, both consenting.
If the dashboard can't tell you whether last night's deploy was safe, it's wallpaper.
// build log · entry 04
Honesty

What almost happened.

Every project has near-misses. Decisions that, if we'd kept going, would have shipped a hole. The list below is the diff between the version that almost made it to prod and the version that did.

// near-miss · 01
- before: CI was going to run all 13 frameworks on every PR. Pipeline wall time approached 40 minutes, and devs started force-merging around it.
+ after: Selective execution by changed path: a docs change runs Lighthouse + lint, a backend change runs unit + contract + Pact, a release branch runs all 13.
$ cost: A weekend writing the path-router. p50 PR time dropped from 38min to 7min.
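
The routing rule in this near-miss can be sketched as a small pure function. Everything concrete below is an assumption for illustration: the path prefixes, the `release/` branch convention, the suite identifiers (borrowed from the CI matrix diagram, where the contract suite is Pact), and the choice to fall back to the full suite when a path matches no rule.

```typescript
// Suite identifiers borrowed from the CI matrix diagram (13 entries).
const ALL_SUITES: string[] = [
  "unit", "e2e", "mobile", "security", "bdd", "perf", "contract",
  "visual", "lighthouse", "a11y", "mutation", "type", "lint",
];

interface Rule {
  prefix: string; // hypothetical repo path prefix
  suites: string[];
}

// Hypothetical rules mirroring the near-miss text.
const RULES: Rule[] = [
  { prefix: "docs/", suites: ["lighthouse", "lint"] },
  { prefix: "apps/api/", suites: ["unit", "contract"] },
];

function hasPrefix(value: string, prefix: string): boolean {
  return value.slice(0, prefix.length) === prefix;
}

function selectSuites(changedPaths: string[], branch: string): string[] {
  // Assumption: release branches are named release/* and always run everything.
  if (hasPrefix(branch, "release/")) return ALL_SUITES.slice();
  const selected: string[] = [];
  for (const path of changedPaths) {
    const matching = RULES.filter((rule) => hasPrefix(path, rule.prefix));
    // Assumption: an unrecognised path is treated as risky, so run all suites.
    if (matching.length === 0) return ALL_SUITES.slice();
    for (const rule of matching) {
      for (const suite of rule.suites) {
        if (selected.indexOf(suite) === -1) selected.push(suite);
      }
    }
  }
  return selected;
}

console.log(selectSuites(["docs/setup.md"], "feature/readme"));
console.log(selectSuites(["apps/api/routes/orders.ts"], "feature/orders"));
```

Falling back to the full suite on an unmatched path trades some speed for safety: a new directory cannot dodge CI because nobody wrote a rule for it yet.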
// near-miss · 02
- before: Visual regression was set to fail any pixel diff over 0.1%. Result: a font-rendering tweak in Chromium broke 200 snapshots overnight.
+ after: Diffing is structural: a Pixelmatch threshold tuned per surface, plus a ratchet that lets diffs land if a human reviewer ack'd them in the PR.
$ cost: Two days of tuning. Zero false positives in 90 days.
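
The per-surface threshold plus reviewer ratchet comes down to a three-way verdict. The sketch below covers only that decision. It does not call Pixelmatch, and the surface names, threshold values, and `visualVerdict` are hypothetical; only the 0.1% default comes from the original global setting.

```typescript
// Hypothetical per-surface thresholds, as a fraction of changed pixels.
const THRESHOLDS: Record<string, number> = {
  "marketing-hero": 0.02,
  "checkout-form": 0.001,
};
const DEFAULT_THRESHOLD = 0.001; // 0.1%, the original one-size-fits-all limit

interface SurfaceDiff {
  surface: string;
  changedPixels: number;
  totalPixels: number;
  ackedByReviewer: boolean; // assumption: set from an approval recorded in the PR
}

type Verdict = "pass" | "pass-acked" | "fail";

function visualVerdict(diff: SurfaceDiff): Verdict {
  const ratio = diff.totalPixels === 0 ? 0 : diff.changedPixels / diff.totalPixels;
  const configured = THRESHOLDS[diff.surface];
  const limit = configured === undefined ? DEFAULT_THRESHOLD : configured;
  if (ratio <= limit) return "pass";
  // Ratchet: an over-threshold diff may land only with an explicit human ack.
  return diff.ackedByReviewer ? "pass-acked" : "fail";
}

console.log(
  visualVerdict({
    surface: "checkout-form",
    changedPixels: 500,
    totalPixels: 10000,
    ackedByReviewer: false,
  })
);
```

Keeping "pass-acked" distinct from "pass" leaves an audit trail of which diffs a human waved through.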
From the repo

Inline excerpts.

Trimmed, but real. These are the patterns that made the system survive Stripe retries, multi-tenant queries, and a Discord bot that won't hallucinate positions.

Path-routed CI matrix
yaml
# .github/workflows/ci.yml — path filtering
on:
  pull_request:
    paths:
      - 'apps/**'
      - 'packages/**'
      - '.github/workflows/**'

jobs:
  detect:
    runs-on: ubuntu-latest
    outputs:
      ui: ${{ steps.f.outputs.ui }}
      api: ${{ steps.f.outputs.api }}
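    steps:
      # On pull_request events, paths-filter lists changed files through the
      # GitHub API, so this job needs no checkout step.
      - uses: dorny/paths-filter@v3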
        id: f
        with:
          filters: |
            ui: ['apps/web/**']
            api: ['apps/api/**']

  e2e:
    needs: detect
    if: needs.detect.outputs.ui == 'true'
    # Reusable workflow: the called file must declare `on: workflow_call`.
    uses: ./.github/workflows/playwright.yml

  contract:
    needs: detect
    if: needs.detect.outputs.api == 'true'
    uses: ./.github/workflows/pact.yml
// Only the suites that can possibly be affected by the diff actually run.
live · build 29be8ec · 2026-06-11 06:38Z
// solo studio // no analytics resold // every commit human-reviewed