How I Use Performance Budgets Without Breaking My Stack

Why Your Lighthouse Score Doesn’t Mean Your Site Is Fast

Let’s just get this out of the way: a 90+ score doesn’t mean users aren’t rage-quitting your site on flaky mobile. Google Lighthouse is useful, but it also lies to you, a little. It’s running in pristine lab conditions. Empty cache. No real user jank. Like grading a restaurant on its menu design instead of how fast the food comes out during the lunch rush.

I had a client proudly brag about their perfect Lighthouse 100. This was on a Gatsby static build with a massive Lottie background animation that paused rendering in Safari on iOS. It took me two hours to convince them something was wrong. We ended up applying 4x CPU throttling to simulate the devices shipping to Brazil, which revealed layout shifts and idle time we never would’ve spotted otherwise.

The truth is, performance budgets have to match user context. Otherwise they’re cosmetic. Don’t just drop a 200kb JS limit in your CI and call it a day — set budgets that reflect your worst-case user scenario. 3G Android on a bus at 8pm kind of scenario.

Using WebPageTest for Granular Performance Budgets

If you’ve only ever poked around WebPageTest once and went “eh, too many waterfalls”, give it another look. It’s the tool Lighthouse wishes it was. When I care about actual loading behavior, not a simulation of it, this is what I go with.

You can test on real devices. You can throttle the network. You can even record video of the first paint and see precisely when the main content becomes visible. No synthetic fudge. My favorite part? You can enforce your own custom performance budgets right in the JSON test settings:

{
  "custom": {
    "loadTime": 3000,
    "bytesIn": 500000,
    "speedIndex": 2000
  }
}

Massively helpful if you want hard thresholds to alert you when regressions creep in, particularly in CI setups. One dev I worked with used this to keep marketing from uploading 5MB hero images again. Brilliant.
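
If you drive WebPageTest from CI, the Node wrapper (the webpagetest npm package) is one way to make those thresholds actually fail a job: its test command takes a --specs file and exits non-zero when a spec fails. A rough sketch, assuming that wrapper; the key names here are from memory, so double-check them against its docs:

webpagetest test https://example.com --key $WPT_API_KEY --poll --specs budget.json

with budget.json along the lines of:

{
  "median": {
    "firstView": {
      "loadTime": 3000,
      "bytesIn": 500000,
      "SpeedIndex": 2000
    }
  }
}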

CI Budget Failures That Don’t Fail the Build (Why?)

Fun fact: if you set up budgets using lighthouse-ci and run that in your GitHub Actions pipeline, it’ll happily report warnings in the output… while still passing the build. Because warnings aren’t failures. Cool cool cool.

This is baffling the first time it happens. You think your budget is strict, but it’s an honor system. Especially if you did something like:

--budgetPath=./budget.json --assertions=performance=error

But the thing your budget actually violated was byte weight, and byte weight wasn’t in the assertion config, so it was totally ignored. The build sailed through. It took me a day to realize the assertions have to be explicitly mapped to your budget entries, not just to overall performance. Yeah.

Solution? Be specific. Add individual assertions for things like resource-summary:script:size and unminified-javascript (note the colons, and no byte-efficiency prefix; those are the keys lighthouse-ci actually matches on). Otherwise your budgets are just decorative.
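
Here’s a minimal .lighthouserc.json sketch of what “specific” looks like, assuming lighthouse-ci’s assert config (the 150 KB script threshold is illustrative):

{
  "ci": {
    "assert": {
      "budgetsFile": "./budget.json",
      "assertions": {
        "resource-summary:script:size": ["error", { "maxNumericValue": 153600 }],
        "unminified-javascript": ["error", { "maxLength": 0 }]
      }
    }
  }
}

With assertions spelled out like that, a blown script budget is an error that fails the job, not a warning scrolling past in the log.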

Real User Monitoring over Synthetic Budgets

Your budget is always wrong until you see what a real user sees. RUM tells you if your performance hunch is backwards. When I finally piped in Core Web Vitals from actual users using Cloudflare Browser Insights, I found out the global average LCP was faster than we thought, but the 90th percentile was brutal. Which meant: our fancy font loading logic helped nobody in the tail. Worst-case users were still getting TTFB spikes. That shaped our budget priorities differently.

What helped:

  • Hooking the web-vitals JS library into our client-side app and pinging a lightweight endpoint (no analytics bloat); there’s a sketch after this list
  • Segmenting results by country, device, and first load vs. return visit
  • Prioritizing the P95 metrics instead of averages (you want the experience to suck less for the worst-off, not flatter your mean)
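
That first bullet is tiny in practice. A minimal reporter, assuming the web-vitals v3+ API (onLCP and friends) and a hypothetical first-party /vitals endpoint:

// report-vitals.js: ship each Core Web Vital to a tiny first-party endpoint
import { onCLS, onINP, onLCP, onTTFB } from 'web-vitals';

function report(metric) {
  const body = JSON.stringify({
    name: metric.name,   // e.g. "LCP"
    value: metric.value, // ms for timing metrics, unitless for CLS
    id: metric.id,       // unique per page load, handy for deduping
  });
  // sendBeacon survives page unloads; fall back to fetch with keepalive
  if (!(navigator.sendBeacon && navigator.sendBeacon('/vitals', body))) {
    fetch('/vitals', { method: 'POST', body, keepalive: true });
  }
}

onCLS(report);
onINP(report);
onLCP(report);
onTTFB(report);

Tag each payload with country, device class, and first-vs-return visit when you store it, and the segmentation and P95 cuts from the other two bullets fall out nearly for free.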

Synthetic tests are just a sketch. RUM is the real portrait.

Budgeting Fonts Without Going Insane

One of my sites dropped 200ms just by deleting a font weight we weren’t even using. Yeah. Not even rendered. I wanted to cry.

Fonts are sneaky. You think you’re loading Latin only, then one language toggle later, here comes Cyrillic… with 300 KB of glyphs. Font-display and subsetting strategy should be part of your performance budget, but a lot of budget tools don’t track them reliably. Lighthouse’s resource-summary will lump fonts together, but it won’t flag that you’re loading five styles when only two ever render.

If you’re self-hosting, aggressively subset using tools like pyftsubset (one-liner below). If you’re using Google Fonts, force display=swap and audit the stylesheet URL manually. I once found a double-encoded URL that loaded both regular and italic weights even though only one was in use, thanks to a rogue em class on the homepage.
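
The subsetting step really is a one-liner, assuming fonttools is installed (pyftsubset ships with it); the Latin-1 range and filenames here are illustrative:

pyftsubset MyFont.ttf --unicodes="U+0000-00FF" --flavor=woff2 --output-file=myfont-latin.woff2

Pair the subsets with unicode-range in your @font-face rules so the Cyrillic file only downloads when a page actually uses it.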

CDNs Can Skew Metrics Drastically

Serving from a CDN like CloudFront or Cloudflare will make your Lighthouse TTFB look great. But a budget based on that might be a mirage if your CDN regions don’t cover all your users well.

I did a rollout once where China was excluded from the CDN, so that traffic routed through a fallback origin server. Our CDN metrics all looked great, but users in Vietnam were seeing five-second page loads. We only noticed after error reports from Guangzhou kept mentioning net::ERR_TIMED_OUT.

And yeah, turns out a misconfigured CORS rule triggered retry loops in some browsers. From Lighthouse’s POV, everything looked snappy. But users were hitting the secondary origin with routing delays longer than our entire budget. RUM revealed the hole. Synthetic couldn’t.

The Build Tool You Choose Controls Your Budget Fate

I was using Webpack + budget.json for a while, but switched a static project to Vite and got sideswiped. Vite builds fast, but the outputs are sometimes opaque. The default Vite build splits vendor chunks in ways you don’t choose. I had a budget keeping main.js under 150 KB, but Vite sliced it into three files, so the budget tool missed the total payload.

That’s your undocumented behavior of the day: some build tools emit multiple chunks that tools like Lighthouse treat as separate, even if your browser loads them essentially as one payload. Unless you aggregate those, your budgets are toothless.

It feels like cheating, and it kind of is. You can catch it by analyzing transfer size in DevTools, not just file size in build output. And if you want to be strict, write a custom assertion script that totals every JS chunk matching a pattern (main*.*.js) and fails the build when the sum exceeds your threshold, like the sketch below.
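
A post-build Node sketch of that script; the dist/assets path is Vite’s default output (an assumption), and it gzips each chunk to approximate transfer size instead of trusting on-disk file size:

// check-js-budget.mjs: fail CI when total gzipped JS exceeds the budget
import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';
import { gzipSync } from 'node:zlib';

const DIST = 'dist/assets';      // Vite's default chunk directory (assumption)
const BUDGET_BYTES = 150 * 1024; // 150 KB total across all JS chunks

const total = readdirSync(DIST)
  .filter((file) => file.endsWith('.js')) // or match your own main*.*.js pattern
  .reduce((sum, file) => sum + gzipSync(readFileSync(join(DIST, file))).length, 0);

console.log(`Total gzipped JS: ${(total / 1024).toFixed(1)} KB (budget: ${BUDGET_BYTES / 1024} KB)`);
if (total > BUDGET_BYTES) {
  console.error('JS budget exceeded; failing the build.');
  process.exit(1); // non-zero exit fails the CI job
}

Run it right after the build step in CI; summing the chunks is what makes the budget honest again.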

When Performance Budgets Break Accessibility

Ever tried to shave 10 KB off and taken out the focus-ring helper script? Yeah. Me. I did that. Keyboard users couldn’t tab through the nav links afterward; the site only really worked with a mouse. Because I didn’t realize we were using a JS polyfill as a :focus-visible fallback in some browsers.
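
For the record, this is roughly the pattern I had deleted, assuming the WICG focus-visible polyfill (the focus-visible npm package):

// focus-ring-helper.js: restores visible focus rings for keyboard users in
// browsers without native :focus-visible. The polyfill adds a .focus-visible
// class on keyboard focus and a .js-focus-visible class on <body>.
import 'focus-visible';

// Paired CSS (illustrative):
//   .js-focus-visible :focus:not(.focus-visible) { outline: none; }
// Mouse users lose the stray outline; keyboard users keep theirs.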

A budget win that became an accessibility fail real fast. Accessibility and performance aren’t always friends. Accordion widgets, ARIA toggles, even skip-link handlers — they aren’t free. Your budget can trick you into thinking “pure CSS” is better, but in reality, scripted interactivity is sometimes the accessible choice.

If your budget leads you to rip out JS that’s helping less visible users — you’ve budgeted incorrectly. Period.

JSON Logs That Tell the True Story

I found one of the best debugging tools hiding inside a CI log from lighthouse-ci: if you store the full report JSON and actually read it (not just wait for the number summary), the audits map holds gems like this (trimmed):

{
  "audits": {
    "unused-javascript": {
      "id": "unused-javascript",
      "title": "Reduce unused JavaScript",
      "details": {
        "overallSavingsMs": 1200,
        "overallSavingsBytes": 44231
      }
    }
  }
}

That right there told me an imported animation library, one we dumped in six months ago and never used, was costing us 44 KB and over a second of load time. Gone the next day. Budget intact. Don’t just read the score. Read the full Lighthouse JSON and mine those details.
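
Mining that by hand gets old, so here’s a tiny sketch that walks a report and prints every audit promising real savings; the 20 KB cutoff and report filename are assumptions:

// mine-lighthouse.mjs: surface opportunity audits with meaningful savings
import { readFileSync } from 'node:fs';

const report = JSON.parse(readFileSync('./lighthouse-report.json', 'utf8'));

for (const [id, audit] of Object.entries(report.audits)) {
  const details = audit.details;
  if (details && details.overallSavingsBytes > 20 * 1024) {
    const kb = Math.round(details.overallSavingsBytes / 1024);
    console.log(`${id}: ~${kb} KB, ~${details.overallSavingsMs ?? 0} ms of estimated savings`);
  }
}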
