Digging Deep Into Google Search Console SEO Data Quirks

URL Inspection’s Crawl Date Is Not What You Think

Okay, first thing: when you run a URL through the Inspection tool and it triumphantly proclaims “Crawled: two days ago,” that’s not necessarily when Google Search last saw your page as users see it. That’s when Googlebot last fetched the HTML. If your page relies even slightly on JS (say, lazy-loaded data), that timestamp reflects the initial fetch, not the rendered, user-visible version.

There was one time I kept trying to debug missing FAQ rich results for a client’s page that had pristine schema… for the rendered content. The Inspection tool said all was fine. But viewing the rendered HTML tab via “View Crawled Page” showed missing DOM segments. Turns out, Google never actually triggered the component hydration. Why? Because the JS fetch call depended on window.location.hash, and — here’s the kicker — the crawler stripped it.
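Here’s roughly what that failure mode looks like, as a sketch rather than the client’s actual code (loadFaq, the #faq container, and the /api/faq endpoint are all made up): the hydration keys off window.location.hash, which the crawler drops, so the schema-bearing markup never lands in the rendered DOM. Keying off something that survives the crawl, like a data attribute or the pathname, avoids it.

```ts
// Hypothetical FAQ hydration. The loadFaq name, #faq container, and /api/faq endpoint
// are illustrative, not from the post.
async function loadFaq(): Promise<void> {
  // Broken for Googlebot: the crawler strips the URL fragment, so hash is "" and the
  // schema-bearing FAQ markup never reaches the rendered DOM.
  const topicFromHash = window.location.hash.replace(/^#/, "");

  // Safer: fall back to something the crawler preserves, e.g. a data attribute in the
  // initial HTML or the last path segment.
  const topic =
    topicFromHash ||
    document.querySelector<HTMLElement>("#faq")?.dataset.topic ||
    window.location.pathname.split("/").filter(Boolean).pop() ||
    "";
  if (!topic) return; // nothing to hydrate

  const res = await fetch(`/api/faq?topic=${encodeURIComponent(topic)}`);
  const items: { question: string; answer: string }[] = await res.json();

  const container = document.querySelector("#faq");
  if (!container) return;
  container.innerHTML = items
    .map((i) => `<details><summary>${i.question}</summary><p>${i.answer}</p></details>`)
    .join("");
}

loadFaq().catch(console.error);
```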

So yeah. When debugging structured data or canonical issues, remember: the crawl date is about the initial fetch. It says nothing about when or if the JS processing happened.

Meta Description Changes Don’t Propagate Cleanly

Here’s the scenario. You tweak a title and meta description on a page because Search Console tells you it’s underperforming in queries it should be killing. You re-inspect it. You request indexing. You beg. Then… nothing. The old snippet lingers in the SERP like a moldy Post-It. What gives?

The truth is, despite your newly minted meta tags, Google will often keep using the old snippet if it continues to detect high engagement. Worse, I’ve seen cases where the snippet GSC reports for a query isn’t any of the three variations I’ve tested. They build snippets live — often per query — and cache them selectively. GSC’s data is just one lens, and you’re not looking through the same glass they are.

If you really want to force a snippet swap, sometimes juggling the rel="canonical" tag helps. Used judiciously. One time, changing the canonical URL from absolute to relative was all it took to trigger a fresh scrape. It lasted approximately one payout cycle before the old snippet snapped back like a cursed redirect.
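While we’re on canonicals: if you want to see what yours actually resolves to before you start juggling, a quick check like this works (illustrative, not a GSC feature):

```ts
// Quick canonical sanity check; the body is plain-JS-compatible, so it runs in DevTools Console too.
const el = document.querySelector('link[rel="canonical"]');
const raw = el?.getAttribute("href") ?? null;            // what's literally in the HTML
if (raw === null) {
  console.log("No rel=canonical found");
} else {
  const resolved = new URL(raw, document.baseURI).href;  // what it actually resolves to
  const isAbsolute = /^https?:\/\//i.test(raw);
  console.log({ raw, resolved, isAbsolute });
}
```

Worth remembering that a relative href resolves against whichever URL it happens to be served on, which is a big part of why most people keep canonicals absolute.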

Odd Timing: When Performance Data Lags Behind True Impressions

So everyone leans on the Performance report for tracking what’s working, what’s spiking, and which queries are crawling up from the swamp. But catch this: there’s a notable disconnect between when impressions show up in the chart and when the corresponding clicks appear, often a day or two later.

Which leads to a dance of uncertainty when experimenting with schema or meta alterations. Once, I saw what looked like a CTR drop after deploying JSON-LD event markup. For a few chaotic, sweaty-palmed days, the click line dipped while impressions rose. I panicked. I reverted. Later I found out the rollout had re-classified some URLs as “indeterminate” until recrawled, tanking their appearance per device type (mobile saw the event result; desktop didn’t). But the click datastream lagged by over 72 hours; the dip was temporary, and I probably yanked the markup prematurely.

Pages Grouped by Similar Issues Can Show Opposite Outcomes

Within the Index Coverage and Enhancements reports, some cluster labels are outright misleading: the report may group 100 pages under, say, “Duplicate without user-selected canonical.” Sure. Technically correct. But you dig in and half of them are ranking fine. Some are even in Top 3 positions.

“Not serving as canonical” does not mean “not indexed” or “penalized.”

This hit me hard with a set of localized gym directories — same template, different zip codes. Most were flagged as duplicate-ish because of location overlap in the H1s. But half still showed up just fine. Google had promoted the one with better incoming internal links (from blog content) while quietly indexing the rest.

The GSC report made it look like an indexing disaster. It wasn’t. It was just how Google organizes its buckets, and the presentation makes it all look more dramatic than it really is.

The Madness of Regex Filtering in Performance Reports

Regex support in the Query and Page filters is both a gift and a trap. First of all, it’s case-insensitive by default — unless it’s not. It’s not consistent per property. Like, I’ve had verified Search Console properties where ^contact us$ filtered exactly what I expected, and others where it matched 12 weird variants with upper-case U’s (why?).

But the real kicker? Regex filters won’t match if the stored query uses a different tokenization than the one you wrote the pattern for. Yeah. Google normalizes queries before the regex ever sees them, so the string in the report isn’t always the string the user typed. Try matching “red-sofa” and “red sofa” with the same pattern and you’ll get different counts.

Practical Regex Filter Tips (from pain)

  • Use .* generously if you’re working with longer-tail queries; specificity hides data
  • Avoid anchoring unless necessary — ^ and $ limit partial matches unexpectedly
  • Watch out for invisible whitespace in copied query strings; it carries over!
  • Use Chrome’s DevTools Console to test regex (see the sketch after this list); GSC doesn’t show syntax errors
  • You can’t use lookaheads or lookbehinds; GSC only supports RE2 syntax
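Since GSC gives you so little feedback, I pre-flight patterns outside it. A hedged sketch of that habit (the sample queries and the \b pattern are placeholders), with the big caveat that the Console runs JavaScript’s regex engine rather than RE2:

```ts
// Rough pre-flight check for a GSC regex filter, run in DevTools Console or ts-node.
// Caveat: this is JavaScript's regex engine, not RE2. RE2 (what GSC uses) rejects
// lookaheads/lookbehinds, so a pattern that passes here can still fail over there.
const pattern = "\\b(keywords?)\\b";     // the filter you plan to paste into GSC
const sampleQueries = [                  // a few query strings copied from a Performance export
  "best keyword research tools",
  "keywords for plumbing ads",
  "red-sofa cleaning tips",
  "red sofa cleaning tips",
];

try {
  // The "i" flag is just for this local test; in RE2 syntax, (?i) inside the pattern does the same job.
  const re = new RegExp(pattern, "i");
  for (const q of sampleQueries) {
    // JSON.stringify exposes any invisible whitespace that hitched a ride when copying.
    console.log(re.test(q) ? "MATCH" : "miss ", JSON.stringify(q));
  }
} catch (e) {
  console.error("Syntax error (GSC would likely choke on this too):", e);
}
```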

One of my most embarrassing moments? Trying to measure “keyword cannibalization” across plural/singular using \b(keywords?)\b. I forgot GSC strips whitespace tokens in staging. Ended up with empty results and thought the pages had vanished from search… for four days.

Sitemaps Not Updating? GSC Might Be Serving Old Crawls

Ever updated your sitemap, re-submitted it, and… nothing changes? That’s not your server’s fault. GSC uses a delayed snapshot of the response to balance crawl bursts. It quietly caches sitemap fetches and reruns them behind the scenes within a “fetch window.” You can see this if you log your sitemap hits in real time: the actual fetch times won’t line up with when you hit Refresh in GSC or re-submit. Not even close.

I once had a multilingual site where only the English sitemap had updated entries. The Spanish and French ones were still showing 404s. Turns out, Google had throttled those alternate-language sitemap endpoints as “redundant” after reading identical lastmod entries. Whole sections weren’t being re-fetched because the server’s Last-Modified headers never changed.

Fix? Re-generate the sitemaps with freshly bumped lastmod timestamps. Apparently, Google uses delta diffing even on sitemap XML lines now (undocumented, I think). Seeing new timestamps finally re-pinged those language variants.
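If you’d rather force the change than wait, something like this does the bump. It’s a sketch under assumptions (the ./public/sitemap.xml location, that your generator writes <lastmod>, and that a blanket rewrite is acceptable):

```ts
// Bump every <lastmod> in a sitemap so the next fetch sees genuinely different bytes.
// Sketch only: the ./public/sitemap.xml path is an assumption; point it at your build output.
import { readFileSync, writeFileSync } from "node:fs";

const path = "./public/sitemap.xml";
const now = new Date().toISOString();   // W3C datetime, valid for <lastmod>

const xml = readFileSync(path, "utf8");
const bumped = xml.replace(/<lastmod>[^<]*<\/lastmod>/g, `<lastmod>${now}</lastmod>`);

writeFileSync(path, bumped);
console.log(`Rewrote ${(bumped.match(/<lastmod>/g) ?? []).length} lastmod entries to ${now}`);
```

Blanket-bumping every entry is blunt; limiting it to URLs that actually changed is kinder to your crawl budget, but the mechanics are the same.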

Scarily Easy to Misinterpret Clicks vs. Position vs. CTR

Clicks are addictive to watch. So is average position. Problem is, they rarely correlate usefully without context. Page might have 1000 impressions and 5 clicks at position 2. Another might get 10k impressions and 30 clicks at position 12 — which one’s doing better? Depends.

The real problem is when multiple URLs rank for semi-synonymous queries and eat into each other’s CTR. You’ll see some URL tanking in clicks, obsess over its slug or meta title, and miss the fact that it lost the Featured Snippet to its own sibling page. Search Console rolls that de-duping into “alternate search appearance” clusters, and unless you’ve segmented by exact URL and query string using regex and filters, it’s lost in the slushpile.

The moment I realized this was from a JSON export where I filtered on a cluster of “how to clean AC units” queries. Two pages were splitting the lion’s share of click potential, but only one had schema tied to the “HowTo” type. And yes, that one got the snippet, yet both pages’ numbers were blended into the group’s average position and CTR. Completely obscured by default.
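To make that concrete, here’s a sketch of the export-side pass that surfaces it. The gsc-export.json file name and the row shape are my assumptions about how you’ve flattened the Performance data (query plus page dimensions); the point is grouping by query and flagging queries served by more than one URL:

```ts
// Flag queries where multiple URLs split impressions, i.e. cannibalization candidates.
// Sketch: the gsc-export.json file and its row shape are assumptions about your export.
import { readFileSync } from "node:fs";

type Row = { query: string; page: string; clicks: number; impressions: number; position: number };

const rows: Row[] = JSON.parse(readFileSync("./gsc-export.json", "utf8"));

const byQuery = new Map<string, Row[]>();
for (const r of rows) {
  const group = byQuery.get(r.query) ?? [];
  group.push(r);
  byQuery.set(r.query, group);
}

for (const [query, group] of byQuery) {
  if (group.length < 2) continue;   // only queries where two or more URLs show up
  const totalImpressions = group.reduce((sum, r) => sum + r.impressions, 0);
  console.log(`\n"${query}": ${group.length} URLs, ${totalImpressions} impressions`);
  for (const r of [...group].sort((a, b) => b.impressions - a.impressions)) {
    const ctr = r.impressions ? ((100 * r.clicks) / r.impressions).toFixed(1) : "0.0";
    console.log(`  ${r.page}  pos ${r.position.toFixed(1)}  ctr ${ctr}%`);
  }
}
```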

There’s No Bug Report or Alert for Soft 404 Relabeling

This one drove me bonkers. A relatively clean blog started hemorrhaging indexed pages about a week after we stripped pagination UTM trackers. Suddenly, pages that had been reported as indexed just fine were now marked Soft 404, with no visible error.

Turns out, Search Console flagged the partial content loads as empty content: the server hit a CPU spike during Googlebot’s visit and the pages only partially loaded. But no retry crawl was triggered, because from the server’s POV it had still returned a 200 OK.

The clue: in the Coverage detail panel, those URLs showed zero content bytes received. Not zero indexed, zero received. But only for the Googlebot Smartphone crawler, not Desktop. And it never triggered a re-crawl. I found it by matching crawl timestamps to the Apache logs. GSC provided no warning, no Retries button, nada; it just quietly downgraded the entries. I had to force a re-fetch manually to clear them.
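The matching itself is just log grepping. Here’s the kind of pass I mean, as a sketch (the log path, the 2 KB threshold, and the assumption that you’re on the standard combined log format are all mine, not from the original incident):

```ts
// Scan an access log for Googlebot Smartphone hits that returned 200 with a suspiciously
// small body, which is the pattern behind the silent soft-404 downgrades described above.
// Sketch: assumes the common "combined" log format; the path and byte threshold are illustrative.
import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

const LOG_PATH = "/var/log/nginx/access.log";   // or /var/log/apache2/access.log
const MIN_BYTES = 2048;                         // anything smaller than this for an HTML page is suspicious

// combined format: ip - - [time] "METHOD path HTTP/x" status bytes "referer" "user-agent"
const lineRe = /"(?:GET|HEAD) (\S+) [^"]*" (\d{3}) (\d+|-) "[^"]*" "([^"]*)"/;

const rl = createInterface({ input: createReadStream(LOG_PATH) });
rl.on("line", (line) => {
  const m = line.match(lineRe);
  if (!m) return;
  const [, path, status, bytes, ua] = m;
  const isSmartphoneBot = ua.includes("Googlebot") && ua.includes("Mobile");
  const sent = bytes === "-" ? 0 : Number(bytes);
  if (isSmartphoneBot && status === "200" && sent < MIN_BYTES) {
    console.log(`${path}  ${sent} bytes  ${ua}`);
  }
});
```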

This is now something I spot-check in nginx logs every time Lighthouse scores start wandering too.
