Fixing Accidental Noindex Tags That Tanked Site Revenue

How the Noindex Tag Ended Up in Production

This started with a quick tweak to a page template. Ironically, it was meant to be a cleanup. I wanted to remove some obsolete meta tags from AMP pages. Somewhere between VS Code autosaving, a weird file watcher race condition in my build script, and my pre-coffee brain, I ended up pushing out a <meta name="robots" content="noindex,nofollow"> on every mobile page template. Live. Crawled. Indexed—then dropped. Within hours, mobile impressions in Search Console flatlined like a monitor in a soap opera hospital scene.

Check your base templates. And don’t trust a PR diff to show you the *rendered* output. This slipped past review because the meta tag sat inside a conditionally rendered component, and the condition never fired in staging: the tag showed up only when the page had ads. Go figure.
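A check on the rendered HTML, rather than on the template diff, is the kind of guard that could catch this. Below is a minimal sketch using only the Python standard library. The function name find_blocking_directives and the sample markup are made up for illustration, and it assumes you can get the fully rendered page (ads branch included) as a string from your build or a staging fetch.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # Split "noindex,nofollow" into individual lower-cased directives.
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.append(part.strip().lower())


def find_blocking_directives(html: str) -> list[str]:
    """Return the directives that would keep this page out of the index."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return [d for d in parser.directives if d in ("noindex", "none")]


# Hypothetical rendered output of a page template with the ads branch active.
rendered = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
print(find_blocking_directives(rendered))
```

Run against rendered output for each template variant, a non-empty result is a reason to fail the build. It only inspects meta tags; an X-Robots-Tag response header would need a separate check.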

Noindex Cache Lag in Google Search Console

You’d think removing the tag would reverse the damage within a day or two. Wrong. Google is slow to let go of a noindex state it has already recorded, especially on mobile AMP variants. I removed the meta tag, verified the live pages were serving correctly again, requested indexing manually, and… nothing for at least ten days. Even with the Indexing API for job posts (which I use on a non-news site just for freshness testing), pages stayed out of query results much longer than expected. Only by running the URL Inspection tool directly on each AMP variant did I confirm they were still excluded because of the earlier noindex, even after its removal.

Just a heads-up: Google’s own documentation is confusing about timelines. They say there’s no short-term penalty, but in practice, impressions won’t return until every cache updates and the stale state is evicted. For AMP, that means waiting for the AMP cache and the search record to update independently.

AMP Pages and Cached Meta Dysfunction

One of the weirder bugs I’ve run into here: AMP caches the entire rendered HTML HEAD, including the <meta name="robots"> tag. So even after your origin page is fixed, anyone loading the AMP Cache-hosted copy still sees the noindex tag—because it’s embedded in that cached version.

The only way I got it to flip was to:

  • Ensure the canonical version was indexable
  • Grab the AMP Cache link from Search Console’s AMP report
  • Ping it manually in the browser to force a fetch
  • Request indexing again on the canonical

I don’t have a solid explanation for why the manual AMP ping works, but around the third or fourth forced load from the Google cache URL, the meta tag updated. If there’s cache coalescing across AMP batches, you’re probably waiting for that to expire. There’s no timestamp exposed for AMP cache invalidation—one of those features Google added without planning for edge cases like a bad meta tag in the deploy pipeline.

Why Your Ads Completely Stopped Showing

AdSense didn’t warn me. Not a modal, not a dropdown badge, not even a revenue alert email. But once the AMP pages were marked noindex, AdSense placement logic deemed them “non-content” surfaces after about 48 hours. I only noticed when RPM dropped to almost zero and my AMP ad units started to return empty promises—literally <amp-ad> containers with nothing inside. Ad review center still showed active campaigns, and the ad unit stats were stuck in limbo with “0 impressions, 0 clicks.”

If you’re looking to verify this:

  • Use Chrome DevTools on a mobile emulator
  • Disable cache and preserve logs in the Network tab
  • Look for ads?client= requests returning 204s or never firing
  • Check whether AMP script is throwing a visibilityState warning

I found one creepy detail: on some pages, I saw AMP silently skipping ad init because Googlebot (or a mobile equivalent) had previously marked them noindex. Once that signal gets associated with an AMP doc on their backend, ad auctions seem to halt until it’s cleared. It’s not in their docs. I pieced this together from a mix of gray-debug URLs and network timing anomalies.

OpenGraph + Noindex = Preview Kill

Another fun discovery: when a page is marked noindex, many platforms treat that as a do-not-render-any-preview directive—especially on Messenger, Slack, and certain social media apps. Even when I shared a fixed URL, if Facebook had previously scraped it during the noindex era, it kept the stale preview, and refused to re-scrape.

“The OG graph is cached for performance,” Facebook’s debugger spat out.

I had to manually resubmit the URL in Facebook’s Sharing Debugger twice for it to register the change. Slack had similar behavior. And here’s a weird bit: Telegram fetched my updated description but kept showing an expired og:image hosted on Cloudflare Images that had since 403’d.

Restore Visibility with Sitemap Overdrive

One of the few things that actually helped recrawl speed was submitting a manually curated sitemap of the affected AMP URLs. I generated the URL list by parsing old Search Console exports with a few throwaway scripts, then dumped it into a standalone XML sitemap. I referenced it with a Sitemap: line in robots.txt and also submitted it directly in Search Console’s Sitemaps report.

To be honest, this felt like overkill, but within 2–3 days of that submission—quicker than manual URL indexing requests—some of the long-dead AMP URLs started getting impressions again. Based on the logs, Googlebot fetched them via the sitemap entry, not internal links. So sitemaps do still matter once you mess up indexing on a large scale.
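For anyone wanting to build a recovery sitemap like this, here is a rough sketch with the Python standard library. It is not the script I actually ran. The function name build_sitemap and the example URLs are placeholders, and it assumes you already have the affected URLs as a plain list.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50000  # per-file limit in the sitemap protocol


def build_sitemap(urls: list[str]) -> bytes:
    """Build a minimal XML sitemap (UTF-8 bytes) from absolute URLs."""
    unique = list(dict.fromkeys(urls))  # drop duplicates, keep order
    if len(unique) > MAX_URLS:
        raise ValueError(f"{len(unique)} URLs; split into files of {MAX_URLS} or fewer")
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in unique:
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url  # ElementTree escapes & and < when serializing
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Placeholder URLs standing in for the affected AMP pages.
affected = ["https://example.com/amp/post-1", "https://example.com/amp/post-2"]
print(build_sitemap(affected).decode("utf-8"))
```

Write the bytes to a file on your origin, then point robots.txt at it with a line such as Sitemap: https://example.com/sitemap-amp-recovery.xml (the filename is only an example).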

Cloudflare Cache Confusion on Fixed Pages

If you’re proxying via Cloudflare (I was), know that their edge cache might hold onto the stale meta tag even after your origin content is updated, especially if you’re using their “Cache Everything” page rule on AMP URLs to improve TTFB. The upstream fix might be live while an edge data center keeps serving the stale copy.

Here’s what I had to do:

  • Use curl -I https://example.com/amp-page with and without ?cachebuster
  • Look at the cf-cache-status response header: HIT vs MISS
  • Purge single URL via dashboard if it’s HIT
  • Disable any “Always Online” fallback temporarily

The edge purge lag was around 3 hours for me before I could confirm noindex was actually removed on served pages.
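If you have more than a handful of URLs, the curl step gets tedious. The sketch below does the same HEAD request in Python and reads the cf-cache-status header. The function names, the user agent string, and the cachebuster parameter are illustrative choices of mine, not anything Cloudflare defines.

```python
import urllib.request


def fetch_headers(url: str) -> dict[str, str]:
    """Send a HEAD request (like curl -I) and return lower-cased headers."""
    req = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "stale-cache-check/0.1"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return {name.lower(): value for name, value in resp.headers.items()}


def served_from_edge_cache(headers: dict[str, str]) -> bool:
    """True when Cloudflare reports the response came from its edge cache."""
    return headers.get("cf-cache-status", "").strip().upper() == "HIT"


def check(url: str) -> None:
    """Compare the plain URL with a cache-busted variant of it."""
    for candidate in (url, f"{url}?cachebuster=1"):
        headers = fetch_headers(candidate)
        verdict = "HIT, consider purging" if served_from_edge_cache(headers) else "not a HIT"
        print(f"{candidate}: {verdict}")
```

Calling check("https://example.com/amp-page") prints one line per variant. A HIT only tells you the copy came from the edge; to confirm the noindex tag is really gone you still have to fetch the body and look at the meta tag.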

Logs Reveal the Noindex Footprint Trail

An unexpected server log detail that saved me

My nginx logs had one breadcrumb: a spike in Googlebot Mobile requests on the same day impressions died. Digging deeper into user agent patterns, I noticed every hit included +AMP in the UA string for that subset. And every one returned 200—no errors, no preloading conflicts—but I saw zero ad script requests shortly after.

The moment I looked at those requests side-by-side with URL inspection reports, I realized: the noindex tag went out approximately 38 minutes before the crawl, which was enough. The index flag had been flipped almost instantly. I had to literally roll back my deploy SHA and re-trigger Cloud Build with a cache-bypass header to make sure the stale meta didn’t sneak back in.

Live lesson: ad scripts won’t load if Googlebot classifies a page as low-priority or explicitly excluded, even if the browser render looks fine. It’s all trust signals—once disrupted, things cascade.
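To pull that kind of breadcrumb out of your own logs, something like the following can help. It is a sketch that assumes nginx’s default combined log format and matches Googlebot by user agent substring only. The function name is made up, and since user agents can be spoofed, treat the counts as a lead rather than proof.

```python
import re
from collections import Counter

# nginx "combined" format:
# addr - user [day:time zone] "request" status bytes "referer" "user agent"
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def googlebot_hits_per_day(lines) -> Counter:
    """Count Googlebot requests keyed by (day, HTTP status)."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and "googlebot" in match.group("ua").lower():
            hits[(match.group("day"), int(match.group("status")))] += 1
    return hits


def report(log_path: str) -> None:
    """Print one line per (day, status) pair, in the order first seen."""
    with open(log_path, encoding="utf-8", errors="replace") as fh:
        for (day, status), count in googlebot_hits_per_day(fh).items():
            print(f"{day} status={status} hits={count}")
```

Pointing report() at an access log (the path depends on your setup) gives you a per-day view you can line up against the date impressions dropped and your deploy timestamps.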

Edge Case: Canonical and AMP Versions Disagree

You might think setting the canonical to your main HTML page is enough protection. Not if the AMP version is served with a mismatched meta tag. In my case, the canonical was indexable, but the AMP had a stale noindex meta still cached. Google treated them as separate documents due to the tag mismatch—indexed AMP as null and canonical as undecided. In the words of the Indexing Status report: “Alternate page with proper canonical tag”, which is a black hole status, not a recovery position.

The fix: align all directive tags across variants. Same <title>, same <meta name="robots">, same canonical pointer. The only clue I got that this was working? An amphtml variant that had been reindexed appeared in Discover, which had dropped AMP entirely for months before that.
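Checking that alignment by eye across many pages is error-prone, so here is a small sketch that diffs the three signals between two HTML documents. The class and function names are hypothetical, and it assumes you have already fetched both the canonical and the AMP markup as strings (from origin, bypassing any cache).

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collects <title>, robots meta directives, and the canonical link."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.robots = set()
        self.canonical = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attr.get("name", "").lower() == "robots":
            parts = attr.get("content", "").split(",")
            self.robots |= {p.strip().lower() for p in parts if p.strip()}
        elif tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def head_signals(html: str) -> dict:
    parser = HeadSignals()
    parser.feed(html)
    return {"title": parser.title.strip(), "robots": parser.robots,
            "canonical": parser.canonical}


def mismatches(canonical_html: str, amp_html: str) -> dict:
    """Return {signal: (canonical_value, amp_value)} for every signal that differs."""
    a, b = head_signals(canonical_html), head_signals(amp_html)
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}
```

An empty dict from mismatches() means the title, robots directives, and canonical URL agree between the two variants. Any key in the result names a signal to fix before requesting indexing again.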
