
Who knew that a sitemap, the very thing designed to help Google find your pages, could itself go missing in action, or error out on you?
That’s precisely what happened to me while working on a WordPress sitemap recently.
What looked like a simple submission to Google Search Console turned into a game of “Guess what’s breaking the XML today.” Spoiler: it wasn’t Google’s fault, and it wasn’t Yoast SEO’s fault either. It was a few sneaky bytes standing in the way.

Why the Sitemap Matters

If you run a WordPress website, the sitemap is your calling card for search engines. It tells crawlers where to go, what to index, and what not to bother with. A proper WordPress sitemap helps with:

  • Faster discovery of new pages and posts
  • Better understanding of site structure
  • Easier debugging of indexing issues

Yoast SEO, Rank Math SEO, and other SEO plugins make generating a dynamic sitemap as simple as flipping a switch. Unless, of course, something else gets in the way.

When Google Says “Couldn’t Fetch”

The issue began when I submitted a sitemap in Google Search Console. Instead of the green “Success” label, I was greeted with a glaring “Couldn’t Fetch.” Not exactly the result you want to show a client.

A quick browser check revealed the problem:

error on line 1 at column 7: XML declaration allowed only at the start of the document

Translation: something was sneaking in before the <?xml…?> line. And XML is not forgiving: the declaration insists on being first in line.

When MU-Plugins Become MU-Heroes

At this point, it was time to roll up the sleeves and build a diagnostic plugin. A “sitemap sniffer,” if you will. By intercepting the sitemap response and logging the first bytes, I could see exactly what was being sent before <?xml.
And yes, it was the usual suspect: a hidden UTF-8 BOM or whitespace characters.
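
To make the "first bytes" idea concrete, here is a minimal sketch of the kind of check such a sniffer might perform. The function names are hypothetical and are not taken from the diagnostic plugin described here; the sketch only shows how a hex dump of the opening bytes reveals a BOM or stray whitespace.

```php
<?php
// Hypothetical helper names, for illustration only.

/**
 * Return the first $count bytes of $buffer as space-separated hex pairs.
 */
function sniff_first_bytes_hex( string $buffer, int $count = 8 ): string {
    $head = substr( $buffer, 0, $count );
    if ( $head === '' || $head === false ) {
        return '';
    }
    return implode( ' ', str_split( bin2hex( $head ), 2 ) );
}

/**
 * True when the buffer begins with the three-byte UTF-8 BOM (EF BB BF).
 */
function starts_with_utf8_bom( string $buffer ): bool {
    return strncmp( $buffer, "\xEF\xBB\xBF", 3 ) === 0;
}

// A sitemap response polluted by a BOM and a newline.
$sample = "\xEF\xBB\xBF\n<?xml version=\"1.0\"?>";
echo sniff_first_bytes_hex( $sample ), PHP_EOL; // prints: ef bb bf 0a 3c 3f 78 6d
```

A clean response would instead begin with 3c 3f 78 6d 6c, the bytes for <?xml.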

The Sitemap Stripper to the Rescue

The solution? Create a small MU-plugin that strips away any BOM or whitespace before the XML prolog. Here’s the trick in plain English:

  1. Detect sitemap requests.
  2. Buffer the output before it’s sent to the browser.
  3. Trim any BOM, spaces, or rogue characters.
  4. Return a squeaky-clean XML starting right at <?xml version="1.0"…?>.

Here’s the code I use. Feel free to drop it into a file in the /wp-content/mu-plugins folder:

<?php
/**
 * Plugin Name: WPSA Param Canonical Redirect
 * Description: Captures selected query params (currency, wpm-testimonial) and redirects to the clean URL. Allows ?perfmatters for admins using Script Manager.
 * Version: 1.3.0
 * Author: WPservice.pro
 */

if ( ! defined( 'ABSPATH' ) ) {
    exit;
}

add_action( 'template_redirect', function () {

    // Front-end only.
    if ( is_admin() || wp_doing_ajax() || ( defined( 'REST_REQUEST' ) && REST_REQUEST ) ) {
        return;
    }

    // If Perfmatters Script Manager is requested, and the user is an admin, do nothing.
    // Perfmatters shows its UI only for capable users, so we allow it for manage_options.
    if ( array_key_exists( 'perfmatters', $_GET ) && is_user_logged_in() && current_user_can( 'manage_options' ) ) {
        return; // keep ?perfmatters intact for admins
    }

    // Params we want to capture & then remove from public URLs.
    // key => [cookie, max_len, regex_allowed]
    $params = array(
        'currency'        => array( 'cookie' => 'wpsa_currency',        'len' => 10, 'pattern' => '/[^A-Za-z0-9_-]/' ),
        'wpm-testimonial' => array( 'cookie' => 'wpsa_wpm_testimonial', 'len' => 64, 'pattern' => '/[^A-Za-z0-9_-]/' ),
    );

    // If neither supported param is present (and we didn't early-return above), nothing to do.
    $has_supported = false;
    foreach ( $params as $key => $rules ) {
        if ( isset( $_GET[ $key ] ) && $_GET[ $key ] !== '' ) {
            $has_supported = true;
            break;
        }
    }
    if ( ! $has_supported ) {
        // If ?perfmatters is present for a non-admin (bot/visitor), strip it too.
        if ( array_key_exists( 'perfmatters', $_GET ) ) {
            $redirect_url = remove_query_arg( 'perfmatters' );
            if ( $redirect_url && $redirect_url !== ( ( is_ssl() ? 'https://' : 'http://' ) . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'] ) ) {
                wp_safe_redirect( $redirect_url, 302 );
                exit;
            }
        }
        return;
    }

    // Store present params into cookies, then remove them from the URL.
    foreach ( $params as $key => $rules ) {
        if ( empty( $_GET[ $key ] ) ) {
            continue;
        }

        $raw = $_GET[ $key ];
        if ( is_array( $raw ) ) {
            $raw = reset( $raw );
        }

        $value = substr( preg_replace( $rules['pattern'], '', (string) $raw ), 0, (int) $rules['len'] );

        if ( $value !== '' ) {
            // 30 days; secure/httponly per scheme.
            setcookie(
                $rules['cookie'],
                $value,
                time() + 30 * DAY_IN_SECONDS,
                COOKIEPATH ?: '/',
                COOKIE_DOMAIN,
                is_ssl(),
                true
            );
            $_COOKIE[ $rules['cookie'] ] = $value;
        }
    }

    // Build redirect target: strip our params; if non-admin had ?perfmatters, strip that too.
    $strip_keys = array_keys( $params );
    if ( array_key_exists( 'perfmatters', $_GET ) && ! ( is_user_logged_in() && current_user_can( 'manage_options' ) ) ) {
        $strip_keys[] = 'perfmatters';
    }

    $redirect_url = remove_query_arg( $strip_keys );
    if ( ! $redirect_url ) {
        return;
    }

    $current_url = ( is_ssl() ? 'https://' : 'http://' ) . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];
    if ( $redirect_url === $current_url ) {
        return;
    }

    wp_safe_redirect( $redirect_url, 302 );
    exit;

}, 1 );

With the stripper in place, the sitemap finally displayed correctly, and Google could fetch it without complaints. The site went from “Oops, couldn’t fetch” to “Success.”

How the Stripper Works

Runs very early
It hooks into muplugins_loaded, which means it runs before regular plugins are loaded and before Yoast actually outputs the sitemap.

Detects sitemap requests
It checks if the current URL contains “sitemap” or the Yoast sitemap query params. If the request is not for a sitemap, it does nothing.

$is_sitemap = (stripos($uri, 'sitemap') !== false) || isset($_GET['sitemap']) || isset($_GET['yoast-sitemap-index']);
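
As a small sketch, the same check can be wrapped in a function that takes the URI and query array as arguments, which makes it easy to try outside WordPress. The function name is an assumption for illustration; the conditions are the ones from the line above.

```php
<?php
/**
 * Hypothetical wrapper around the detection line above.
 *
 * @param string $uri   Request URI, e.g. $_SERVER['REQUEST_URI'].
 * @param array  $query Query parameters, e.g. $_GET.
 */
function looks_like_sitemap_request( string $uri, array $query ): bool {
    return stripos( $uri, 'sitemap' ) !== false
        || isset( $query['sitemap'] )
        || isset( $query['yoast-sitemap-index'] );
}

// Typical call inside a plugin:
// $is_sitemap = looks_like_sitemap_request( $_SERVER['REQUEST_URI'] ?? '', $_GET );
```

Note that a plain substring match is deliberately loose: a post with "sitemap" in its slug would also match. If that matters on a given site, a stricter pattern would be a reasonable refinement.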

Starts output buffering
ob_start() intercepts everything that WordPress/Yoast/other plugins would normally send to the browser.

Inside the buffer callback:

  • Logs the first 8 bytes of output to a file (sitemap-first-bytes.hex). This is so we can confirm if it’s a BOM.
  • Cleans the output:
    • ltrim($buffer, "\xEF\xBB\xBF \t\n\r\0\x0B"); removes:
      • UTF-8 BOM (\xEF\xBB\xBF)
      • Spaces, tabs, newlines, etc.
    • If after trimming, the buffer still doesn’t start with <?xml, it tries to find the first real <?xml tag and cuts everything before it.
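
The cleaning step described in this list can be sketched as a pure function. The function name is hypothetical, and returning the original buffer when no <?xml is found is an assumption made here so that non-XML responses pass through unchanged.

```php
<?php
/**
 * Remove a BOM, whitespace, or any other bytes that precede the XML prolog.
 * Hypothetical helper: a sketch of the steps listed above.
 */
function strip_before_xml_prolog( string $buffer ): string {
    // Same character list as the ltrim() call above: BOM bytes plus ASCII whitespace.
    $clean = ltrim( $buffer, "\xEF\xBB\xBF \t\n\r\0\x0B" );

    // Common case: trimming was enough.
    if ( strncmp( $clean, '<?xml', 5 ) === 0 ) {
        return $clean;
    }

    // Fallback: cut everything before the first <?xml, if there is one.
    $pos = strpos( $clean, '<?xml' );

    // Assumption: with no prolog at all, leave the response untouched.
    return $pos === false ? $buffer : substr( $clean, $pos );
}
```

Because it takes and returns a string, the function can be passed directly as an output-buffer callback or exercised on its own with sample input.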

Returns the cleaned buffer
The browser finally sees clean XML starting exactly at <?xml …?>.
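
The listing earlier in this article carries the header "WPSA Param Canonical Redirect" and deals with query parameters, so here is a minimal sketch of how the steps described above could be wired together. It is not the author's original plugin: the function names are hypothetical, the logging step is left out, and the cleaning is reduced to cutting everything before the first <?xml.

```php
<?php
/**
 * Plugin Name: Sitemap Prolog Stripper (sketch)
 * Description: Illustrative only. Buffers sitemap responses and drops bytes before the XML prolog.
 */

/**
 * Output-buffer callback: return the buffer starting at the first <?xml.
 * Responses without a prolog are returned unchanged (an assumption of this sketch).
 */
function sketch_clean_sitemap_buffer( string $buffer ): string {
    $pos = strpos( $buffer, '<?xml' );
    return $pos === false ? $buffer : substr( $buffer, $pos );
}

/**
 * Start buffering only for requests that look like sitemap requests.
 */
function sketch_maybe_buffer_sitemap(): void {
    $uri = $_SERVER['REQUEST_URI'] ?? '';

    $is_sitemap = ( stripos( $uri, 'sitemap' ) !== false )
        || isset( $_GET['sitemap'] )
        || isset( $_GET['yoast-sitemap-index'] );

    if ( $is_sitemap ) {
        // PHP calls the callback with the full buffer when it is flushed at the end of the request.
        ob_start( 'sketch_clean_sitemap_buffer' );
    }
}

// Register the hook only when WordPress is loaded, so the file is harmless elsewhere.
if ( function_exists( 'add_action' ) ) {
    add_action( 'muplugins_loaded', 'sketch_maybe_buffer_sitemap' );
}
```

One limitation worth keeping in mind: the buffer only captures output produced after it starts. Bytes emitted earlier in the request, for example by a file that is loaded before MU-plugins, would not pass through the callback.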

Why It Works

Without the stripper, your sitemap response begins with hidden garbage (BOM/space).
The stripper catches the entire sitemap output just before it leaves PHP and forcefully:

  • Removes the BOM if present
  • Removes accidental whitespace
  • Ensures the very first thing sent is <?xml …?>

That’s why the browser now sees a valid XML prolog and stops complaining.

Important Notes

It’s a band-aid (an elegant one), not the cure. The real fix is to find the file, plugin, or theme that’s injecting the BOM or whitespace. As this project was time-limited, I was only able to rule out the theme and caching, not the plugins (there were 50+ of those).
That’s something left for another time. Once the root cause is fixed, you can safely delete this MU-plugin.
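
For hunting down the root cause, one possible approach is to scan PHP files for a leading BOM or whitespace before the opening tag. The following is a sketch under that assumption; the function name is hypothetical, and it was not part of the project described here.

```php
<?php
/**
 * Scan a directory tree for PHP files that start with a UTF-8 BOM
 * or with whitespace before the opening <?php tag.
 *
 * @return array<string,string> Map of file path => reason.
 */
function find_files_with_leading_junk( string $dir ): array {
    $hits     = array();
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator( $dir, FilesystemIterator::SKIP_DOTS )
    );

    foreach ( $iterator as $file ) {
        if ( ! $file->isFile() || strtolower( $file->getExtension() ) !== 'php' ) {
            continue;
        }

        // Only the first few bytes matter for this check.
        $head = (string) file_get_contents( $file->getPathname(), false, null, 0, 16 );

        if ( strncmp( $head, "\xEF\xBB\xBF", 3 ) === 0 ) {
            $hits[ $file->getPathname() ] = 'UTF-8 BOM';
        } elseif ( preg_match( '/^\s+<\?php/', $head ) === 1 ) {
            $hits[ $file->getPathname() ] = 'whitespace before <?php';
        }
    }

    return $hits;
}

// Example (path is illustrative):
// print_r( find_files_with_leading_junk( '/path/to/wp-content/plugins' ) );
```

This only covers the start of each file. Whitespace after a closing ?> tag at the end of a file can cause the same symptom and would need a separate check.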

Why You Should Validate Your Sitemap

Even when your sitemap looks fine in the browser, it may still contain hidden problems like BOMs, whitespace, or invalid tags. That’s why it’s smart to run it through a validator.

One easy option is the XML Sitemap Validator. Simply paste your sitemap URL and it will point out if the XML is malformed. Think of it as a lie detector test for your sitemap—it doesn’t care if it “looks fine,” it checks whether it’s truly valid.

Validation saves you from endless guesswork, especially when Google Search Console is vague about why it rejected a sitemap.
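
If you prefer a quick local check, a rough sketch using PHP's bundled DOM extension can report whether a string is well-formed XML. The function name is hypothetical, the DOM extension is assumed to be installed, and the check covers well-formedness only, not the sitemap schema.

```php
<?php
/**
 * Return a list of parser errors for $xml; an empty list means it parsed as well-formed XML.
 * Assumes the DOM/libxml extension is available.
 */
function xml_well_formed_errors( string $xml ): array {
    if ( $xml === '' ) {
        return array( 'empty document' );
    }

    // Collect parser errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors( true );
    libxml_clear_errors();

    $doc = new DOMDocument();
    $ok  = $doc->loadXML( $xml );

    $messages = array();
    foreach ( libxml_get_errors() as $error ) {
        $messages[] = sprintf(
            'line %d, column %d: %s',
            $error->line,
            $error->column,
            trim( $error->message )
        );
    }

    // Restore the previous libxml error-handling state.
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    return $ok ? array() : $messages;
}

// A newline before the prolog is enough to make the document invalid.
print_r( xml_well_formed_errors( "\n<?xml version=\"1.0\"?><urlset/>" ) );
```

To test a live sitemap, the response body would first need to be fetched as raw bytes, since copying from a browser view can hide the very characters you are looking for.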

Lessons From the Field

The irony of fixing a sitemap is that the sitemap itself wasn’t broken—Yoast SEO had generated it just fine. The problem came from outside: a BOM, a stray newline, or an overzealous cache layer injecting noise before the XML.

This wasn’t about rewriting the sitemap; it was about teaching the server to shut up until the XML had its turn.

Best Practices for Keeping Your WordPress Sitemap Healthy

  • Save files without BOM: Always save PHP files as UTF-8 (without BOM). Most code editors have this option.
  • Watch your plugins: A single misplaced echo or stray whitespace in a plugin can break the sitemap.
  • Exclude sitemaps from caching: Add /sitemap_index.xml and related patterns to WP Rocket, Breeze, or Cloudflare exclusion lists.
  • Validate regularly: Use a validator like XML Sitemap Validator to confirm your sitemap is error-free.
  • Keep a backup plugin handy: A diagnostic MU-plugin can save hours of guesswork when debugging sitemap issues.

The Punchline

After hours of debugging, the sitemap is now clean, valid, and happily served to Google. The takeaway? Even when your sitemap is “generated by Yoast,” that doesn’t mean the delivery will be flawless. Sometimes it needs a bodyguard—a tiny MU-plugin that ensures nothing photobombs your XML.

And yes, Google finally accepted the sitemap. Because in SEO, as in comedy, timing is everything—and the sitemap prolog has to come first.

Key Takeaway

A sitemap should never start with surprises. If Google can’t fetch your WordPress sitemap, don’t just blame Yoast—check for BOMs, whitespace, and caching layers interfering with XML output. A clean start is the only start that works.

FAQ

What is a sitemap in WordPress?

A sitemap is an XML file that lists important URLs of your site, helping search engines crawl and index them efficiently.

Why did my sitemap show “XML declaration allowed only at the start of the document”?

This happens when extra characters (like a BOM or whitespace) appear before the <?xml line, breaking the file’s validity.

Does caching affect WordPress sitemaps?

Yes. Some caching or minification plugins can alter sitemap output. Always exclude /sitemap_index.xml and related files from caching.

Should I use Yoast SEO or another plugin for sitemaps?

Yoast SEO and Rank Math SEO both work fine in most cases. The issue is rarely with the SEO plugin itself; it’s usually server-level output, encoding, or a plugin conflict that corrupts the sitemap.

Disclaimer: This is NOT a paid article — no one paid me for it. However, this article may contain affiliate links that help WPservice.pro, and you may get a discount.