SEO.co

Sitemap Validator

Fetch any XML sitemap and check structure, URL count, lastmod data, and common issues. Supports both URL-set and sitemap-index formats.

// your details

Free to use. Your result and contact details are emailed to our team.

// what we check

Why sitemap hygiene matters.

A clean XML sitemap is how you tell search engines which URLs you want crawled and when they were last updated. A malformed sitemap can silently hurt crawl efficiency and let stale URLs linger in the index. Search engines treat the file as a trust signal: when the URLs it lists are canonical, indexable, and accurately dated, crawlers spend their budget on the pages you actually care about.

01

Format integrity

Valid XML, correct namespaces, well-formed <url> and <loc> elements.

02

URL count limits

Google caps individual sitemaps at 50,000 URLs or 50MB uncompressed, whichever comes first. We flag over-limit files.

03

Lastmod accuracy

Search engines can use lastmod to prioritize recrawls, but only when the values are consistently accurate. Stale or missing values weaken that signal.

04

Sitemap index

Index files split large sitemaps. We resolve children and surface counts.

// how it works

From URL to verdict in four steps.

The validator mirrors the path a search engine takes when it discovers your sitemap — so the issues it surfaces are the same ones a crawler would hit.

01

Fetch

We request your sitemap URL with a real crawler user-agent and follow up to one redirect. Googlebot itself tolerates longer redirect chains, but a sitemap URL that redirects at all is worth fixing.

02

Parse

The XML is parsed against the sitemaps.org 0.9 schema — namespaces, encoding, and well-formedness all verified.
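A minimal sketch of the parse step, using Python's standard library rather than full schema validation. The function name `classify_sitemap` is hypothetical and is not part of the validator; it only checks well-formedness, the namespace, and the root element.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def classify_sitemap(xml_bytes: bytes) -> str:
    """Return 'urlset' or 'sitemapindex'; raise ValueError on structural problems."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        # Unescaped ampersands and stray characters end up here.
        raise ValueError(f"not well-formed XML: {exc}") from exc

    # ElementTree renders a namespaced tag as "{namespace}localname".
    if not root.tag.startswith("{"):
        raise ValueError("root element has no namespace declaration")
    namespace, _, local_name = root.tag[1:].partition("}")
    if namespace != SITEMAP_NS:
        raise ValueError(f"unexpected namespace: {namespace}")
    if local_name not in ("urlset", "sitemapindex"):
        raise ValueError(f"unexpected root element: {local_name}")
    return local_name
```

Passing bytes rather than text lets the parser honor the encoding declared in the file.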

03

Resolve

If it's a sitemap-index, we walk each child <sitemap> reference and tally the URLs they point to.
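The resolve step could look roughly like the sketch below. It assumes a caller-supplied `fetch` function (hypothetical) that returns the raw bytes of a child sitemap; in practice that would be an HTTP request, but keeping it injectable makes the logic easy to try offline.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def tally_index(index_xml: bytes, fetch: Callable[[str], bytes]) -> Dict[str, int]:
    """Map each child sitemap URL in an index to the number of <url> entries it holds."""
    root = ET.fromstring(index_xml)
    counts: Dict[str, int] = {}
    for loc in root.findall("sm:sitemap/sm:loc", NS):
        child_url = (loc.text or "").strip()
        child_root = ET.fromstring(fetch(child_url))
        counts[child_url] = len(child_root.findall("sm:url", NS))
    return counts
```

Summing the values of the returned dictionary gives the total URL count across the index.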

04

Report

You get URL counts, lastmod coverage, size against Google's limits, and a list of structural issues.
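As an illustration of what such a report might compute, here is a small sketch for a URL-set file. The `summarize` function and its field names are hypothetical, and 50MB is read as 52,428,800 bytes, the figure the sitemaps.org protocol gives.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS = 50_000
MAX_BYTES = 52_428_800  # 50MB, uncompressed


def summarize(xml_bytes: bytes) -> dict:
    """Summarize URL count, lastmod coverage, and size for a <urlset> sitemap."""
    root = ET.fromstring(xml_bytes)
    urls = root.findall("sm:url", NS)
    # Count entries whose <lastmod> exists and is not empty.
    with_lastmod = sum(
        1 for u in urls
        if (u.findtext("sm:lastmod", default="", namespaces=NS) or "").strip()
    )
    return {
        "url_count": len(urls),
        "lastmod_coverage": with_lastmod / len(urls) if urls else 0.0,
        "size_bytes": len(xml_bytes),
        "over_url_limit": len(urls) > MAX_URLS,
        "over_size_limit": len(xml_bytes) > MAX_BYTES,
    }
```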

// common errors

Six sitemap errors that quietly cost crawls.

These are the issues we see most often in real-world sitemaps — and exactly how to fix each one.

01

Over 50,000 URLs

cause

A single sitemap exceeds Google's hard cap of 50,000 URLs or 50MB uncompressed.

fix

Split into multiple sitemaps and reference them from a sitemap-index file.
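One way to sketch the split, assuming you already have the full URL list in memory. The helper names `chunk_urls` and `build_index` are illustrative, and the child sitemap URLs passed to `build_index` are whatever locations you choose to publish the chunks at.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000


def chunk_urls(urls: List[str], size: int = MAX_URLS) -> List[List[str]]:
    """Split a URL list into consecutive chunks of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(child_sitemap_urls: Iterable[str]) -> bytes:
    """Serialize a sitemap-index file that references the given child sitemaps."""
    # Emit the sitemap namespace as the default xmlns instead of an ns0: prefix.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for child in child_sitemap_urls:
        entry = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = child
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

With 120,001 URLs, `chunk_urls` yields three chunks, so the index would reference three child files.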

02

Bad XML / encoding

cause

Unescaped ampersands, stray characters, or a missing namespace declaration break parsing.

fix

Escape special characters as entities (& as &amp;, < as &lt;), declare the xmlns namespace, and serve the file as UTF-8.
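If you assemble sitemap XML by hand, a sketch like the following shows the escaping step. The `loc_element` helper is hypothetical; `escape` comes from Python's standard library and handles `&`, `<`, and `>` by default, with the extra mapping covering quotes.

```python
from xml.sax.saxutils import escape

# Quotes are not escaped by default, so they are mapped explicitly.
QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def loc_element(url: str) -> str:
    """Wrap a URL in a <loc> element with XML special characters escaped."""
    return "<loc>" + escape(url, QUOTE_ENTITIES) + "</loc>"


print(loc_element("https://example.com/search?q=shoes&page=2"))
# prints <loc>https://example.com/search?q=shoes&amp;page=2</loc>
```

Generating the file with an XML library instead of string concatenation avoids this class of error entirely, since the serializer escapes text for you.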

03

Non-canonical URLs

cause

The sitemap lists http URLs when the site serves https, mixes www and non-www hosts, or includes URLs that return a 301 or 404.

fix

List only canonical, 200-status URLs that exactly match your preferred host.
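The host and scheme part of this check can be sketched offline, as below. The function name and the default host `www.example.com` are placeholders; confirming that each URL actually returns a 200 needs live requests and is left out here.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def off_canonical(urls: Iterable[str], scheme: str = "https",
                  host: str = "www.example.com") -> List[str]:
    """Return the URLs whose scheme or host differs from the preferred ones."""
    mismatched = []
    for url in urls:
        parts = urlsplit(url)
        # hostname is lowercased by urlsplit, so the comparison is case-insensitive.
        if parts.scheme != scheme or (parts.hostname or "") != host:
            mismatched.append(url)
    return mismatched
```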

04

Stale or missing lastmod

cause

lastmod values never change or are absent, so crawlers can't prioritize fresh pages.

fix

Emit an accurate lastmod that reflects real content changes — not the build time.
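One possible way to keep lastmod tied to content rather than build time is to store a content hash next to each URL and only move the date when the hash changes. The sketch below assumes you persist the previous hash and lastmod somewhere; `next_lastmod` is a hypothetical helper, not something the validator provides.

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional, Tuple


def next_lastmod(content: bytes, prev_hash: Optional[str],
                 prev_lastmod: Optional[str],
                 now: Optional[datetime] = None) -> Tuple[str, str]:
    """Return (content_hash, lastmod), reusing the old lastmod if content is unchanged."""
    digest = hashlib.sha256(content).hexdigest()
    if digest == prev_hash and prev_lastmod:
        # Same content as last build: keep the earlier date.
        return digest, prev_lastmod
    now = now or datetime.now(timezone.utc)
    # ISO 8601 with an explicit offset, e.g. 2024-05-01T12:00:00+00:00.
    return digest, now.isoformat(timespec="seconds")
```

Hashing the rendered main content, rather than the whole page with its navigation and footer, keeps template tweaks from bumping every date at once.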

05

Noindex / blocked URLs

cause

The sitemap includes URLs that are noindexed or disallowed in robots.txt, which sends search engines mixed signals.

fix

Only submit indexable URLs. Remove anything you don't want in the index.

06

Not referenced anywhere

cause

The sitemap exists but isn't referenced in robots.txt or submitted in Search Console.

fix

Add a Sitemap: line to robots.txt and submit the file in Google Search Console.
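To confirm the robots.txt reference is in place, a quick check along these lines may help. It uses Python's standard `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later); the wrapper name `sitemaps_in_robots` is illustrative.

```python
from typing import List
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt: str) -> List[str]:
    """Return the URLs listed on Sitemap: lines in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present.
    return parser.site_maps() or []


example = """User-agent: *
Disallow: /admin/
Sitemap: https://www.example.com/sitemap.xml
"""
print(sitemaps_in_robots(example))
# prints ['https://www.example.com/sitemap.xml']
```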

// checklist

The clean-sitemap checklist.

  • One canonical version per URL — no http/https or www duplicates
  • Every listed URL returns 200 and is indexable (no noindex, no robots block)
  • Accurate <lastmod> dates that track real content changes
  • Under 50,000 URLs and 50MB per file; use an index above that
  • Referenced in robots.txt and submitted in Google Search Console
  • Gzip-compressed for large files to cut crawl bandwidth
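
For the last checklist item, compression can be sketched with the standard library. The size limit applies to the uncompressed file, so the hypothetical `gzip_sitemap` helper below checks the size before compressing; the `max_bytes` parameter is an illustrative name.

```python
import gzip

MAX_BYTES = 52_428_800  # 50MB, measured on the uncompressed file


def gzip_sitemap(xml_bytes: bytes, max_bytes: int = MAX_BYTES) -> bytes:
    """Gzip a sitemap after checking its uncompressed size against the limit."""
    if len(xml_bytes) > max_bytes:
        raise ValueError(
            f"sitemap is {len(xml_bytes)} bytes uncompressed; limit is {max_bytes}"
        )
    return gzip.compress(xml_bytes)
```

The result would typically be published as sitemap.xml.gz and referenced by that name.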

// faq

Sitemap questions, answered.

What makes an XML sitemap valid?
A valid sitemap is well-formed XML that declares the sitemaps.org namespace, lists each URL inside a <url><loc> element, stays under 50,000 URLs and 50MB uncompressed, and contains only canonical, indexable URLs. Optional <lastmod>, <changefreq>, and <priority> tags must use the correct formats. Our validator checks each of these.
What's the difference between a sitemap and a sitemap index?
A regular sitemap lists page URLs directly. A sitemap index is a parent file that lists other sitemap files — used when you have more than 50,000 URLs or want to organize sitemaps by section (blog, products, images). Our tool detects which type you submitted and, for an index, resolves the child sitemaps and totals their URLs.
How many URLs can one sitemap contain?
Google and Bing both cap a single sitemap at 50,000 URLs or 50MB uncompressed, whichever comes first. If you exceed either limit, split the URLs across multiple sitemaps and reference them from a sitemap-index file. An index file has limits of its own: under the sitemaps.org protocol it can list up to 50,000 sitemaps and must also stay under 50MB uncompressed. Very large sites can publish more than one index.
Does a valid sitemap guarantee my pages get indexed?
No. A sitemap is a strong hint about which URLs you want crawled and how fresh they are, but Google decides what to index based on quality, crawl budget, and dozens of other signals. A clean sitemap improves crawl efficiency and discovery — it doesn't force indexation.
Should I include noindexed or redirected URLs in my sitemap?
No. A sitemap should list only canonical URLs that return a 200 status and are meant to be indexed. Including noindexed pages, redirects, or 404s sends mixed signals, wastes crawl budget, and can lower Google's trust in the file. Keep it clean.
Where should my sitemap live and how do I submit it?
Host it at a stable URL (commonly /sitemap.xml), add a `Sitemap:` line pointing to it in your robots.txt, and submit it in Google Search Console under Sitemaps. Bing Webmaster Tools accepts it the same way. Once submitted, both engines re-fetch it periodically.
Want a full crawl + index audit?
Our paid SEO audit checks sitemap-to-index coverage, robots.txt alignment, and 200+ ranking factors.
See audit services