{"id":"qla67sablodqcv8","title":"Google Search Console Crawled Not Indexed — Debug Checklist","slug":"gsc-crawled-not-indexed-debug-checklist","summary":"When Google Search Console reports pages as 'crawled not indexed', it means Google saw them but decided not to put them in the search results. I've built a practical checklist to debug why your content isn't making it into the index.","imageUrl":"https://briancrabtree.me/images/journal-gsc-crawled-not-indexed-debug-checklist.webp","category":"Web Development","date":"2026-06-18T18:00:00.000Z","featured":false,"likes":1,"author":"brian@briancrabtree.me","content":"<h2>Understanding Crawled Not Indexed</h2>\n\n<p>When Google Search Console flags pages as \"crawled not indexed,\" it's a specific signal. It means Googlebot visited the page and processed it, but chose not to include it in the search index. This isn't a crawl error; it's an indexing decision by Google's algorithms. I've put together a checklist for fixing crawled not indexed pages systematically.</p>\n\n<p>This status indicates that the page wasn't blocked by robots.txt and didn't return a 4xx or 5xx error. Google simply judged it not valuable or relevant enough to index for search queries. It's a frustrating status because the page is accessible, but effectively invisible. I’ve seen this many times.</p>\n\n<p>The goal is to understand Google's quality and relevance thresholds. We need to identify why a page, despite being crawled, doesn't meet those standards. This checklist covers the common culprits I encounter when debugging these situations for clients and my own sites.</p>\n\n<h2>Server Response and Site Health</h2>\n\n<p>First, confirm your server consistently delivers a 200 OK status code. If Googlebot experiences intermittent 5xx errors or slow response times, it may decide not to index the page. Check your server logs and GSC's \"Crawl Stats\" report for any red flags.</p>\n\n<p>Ensure your DNS records are stable and resolving correctly. 
Any issues here can make your site appear unreliable and delay indexing. A CDN can help with global availability and speed, but it won't fix a fundamentally flaky origin server. I monitor uptime aggressively for client sites.</p>\n\n<p>Verify the page loads quickly and completely for Googlebot. Use GSC's \"URL Inspection\" tool to \"Test Live URL\" and \"View crawled page.\" Look for any resource loading issues or render problems that might make the page appear broken or incomplete to Google.</p>\n\n<h2>Content Quality and User Value</h2>\n\n<p>Thin content is a common reason for non-indexing. If a page offers minimal unique information or simply rehashes content found elsewhere, Google is less likely to index it. This often happens with category pages, filtered results, or boilerplate legal text. Aim for substance that solves a user's problem.</p>\n\n<p>Duplicate content also triggers this. If an identical or near-identical page exists elsewhere on your site or another site, Google might choose not to index both. Use canonical tags strategically, but also consider consolidating or improving content to make each page distinct and valuable. I find myself cutting low-value pages more often than adding them.</p>\n\n<p>Beyond unique words, consider the actual user value. Does the page answer a query thoroughly? Does it provide a good user experience? Google prioritizes content that users will find genuinely helpful and engaging. If a human wouldn't bother with it, Google probably won't either.</p>\n\n<h2>Robots and Indexing Directives</h2>\n\n<p>A noindex directive, either as a meta tag in the HTML &lt;head&gt; or as an HTTP header, will prevent indexing. Double-check your page source for &lt;meta name=\"robots\" content=\"noindex\"&gt; and your response headers for X-Robots-Tag. It’s easy to add one during development and then forget to remove it.</p>\n\n<p>Confirm your robots.txt file isn't accidentally blocking Googlebot from crawling the page or its critical resources. 
A Disallow directive for a specific path or the entire site can prevent Google from even seeing the content. A fully blocked page is normally reported as \"blocked by robots.txt\", but a partial block, for example one that only covers the CSS or JavaScript the page needs to render, can leave Google with an incomplete page that ends up \"crawled not indexed\" instead. I've spent hours hunting down forgotten disallows.</p>\n\n<p>Canonical tags can also influence indexing. While not a direct block, if a canonical points to another URL, Google will likely index that URL instead of your page. Ensure your canonicals accurately reflect your preferred version, or remove them if no canonicalization is needed.</p>\n\n<h2>Sitemaps and Internal Linking</h2>\n\n<p>A well-structured XML sitemap helps Google discover your pages, but it doesn't guarantee indexing. Submit your sitemap in GSC and check its status for errors. Ensure all important URLs are included and that the sitemap is kept up-to-date, especially for dynamic sites.</p>\n\n<p>Strong internal linking is crucial for discoverability and conveying page importance. If a page has few or no internal links pointing to it, Google might de-prioritize it for indexing. Relevant, descriptive anchor text on internal links helps Google understand the context and topic of the destination page. For example, a reference to an earlier post like /journal/semantic-html-landmarks-beyond-divs/ signals its relevance.</p>\n\n<p>I make sure that key content is no more than a few clicks from the homepage or main navigation. Pages buried deep in the site structure without strong internal links are less likely to be crawled frequently and indexed. Think of internal links as votes for page importance.</p>\n\n<h2>Page Performance and User Experience</h2>\n\n<p>Google's ranking systems take Core Web Vitals into account, and poor performance may also affect indexing decisions. Pages with slow loading times, high layout shifts, or unresponsive interactions might be de-prioritized. 
Use GSC's Core Web Vitals report to identify problem areas.</p>\n\n<p>A frustrating user experience, even if not directly a Core Web Vitals issue, can indirectly affect indexing. If users quickly bounce back to search results, Google may read that as a sign the page didn't help. Ensure your site is mobile-friendly and accessible, and that it provides a clear, intuitive journey for visitors.</p>\n\n<p>I view performance as a foundational element, not an optimization luxury. A fast, stable site tells Google that your content is professionally maintained and worth surfacing to users. Sluggish sites suggest a lack of care, which Google may interpret as lower quality.</p>\n\n<h2>What I do next when pages stall</h2>\n\n<p>Once you've gone through this checklist and made changes, don't just wait. Use Google Search Console's \"URL Inspection\" tool to request re-indexing for the affected pages. This signals to Google that you've made updates and want it to re-evaluate them. Be patient; indexing takes time.</p>\n\n<p>If the problem persists for specific types of pages, consider whether they truly need to be indexed. Sometimes, the most practical fix is to noindex low-value pages that Google consistently ignores. Focus your efforts on high-value content that genuinely serves your audience. This can improve your site's overall quality signal to Google.</p>\n\n<p>For a deep dive into specific indexing challenges, especially with client-side rendered applications, I've written a detailed guide on how I fix crawled not indexed issues in React SPAs. You can find that at /journal/react-spa-crawled-not-indexed-fix/. It lays out a comprehensive strategy for dynamic content.</p>","tags":["gsc","indexing","seo"],"views":1}