{"id":8131,"date":"2021-11-17T11:09:12","date_gmt":"2021-11-17T08:09:12","guid":{"rendered":"https:\/\/www.evenzia.com\/?p=8131"},"modified":"2023-07-26T17:23:06","modified_gmt":"2023-07-26T14:23:06","slug":"why-getting-indexed-by-google-is-so-difficult","status":"publish","type":"post","link":"https:\/\/www.evenzia.com\/fr\/why-getting-indexed-by-google-is-so-difficult\/","title":{"rendered":"Why Getting Indexed by Google is so Difficult"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"8131\" class=\"elementor elementor-8131\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-35696de4 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"35696de4\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-extended\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-65acadad sc_inner_width_none sc_layouts_column_icons_position_left\" data-id=\"65acadad\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-75bc8b4c sc_fly_static elementor-widget elementor-widget-text-editor\" data-id=\"75bc8b4c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\n<p>Every website relies on Google to some extent. It\u2019s simple: your pages get indexed by Google, which makes it possible for people to find you. 
That\u2019s the way things should go.<\/p>\n\n<p>However, that\u2019s not always the case.\u00a0<strong>Many pages\u00a0<\/strong><a href=\"https:\/\/developers.google.com\/search\/docs\/advanced\/crawling\/large-site-managing-crawl-budget\" target=\"_blank\" rel=\"noreferrer noopener nofollow\"><strong>never get indexed by Google<\/strong><\/a>.<\/p>\n\n<p>If you work with a website, especially a large one, you\u2019ve probably noticed that not every page on your website gets indexed, and many pages wait for weeks before Google picks them up.<\/p>\n\n<p>Various factors contribute to this issue, and many of them are the same factors that are mentioned with regard to ranking \u2014 content quality and links are two examples. Sometimes, these factors are also very complex and technical. Modern websites that rely heavily on new web technologies have notoriously\u00a0<a href=\"https:\/\/www.seroundtable.com\/google-issues-indexing-disqus-comments-29641.html\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">suffered from indexing issues in the past<\/a>, and some still do.<\/p>\n\n<p>Many SEOs still believe that it\u2019s the very technical things that prevent Google from indexing content, but this is a myth. While it\u2019s true that Google might not index your pages if you don\u2019t send consistent technical signals as to which pages you want indexed or if you have insufficient crawl budget, it\u2019s just as important that you\u2019re consistent with the quality of your content.<\/p>\n\n<p>Most websites, big or small, have lots of content that should be indexed \u2014 but isn\u2019t. And while things like JavaScript do make indexing more complicated, your website can suffer from serious indexing issues even if it\u2019s written in pure HTML. 
In this post, let\u2019s look at some of the most common issues and how to mitigate them.<\/p>\n\n<h2 class=\"wp-block-heading\">Reasons why Google isn\u2019t indexing your pages<\/h2>\n\n<p>Using a\u00a0<a href=\"https:\/\/www.ziptie.dev\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">custom indexing checker tool<\/a>, I checked a large sample of the most popular e-commerce stores in the US for indexing issues.\u00a0<strong>I discovered that, on average, 15% of their indexable product pages<\/strong>\u00a0cannot be found on Google.<\/p>\n\n<p>That result was extremely surprising. What I needed to know next was \u201cwhy\u201d: what are the most common reasons why Google decides not to index something that should technically be indexed?<\/p>\n\n<p>Google Search Console reports several statuses for unindexed pages, like \u201cCrawled &#8211; currently not indexed\u201d or \u201cDiscovered &#8211; currently not indexed\u201d. While this information doesn\u2019t tell you how to fix the issue, it\u2019s a good place to start your diagnosis.<\/p>\n\n<h3 class=\"wp-block-heading\">Top indexing issues<\/h3>\n\n<p><a href=\"https:\/\/www.searchenginejournal.com\/page-indexing-issues\/398606\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Based on a large sample of websites I collected<\/a>, the most common indexing issues reported by Google Search Console are:<\/p>\n\n<h4 class=\"wp-block-heading\">1. \u201cCrawled &#8211; currently not indexed\u201d<\/h4>\n\n<p>In this case, Google visited a page but didn\u2019t index it.<\/p>\n\n<p>Based on my experience, this is usually a content quality issue. Given the\u00a0<a href=\"https:\/\/news.un.org\/en\/story\/2021\/05\/1091182\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">e-commerce boom that\u2019s currently happening<\/a>, we can expect Google to get pickier when it comes to quality. 
So if you notice your pages are \u201cCrawled &#8211; currently not indexed\u201d, make sure the content on those pages is uniquely valuable:<\/p>\n\n<ul class=\"wp-block-list\">\n<li>Use unique titles, descriptions, and copy on all indexable pages.<\/li>\n<li>Avoid copying product descriptions from external sources.<\/li>\n<li>Use canonical tags to consolidate duplicate content.<\/li>\n<li>Block Google from crawling or indexing low-quality sections of your website by using the robots.txt file or the noindex tag.<\/li>\n<\/ul>\n\n<p>If you are interested in the topic, I recommend reading Chris Long\u2019s\u00a0<a href=\"https:\/\/moz.com\/blog\/crawled-currently-not-indexed-coverage-status\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Crawled \u2014 Currently Not Indexed: A Coverage Status Guide<\/a>.<\/p>\n\n<h4 class=\"wp-block-heading\">2. \u201cDiscovered &#8211; currently not indexed\u201d<\/h4>\n\n<p>This is my favorite issue to work with, because it can encompass everything from crawling issues to insufficient content quality. 
It\u2019s a massive problem, particularly in the case of large e-commerce stores, and I\u2019ve seen this apply to tens of millions of URLs on a single website.<\/p>\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/moz.com\/images\/blog\/pasted-image-0.png?w=1600&amp;auto=compress%2Cformat&amp;fit=crop&amp;fp-x=0.5&amp;fp-y=0.5&amp;dm=1637018297&amp;s=b74b0627be4fa11544160ace92eb572e\" alt=\"\" \/><\/figure>\n\n<p>Google may report that e-commerce product pages are \u201cDiscovered &#8211; currently not indexed\u201d because of:<\/p>\n\n<ul class=\"wp-block-list\">\n<li><strong>A crawl budget issue<\/strong>: there may be too many URLs in the crawling queue and these may be crawled and indexed later.<\/li>\n<li><strong>A quality issue<\/strong>: Google may think that some pages on that domain aren&rsquo;t worth crawling and decide not to visit them by looking for a pattern in their URL.<\/li>\n<\/ul>\n\n<p>Dealing with this problem takes some expertise. If you find out that your pages are \u201cDiscovered &#8211; currently not indexed\u201d, do the following:<\/p>\n\n<ol class=\"wp-block-list\">\n<li>Identify if there are patterns of pages falling into this category. Maybe the problem is related to a specific category of products and the whole category isn\u2019t linked internally? Or maybe a huge portion of product pages are waiting in the queue to get indexed?<\/li>\n<li>Optimize your crawl budget. Focus on spotting low-quality pages that Google spends a lot of time crawling. The usual suspects include filtered category pages and internal search pages \u2014 these pages can easily go into tens of millions on a typical e-commerce site. 
If Googlebot can freely crawl them, it may not have the resources left to crawl and index the valuable pages on your website.<\/li>\n<\/ol>\n\n<p>During the webinar\u00a0<a href=\"https:\/\/www.youtube.com\/watch?v=1AA9lc7KGJY&amp;t=2421s\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">\u201cRendering SEO\u201d<\/a>, Martin Splitt of Google gave us a few hints on fixing the \u201cDiscovered &#8211; currently not indexed\u201d issue. Check it out if you want to learn more.<\/p>\n\n<h4 class=\"wp-block-heading\">3. \u201cDuplicate content\u201d<\/h4>\n\n<p>This issue is extensively covered by the\u00a0<a href=\"https:\/\/moz.com\/learn\/seo\/duplicate-content\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Moz SEO Learning Center<\/a>.\u00a0I just want to point out here that duplicate content can have various causes, such as:<\/p>\n\n<ul class=\"wp-block-list\">\n<li>Language variations (e.g. English language in the UK, US, or Canada). If you have several versions of the same page that are targeted at different countries, some of these pages may end up unindexed.<\/li>\n<li>Duplicate content used by your competitors. This often occurs in the e-commerce industry when several websites use the same product description provided by the manufacturer.<\/li>\n<\/ul>\n\n<p>Besides using rel=canonical, 301 redirects, or creating unique content, I would focus on providing unique value for the users. Fast-growing-trees.com is a good example. Instead of boring descriptions and tips on planting and watering, the website allows you to see a detailed FAQ for many products.<\/p>\n\n<p>Also, you can easily compare similar products.<\/p>\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/moz.com\/images\/blog\/pasted-image-0-1.png?w=1468&amp;auto=compress%2Cformat&amp;fit=crop&amp;fp-x=0.5&amp;fp-y=0.5&amp;dm=1637018282&amp;s=6b63346b7cd698fd4f9a008cbbe25bdc\" alt=\"\" \/><\/figure>\n\n<p>For many products, it provides an FAQ. 
Also, every customer can ask a detailed question about a plant and get the answer from the community.<\/p>\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/moz.com\/images\/blog\/pasted-image-0-2.png?w=1600&amp;auto=compress%2Cformat&amp;fit=crop&amp;fp-x=0.5&amp;fp-y=0.5&amp;dm=1637018288&amp;s=7597fe867904526240a5d738f44cc59c\" alt=\"\" \/><\/figure>\n\n<h2 class=\"wp-block-heading\">How to check your website\u2019s index coverage<\/h2>\n\n<p>You can easily check how many pages of your website aren\u2019t indexed by opening the<strong>\u00a0Index Coverage report\u00a0<\/strong>in Google Search Console.<\/p>\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/moz.com\/images\/blog\/pasted-image-0-3.png?w=1128&amp;auto=compress%2Cformat&amp;fit=crop&amp;fp-x=0.5&amp;fp-y=0.5&amp;dm=1637018293&amp;s=d43a50dec15bcb8e9e003c979832c4f8\" alt=\"\" \/><\/figure>\n\n<p>The first thing you should look at here is the number of excluded pages. Then try to find a pattern \u2014 what types of pages don\u2019t get indexed?<\/p>\n\n<p>If you own an e-commerce store, you\u2019ll most probably see unindexed product pages. While this should always be a warning sign, you can\u2019t expect to have all of your product pages indexed, especially with a large website. For instance, a large e-commerce store is bound to have duplicate pages and expired or out-of-stock products. These pages may lack the quality that would put them at the front of Google&rsquo;s indexing queue (and that\u2019s if Google decides to crawl these pages in the first place).<\/p>\n\n<p>In addition, large e-commerce websites tend to have issues with\u00a0<a href=\"https:\/\/moz.com\/blog\/crawl-budget\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">crawl budget<\/a>. I\u2019ve seen cases of e-commerce stores having more than a million products while 90% of them were classified as \u201cDiscovered &#8211; currently not indexed\u201d. 
But if you see that important pages are being excluded from Google\u2019s index, you should be deeply concerned.<\/p>\n\n<h2 class=\"wp-block-heading\">How to increase the probability Google will index your pages<\/h2>\n\n<p>Every website is different and may suffer from different indexing issues. However, here are some best practices that should help your pages get indexed:<\/p>\n\n<p><strong>1. Avoid \u201csoft 404\u201d signals<\/strong><br \/>Make sure your pages don\u2019t contain anything that may falsely indicate a soft 404 status. This includes anything from using \u201cNot found\u201d or \u201cNot available\u201d in the copy to having the number \u201c404\u201d in the URL.<\/p>\n\n<p><strong>2. Use internal linking<\/strong><br \/>Internal linking is one of the key signals for Google that a given page is an important part of the website and deserves to be indexed. Leave no orphan pages in your website\u2019s structure, and remember to include all indexable pages in your sitemaps.<\/p>\n\n<p><strong>3. Implement a sound crawling strategy<\/strong><br \/>Don\u2019t let Google crawl cruft on your website. If too many resources are spent crawling the less valuable parts of your domain, it might take too long for Google to get to the good stuff. Server log analysis can give you the full picture of what Googlebot crawls and how to optimize it.<\/p>\n\n<p><strong>4. Eliminate low-quality and duplicate content<\/strong><br \/>Every large website eventually ends up with some pages that shouldn\u2019t be indexed. Make sure that these pages don\u2019t find their way into your sitemaps, and use the noindex tag and the robots.txt file when appropriate. If you let Google spend too much time in the worst parts of your site, it might underestimate the overall quality of your domain.<\/p>\n\n<p><strong>5. Send consistent SEO signals<\/strong><br \/>One common example of sending inconsistent SEO signals to Google is altering canonical tags with JavaScript. 
As\u00a0<a href=\"https:\/\/www.youtube.com\/watch?v=bAE3L1E1Fmk&amp;t=772s\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Martin Splitt of Google mentioned<\/a>\u00a0during JavaScript SEO Office Hours, you can never be sure what Google will do if you have one canonical tag in the source HTML, and a different one after rendering JavaScript.<\/p>\n\n<h2 class=\"wp-block-heading\">The web is getting too big<\/h2>\n\n<p>In the past couple of years, Google has made giant leaps in processing JavaScript, making the job of SEOs easier. These days, it\u2019s less common to see JavaScript-powered websites that aren\u2019t indexed because of the specific tech stack they\u2019re using.<\/p>\n\n<p>But can we expect the same to happen with the indexing issues that aren\u2019t related to JavaScript? I don\u2019t think so.<\/p>\n\n<p>The internet is constantly growing. Every day new websites appear, and existing websites grow.<\/p>\n\n<p>Can Google deal with this challenge?<\/p>\n\n<p>This question appears every once in a while. I like\u00a0quoting Google\u00a0here:<\/p>\n\n<p>\u201cGoogle has a finite number of resources, so when faced with the nearly infinite quantity of content that&rsquo;s available online, Googlebot is only able to find and crawl a percentage of that content. Then, of the content we&rsquo;ve crawled, we&rsquo;re only able to index a portion.\u201d<\/p>\n\n<p>To put it differently, Google is able to visit just a portion of all pages on the web and index an even smaller portion. And even if your website is amazing, you should keep that in mind.<\/p>\n\n<p>Google probably won\u2019t visit every page of your website, even if it\u2019s relatively small. 
Your job is to make sure that Google can discover and index pages that are essential for your business.<\/p>\n\n<p>Source: https:\/\/moz.com\/blog\/why-getting-indexed-is-difficult<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Every website relies on Google to some extent. It\u2019s simple: your pages get indexed by Google, which makes it possible for people to&hellip;<\/p>\n","protected":false},"author":1,"featured_media":8132,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-8131","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"_links":{"self":[{"href":"https:\/\/www.evenzia.com\/fr\/wp-json\/wp\/v2\/posts\/8131","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.evenzia.com\/fr\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.evenzia.com\/fr\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.evenzia.com\/fr\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.evenzia.com\/fr\/wp-json\/wp\/v2\/comments?post=8131"}],"version-history":[{"count":4,"href":"https:\/\/www.evenzia.com\/fr\/wp-json\/wp\/v2\/posts\/8131\/revisions"}],"predecessor-version":[{"id":8422,"href":"https:\/\/www.evenzia.com\/fr\/wp-json\/wp\/v2\/posts\/8131\/revisions\/8422"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.evenzia.com\/fr\/wp-json\/wp\/v2\/media\/8132"}],"wp:attachment":[{"href":"https:\/\/www.evenzia.com\/fr\/wp-json\/wp\/v2\/media?parent=8131"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.evenzia.com\/fr\/wp-json\/wp\/v2\/categories?post=8131"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.evenz
ia.com\/fr\/wp-json\/wp\/v2\/tags?post=8131"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}