Google’s Advice on Fixing Unwanted Indexed URLs

An SEO posted details about a site audit in which he critiqued the use of a rel=canonical for controlling what pages are indexed on a site. The SEO proposed using noindex to get the pages dropped from Google’s index and then adding the individual URLs to robots.txt. Google’s John Mueller suggested a solution that goes in a different direction.

Site Audit Reveals Indexed Add To Cart URLs

An SEO audit uncovered that over half of the client’s 1.43k indexed pages were paginated and “add to shopping cart” URLs (the kind that carry a query string after a question mark). Google had ignored the rel=canonical link attributes and indexed the pages anyway, which illustrates that rel=canonical is a hint and not a directive. “Paginated” in this case means the dynamically generated URLs created when a site visitor sorts or filters a page by brand, size, or similar options (this is usually referred to as faceted navigation).

The add to shopping cart URLs looked like this:

example.com/product/page-5/?add-to-cart=example

The client had implemented a rel=canonical link attribute to tell Google that another URL was the correct URL to index.

The SEO’s solution:

“How I plan on fixing this is to no-index all these pages and once that’s done block them in the robots.txt”

SEO Decisions Depend On Details

One of the most tired and boring SEO dad jokes is “it depends.” But “it depends” is no joke when it’s followed by what something depends on. That is the crucial detail John Mueller added to a LinkedIn discussion that already had 83 responses.

The original discussion, started by an SEO who’d just finished an audit, addresses the technical challenges of controlling what Google crawls and indexes, and why rel=canonical is not a reliable solution: it is a suggestion and not a directive.

A directive is a command that Google is obligated to follow, like a meta noindex rule. A rel=canonical link attribute is not a directive. Google treats it as a hint when deciding what to index.
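To make the distinction concrete, here is a minimal Python sketch that reads both signals from a page’s HTML using only the standard library. The sample markup and the `IndexingSignals` class name are made up for illustration, and the sketch only looks at the meta robots tag and the canonical link element, not at HTTP headers.

```python
from html.parser import HTMLParser


class IndexingSignals(HTMLParser):
    """Collect the noindex directive and the canonical hint from a page."""

    def __init__(self):
        super().__init__()
        self.noindex = False    # directive: meta robots contains "noindex"
        self.canonical = None   # hint: href of <link rel="canonical">

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = (attributes.get("content") or "").lower()
            if "noindex" in [part.strip() for part in content.split(",")]:
                self.noindex = True
        elif tag == "link":
            rel_values = (attributes.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                self.canonical = attributes.get("href")


# Hypothetical page markup, not taken from the audited site.
page = """
<html><head>
<meta name="robots" content="noindex, follow">
<link rel="canonical" href="https://example.com/product/page-5/">
</head><body></body></html>
"""

signals = IndexingSignals()
signals.feed(page)
print("noindex directive:", signals.noindex)
print("canonical hint:", signals.canonical)
```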

The problem the original post described was how to manage a high number of dynamically generated URLs that were slipping into Google’s index.

John Mueller On Dealing With Unwanted Indexed URLs

Mueller suggested reviewing the URLs for patterns that may explain why unwanted URLs are getting indexed, and then applying a more granular (specific) solution.
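One way to look for such patterns, sketched here in Python, is to group the indexed URLs by the names of their query parameters. The sketch assumes the indexed URLs have been exported to a list. The sample URLs and function names are hypothetical.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def parameter_signature(url: str) -> tuple[str, ...]:
    """Return the sorted query parameter names of a URL, ignoring values."""
    query = urlsplit(url).query
    names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
    return tuple(sorted(names))


def count_signatures(urls):
    """Count how many URLs share each set of query parameter names."""
    return Counter(parameter_signature(url) for url in urls)


# Hypothetical sample of indexed URLs, for illustration only.
indexed_urls = [
    "https://example.com/product/page-5/?add-to-cart=example",
    "https://example.com/product/page-2/?add-to-cart=other",
    "https://example.com/shoes/?brand=acme&size=10",
    "https://example.com/shoes/?size=9&brand=zen",
    "https://example.com/shoes/",
]

for signature, count in count_signatures(indexed_urls).most_common():
    print(count, signature or "(no parameters)")
```

Grouping by parameter names rather than full URLs separates the add to cart URLs from the filtering URLs, so each group can get its own fix.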

He advised:

“You seem to have a lot of comments here already, so my 2 cents are more as a random bystander…

– I’d review the URLs for patterns and look at specifics, rather than to treat this as a random list of URLs that you want canonicalized. These are not random, using a generic solution won’t be optimal for any site – ideally you’d do something specific for this particular situation. Aka “it depends”.

– In particular, you seem to have a lot of ‘add to cart’ URLs – you can just block these with the URL pattern via robots.txt. You don’t need to canonicalize them, they should ideally not be crawled during a normal crawl (it messes up your metrics too).

– There’s some amount of pagination, filtering in URL parameters too – check out our documentation on options for that.

– For more technical rabbit holes, check out https://search-off-the-record.libsyn.com/handling-dupes-same-same-or-different”
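Mueller did not spell out the exact robots.txt rule. As an illustration only, a rule such as `Disallow: /*add-to-cart=` would cover the URL pattern shown earlier. The Python sketch below checks URLs against rules like that, using the wildcard behavior Google documents for robots.txt (`*` matches any run of characters, a trailing `$` anchors the end of the URL). It is a simplification: the rule itself is an assumption, and the sketch ignores Allow rules, rule precedence, and user-agent groups.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule_path: str) -> re.Pattern:
    """Translate a robots.txt path rule into a regular expression.

    '*' becomes "any run of characters" and a trailing '$' anchors the end
    of the URL. Everything else is literal and matches from the start.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(url: str, disallow_rules: list[str]) -> bool:
    """Return True if the URL's path and query match any Disallow rule."""
    parts = urlsplit(url)
    target = (parts.path or "/") + ("?" + parts.query if parts.query else "")
    return any(rule_to_regex(rule).match(target) for rule in disallow_rules)


# Hypothetical rule; the real pattern depends on the site's URLs.
rules = ["/*add-to-cart="]

for url in [
    "https://example.com/product/page-5/?add-to-cart=example",
    "https://example.com/product/page-5/",
]:
    print(url, "blocked" if is_blocked(url, rules) else "allowed")
```

Testing a candidate rule against a sample of real URLs before deploying it helps confirm that it blocks the add to cart URLs without also blocking the product pages themselves.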

Why Was Google Indexing URLs With Query Parameters?

A topic raised by multiple people in the LinkedIn discussion is why Google indexes add to shopping cart URLs in the first place. No answers were provided, but the cause may be particular to the shopping cart platform, in which case the fix may be limited to the solutions described above.

Read John Mueller’s advice here.
