An SEO posted details about a site audit in which he critiqued the use of a rel=canonical for controlling what pages are indexed on a site. The SEO proposed using noindex to get the pages dropped from Google’s index and then adding the individual URLs to robots.txt. Google’s John Mueller suggested a solution that goes in a different direction.
Site Audit Reveals Indexed Add To Cart URLs
An SEO audit uncovered that over half of the client’s 1.43k indexed pages were paginated and “add to shopping cart” URLs (the kind that carry a query string after a question mark). Google ignored the rel=canonical link attributes and indexed the pages, which illustrates that rel=canonical is a hint and not a directive. “Paginated” in this case means the dynamically generated URLs created when a site visitor sorts or filters a page by brand, size, or a similar option (this is usually referred to as faceted navigation).
The add to shopping cart URLs looked like this:
example.com/product/page-5/?add-to-cart=example
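One way to gauge how many URLs in an index report match this pattern is to filter them by query parameter. The following is a minimal Python sketch, not the method used in the audit. The function name `is_add_to_cart_url` and the sample list are hypothetical, and the parameter name `add-to-cart` is taken from the example URL above.

```python
from urllib.parse import urlsplit, parse_qs


def is_add_to_cart_url(url: str, param: str = "add-to-cart") -> bool:
    """Return True if the URL's query string contains the given parameter."""
    query = urlsplit(url).query
    # keep_blank_values so "?add-to-cart=" with an empty value still counts
    return param in parse_qs(query, keep_blank_values=True)


# Hypothetical sample of URLs taken from an index report
urls = [
    "https://example.com/product/page-5/?add-to-cart=example",
    "https://example.com/product/page-5/",
]

flagged = [u for u in urls if is_add_to_cart_url(u)]
print(flagged)
```

Matching on the parsed parameter name, instead of searching for a substring, avoids flagging a URL that merely contains the text "add-to-cart" in its path.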
The client had implemented a rel=canonical link attribute to tell Google that another URL was the correct URL to index.
The SEO’s solution:
“How I plan on fixing this is to no-index all these pages and once that’s done block them in the robots.txt”
SEO Decisions Depend On Details
One of the most tired and boring SEO dad jokes is “it depends.” But “it depends” is no joke when it’s followed by what something depends on, and that’s the crucial detail John Mueller added to a LinkedIn discussion that already had 83 responses.
The original discussion, by an SEO who’d just finished an audit, addresses the technical challenges of controlling what gets crawled and indexed by Google, and why rel=canonical is an unreliable solution: it is a suggestion and not a directive.
A directive is a command that Google is obligated to follow, like a meta noindex rule. A rel=canonical link attribute is not a directive. It’s treated as a hint that Google uses when deciding what to index.
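To see which of the two signals a given page actually sends, the page markup can be inspected for both. The sketch below uses Python’s standard `html.parser` module. The class name `IndexingSignals` and the sample markup are hypothetical, and the code only reports what is present in the HTML. It says nothing about how Google will act on it.

```python
from html.parser import HTMLParser


class IndexingSignals(HTMLParser):
    """Collect the meta robots value and the rel=canonical target of a page."""

    def __init__(self):
        super().__init__()
        self.robots = None     # content of <meta name="robots">, if any
        self.canonical = None  # href of <link rel="canonical">, if any

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.robots = (a.get("content") or "").lower()
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonical = a.get("href")


# Hypothetical markup for an add-to-cart URL
html = """
<head>
<link rel="canonical" href="https://example.com/product/page-5/">
<meta name="robots" content="noindex, follow">
</head>
"""

signals = IndexingSignals()
signals.feed(html)

# noindex is the directive; the canonical URL is only a hint
rules = {r.strip() for r in (signals.robots or "").split(",")}
has_noindex = "noindex" in rules
print(has_noindex, signals.canonical)
```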
The problem the original post described was how to manage a high number of dynamically generated URLs that were slipping into Google’s index.