John Mueller On Dealing With Unwanted Indexed URLs
Mueller recommended reviewing the unwanted URLs for patterns that may hint at why they are getting indexed, and then applying a more granular (specific) solution to each pattern.
He advised:
“You seem to have a lot of comments here already, so my 2 cents are more as a random bystander…
– I’d review the URLs for patterns and look at specifics, rather than to treat this as a random list of URLs that you want canonicalized. These are not random, using a generic solution won’t be optimal for any site – ideally you’d do something specific for this particular situation. Aka “it depends”.
– In particular, you seem to have a lot of ‘add to cart’ URLs – you can just block these with the URL pattern via robots.txt. You don’t need to canonicalize them, they should ideally not be crawled during a normal crawl (it messes up your metrics too).
– There’s some amount of pagination, filtering in URL parameters too – check out our documentation on options for that.
– For more technical rabbit holes, check out https://search-off-the-record.libsyn.com/handling-dupes-same-same-or-different”
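One way to act on the advice to review URLs for patterns is to reduce each URL to a rough "signature" and count how often each one occurs. The sketch below is only an illustration: the sample URLs, the `add-to-cart` parameter name, and the helper names `url_signature` and `count_patterns` are assumptions and do not come from Mueller's comment.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def url_signature(url: str) -> str:
    """Reduce a URL to its path plus the sorted names of its query parameters."""
    parts = urlsplit(url)
    names = sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)})
    return parts.path + ("?" + "&".join(names) if names else "")


def count_patterns(urls):
    """Count how many URLs share each signature."""
    return Counter(url_signature(u) for u in urls)


# Hypothetical sample of unwanted indexed URLs (not taken from the discussion).
sample = [
    "https://example.com/shop/?add-to-cart=101",
    "https://example.com/shop/?add-to-cart=102",
    "https://example.com/shop/?color=red&page=2",
    "https://example.com/shop/?page=3&color=blue",
    "https://example.com/shop/",
]

patterns = count_patterns(sample)
for signature, count in patterns.most_common():
    print(count, signature)
```

Grouping by parameter names rather than values separates the add-to-cart URLs from the filtering and pagination URLs, so each group can get its own fix instead of one generic fix for the whole list.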
Why Was Google Indexing URLs With Query Parameters?
Several people in the LinkedIn discussion raised the same problem: Google indexing “add to cart” URLs. Nobody offered an explanation, but the cause may be specific to the shopping cart platform, in which case the fix may be limited to the solutions described above.
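Mueller's suggestion to block add-to-cart URLs by pattern in robots.txt can be checked against a list of URLs before the rules go live. The sketch below is a simplified matcher for `Disallow` patterns that use the `*` wildcard and the `$` end anchor; it ignores `Allow` rules and rule precedence, so it is not a full robots.txt parser. The rules and the `add-to-cart` parameter name are assumptions for illustration, since the real pattern depends on the shopping cart platform.

```python
import re
from urllib.parse import urlsplit

# Hypothetical rules, as they might appear in robots.txt:
#   User-agent: *
#   Disallow: /*?add-to-cart=
#   Disallow: /*&add-to-cart=
DISALLOW = ["/*?add-to-cart=", "/*&add-to-cart="]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a Disallow pattern ('*' wildcard, optional trailing '$') to a regex."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_blocked(url: str, disallow_rules) -> bool:
    """Return True if the URL's path and query match any Disallow pattern."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match anchors at the start, mirroring how rules match from the path start.
    return any(rule_to_regex(rule).match(target) for rule in disallow_rules)


for url in [
    "https://example.com/shop/?add-to-cart=101",
    "https://example.com/shop/?color=red&add-to-cart=5",
    "https://example.com/shop/?page=2",
]:
    print(is_blocked(url, DISALLOW), url)
```

Running a check like this over the full list of unwanted URLs shows which ones a proposed rule would cover and which, such as pagination or filter URLs, still need a separate approach.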
Read John Mueller’s advice here.