Screenshot from Zalando, August 2024
When you search Google for [gray t-shirts], you can see Zalando’s facet page ranking in the top 10.
Screenshot from search for [gray t-shirts], Google, August 2024
If you add another filter on top of gray t-shirts, let’s say the brand name ‘Adidas,’ you get a new SEO-friendly URL with a canonical tag and proper hreflang tags for multiple locales in the source code:
https://www.zalando.co.uk/t-shirts/adidas_grey/
<link rel="canonical" href="https://www.zalando.co.uk/t-shirts/adidas_grey/">
<link rel="alternate" hreflang="en-de" href="https://en.zalando.de/t-shirts/adidas_grey/">
<link rel="alternate" hreflang="en-gb" href="https://www.zalando.co.uk/t-shirts/adidas_grey/">
<link rel="alternate" hreflang="en-ie" href="https://www.zalando.ie/t-shirts/adidas_grey/">
However, if you decide to include copy on those pages, make sure you change the H1 tag and the copy accordingly to avoid keyword cannibalization.
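To illustrate how the canonical and hreflang tags shown above fit together, here is a minimal Python sketch that builds them from one mapping of locales to site roots. The `LOCALE_ROOTS` mapping and the `head_links` function are hypothetical names made up for this example. The domains are copied from the snippet above, and this is not a description of how Zalando actually generates its markup.

```python
from html import escape

# Hypothetical mapping of hreflang codes to site roots (domains taken from the example above).
LOCALE_ROOTS = {
    "en-de": "https://en.zalando.de",
    "en-gb": "https://www.zalando.co.uk",
    "en-ie": "https://www.zalando.ie",
}


def head_links(path: str, current_locale: str) -> list[str]:
    """Return a self-referencing canonical tag plus one hreflang tag per locale."""
    canonical = LOCALE_ROOTS[current_locale] + path
    tags = [f'<link rel="canonical" href="{escape(canonical, quote=True)}">']
    # Every regional version lists all alternates, including itself.
    for locale, root in sorted(LOCALE_ROOTS.items()):
        href = escape(root + path, quote=True)
        tags.append(f'<link rel="alternate" hreflang="{locale}" href="{href}">')
    return tags


if __name__ == "__main__":
    for tag in head_links("/t-shirts/adidas_grey/", "en-gb"):
        print(tag)
```

Because every regional version is built from the same mapping, each page lists the same set of alternates and points its canonical at itself.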
Noindex
Noindex tags can be implemented to inform bots of which pages not to include in the index.
For example, if you want the page for “gray t-shirt” in the index but not the pages with a price filter applied, adding a noindex tag to the latter would exclude them.
Suppose your price filters produce URLs like this one…
https://www.exampleshop.com/t-shirts/grey/?price_from=82
…And if you don’t want them to appear in the index, you can use the “noindex” meta robots tag in the <head> section:
<meta name="robots" content="noindex" />
This tells search engines not to index the page filtered by price.
Note that although this approach removes pages from the index, crawl budget is still spent on them whenever search engine bots find links to those pages and crawl them. For optimizing crawl budget, using robots.txt is the best approach.
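As a rough sketch of how a template could decide when to emit the noindex tag, the Python function below checks a URL's query string against a list of filter parameters. The `NOINDEX_PARAMS` set and the `robots_meta` function are hypothetical. `price_from` comes from the example URL above, while `price_to` is an assumed companion parameter.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical filter parameters to keep out of the index.
# "price_from" appears in the example URL; "price_to" is an assumption.
NOINDEX_PARAMS = {"price_from", "price_to"}


def robots_meta(url: str) -> str:
    """Return a noindex meta tag if the URL carries a listed filter parameter, else an empty string."""
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if any(name in NOINDEX_PARAMS for name in params):
        return '<meta name="robots" content="noindex" />'
    return ""


if __name__ == "__main__":
    print(robots_meta("https://www.exampleshop.com/t-shirts/grey/?price_from=82"))
```

The unfiltered page `https://www.exampleshop.com/t-shirts/grey/` gets an empty string here, so it stays eligible for indexing.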
Robots.txt
Disallowing faceted search pages via robots.txt is the best way to manage crawl budget. To disallow pages with price parameters, e.g. ‘/?price=50_100’, you can use the following robots.txt rule:
Disallow: *price=*
This directive informs search engines not to crawl any URL that includes the ‘price=’ parameter, thus optimizing the crawl budget by excluding these pages.
However, if any external links point to a URL with that parameter in it, Google could still index it. If the quality of those backlinks is high, you may consider using the canonical approach instead to consolidate the link equity to a preferred URL.
Otherwise, you don’t need to worry about that, as Google has confirmed that such URLs will drop out of the index over time.
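Before deploying a wildcard rule like the one above, it can help to check which URLs it would actually catch. The Python sketch below approximates the `*` and `$` wildcard matching that Google documents for robots.txt. It is a simplified illustration with a hypothetical `rule_matches` helper, not a full robots.txt parser: it ignores user-agent groups, Allow rules, and percent-encoding.

```python
import re
from urllib.parse import urlsplit


def rule_matches(pattern: str, url: str) -> bool:
    """Check a URL's path and query against one Disallow pattern with '*' and '$' wildcards."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any run of characters; every other character is literal.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # Rules are matched from the start of the path.
    return re.match(regex, target) is not None


if __name__ == "__main__":
    for url in (
        "https://www.exampleshop.com/t-shirts/grey/?price=50_100",
        "https://www.exampleshop.com/t-shirts/grey/?price_from=82",
        "https://www.exampleshop.com/t-shirts/grey/",
    ):
        print(url, rule_matches("*price=*", url))
```

Under these assumptions, the rule catches `?price=50_100` but not `?price_from=82`, because the literal text `price=` never appears in the latter. It would also catch any parameter whose name merely ends in `price`, such as a hypothetical `min_price=10`, so it is worth testing the pattern against your real parameter names.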
Other Ways To Get The Most Out Of Faceted Navigation
Implement pagination with rel="next" and rel="prev" to signal that the paginated pages form a single series. Note that Google has said it no longer uses these attributes for indexing, though other search engines may.
Each page needs to link to its child pages and its parent page. This can be done with breadcrumbs.
If you choose to canonicalize your faceted search pages, include only the canonical URLs in your sitemaps.
Include unique H1 tags and content on canonicalized facet URLs.
Facets should always be presented in a unified, logical manner (e.g., alphabetical order).
Implement AJAX for filtering so users can see results without reloading the page. However, always update the URL after filtering so users can bookmark their filtered pages and visit them later. Never implement AJAX without changing the URL.
Make sure faceted navigation is optimized for all devices, including mobile, through responsive design.
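The pagination tip in the list above can be sketched in Python as follows. The `pagination_links` function and the `?p=N` URL scheme are assumptions for illustration, so adapt them to however your platform addresses paginated pages.

```python
def pagination_links(base_url: str, page: int, total_pages: int) -> list[str]:
    """Build rel="prev" and rel="next" link tags for one page of a paginated series."""
    if not 1 <= page <= total_pages:
        raise ValueError("page must be between 1 and total_pages")

    def page_url(n: int) -> str:
        # Assumption: page 1 lives at the bare URL, later pages use ?p=N.
        return base_url if n == 1 else f"{base_url}?p={n}"

    tags = []
    if page > 1:
        tags.append(f'<link rel="prev" href="{page_url(page - 1)}">')
    if page < total_pages:
        tags.append(f'<link rel="next" href="{page_url(page + 1)}">')
    return tags


if __name__ == "__main__":
    for tag in pagination_links("https://www.exampleshop.com/t-shirts/grey/", 2, 3):
        print(tag)
```

The first page of a series only gets a rel="next" tag and the last page only a rel="prev" tag, which is how the series boundaries are expressed.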
Conclusion
Although faceted navigation can be great for UX, it can cause a multitude of problems for SEO.
Duplicate content, wasted crawl budget, and diluted link equity can all cause severe problems on a site. However, you can fix those issues by applying one of the strategies discussed in this article.
It is crucial to plan and implement faceted navigation carefully in order to avoid many issues down the line.
Featured Image: RSplaneta/Shutterstock
All screenshots taken by author