Faceted navigation is a common feature of websites that allows visitors to change how items (for example, products, articles, or events) are displayed on a page. It’s a popular and useful feature; however, its most common implementation, which is based on URL parameters, can generate infinite URL spaces, which harms the website in a couple of ways:
A typical faceted navigation URL may contain various parameters in the query string that relate to the properties of the items it filters for. For example:
https://example.com/items.shtm?products=fish&color=radioactive_green&size=tiny
Changing any of the URL parameters products, color, and size would show a different set of items on the underlying page. This often means a very large number of possible filter combinations, which translates to a very large number of possible URLs. To save your resources, we recommend dealing with these URLs in one of the following ways:
If you want to save server resources and you don’t need your faceted navigation URLs to show up in Google Search, you can prevent crawling of these URLs in one of the following ways:
Use robots.txt to disallow crawling of faceted navigation URLs:
user-agent: Googlebot
disallow: /*?*products=
disallow: /*?*color=
disallow: /*?*size=
allow: /*?products=all$
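Use URL fragments to specify filters. Google Search generally doesn’t support URL fragments in crawling and indexing:
https://example.com/items.shtm#products=fish&color=radioactive_green&size=tiny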
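The robots.txt rules above can be sanity-checked before deployment. The following is a minimal Python sketch, not Googlebot's actual matcher: it assumes that `*` matches any run of characters, that a trailing `$` anchors the end of the URL, that the longest matching pattern wins, and that an allow rule wins a tie. The function names `crawl_target`, `pattern_to_regex`, and `is_allowed` are made up for this illustration. It also shows why the fragment-based URL avoids the problem: the fragment is not part of the path and query that the rules are matched against.

```python
import re
from urllib.parse import urlsplit

# The rules from the robots.txt example, as (directive, pattern) pairs.
RULES = [
    ("disallow", "/*?*products="),
    ("disallow", "/*?*color="),
    ("disallow", "/*?*size="),
    ("allow", "/*?products=all$"),
]


def crawl_target(url: str) -> str:
    """Return the path plus query string; the fragment is dropped."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return path + ("?" + parts.query if parts.query else "")


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a pattern: '*' is a wildcard, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(url: str, rules=RULES) -> bool:
    """Assumed precedence: longest matching pattern wins, allow wins a tie."""
    target = crawl_target(url)
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(target):
            candidate = (len(pattern), directive == "allow")
            if best is None or candidate > best:
                best = candidate
    # A URL that matches no rule is crawlable by default.
    return True if best is None else best[1]
```

With these rules, a filtered URL such as `/items.shtm?products=fish&color=radioactive_green&size=tiny` is reported as blocked, while `/items.shtm?products=all` and the plain `/items.shtm` remain crawlable.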
Other ways to signal a preference for which faceted navigation URLs (not) to crawl are the rel="canonical" link element and the rel="nofollow" anchor attribute. However, these methods are generally less effective in the long term than the previously mentioned methods.
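To illustrate the rel="canonical" option, here is a minimal Python sketch that derives a canonical URL from a filtered URL and renders the matching link element. It assumes that only the products parameter defines the canonical page; the names `CANONICAL_PARAMS`, `canonical_url`, and `canonical_link_tag` are hypothetical and would need adapting to a real site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only the "products" parameter defines the canonical page.
CANONICAL_PARAMS = ("products",)


def canonical_url(url: str, keep=CANONICAL_PARAMS) -> str:
    """Drop every query parameter that is not in `keep`, and any fragment."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in keep]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Render the link element to place in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'
```

Every filtered variant of the same product listing then emits the same link element, which is the signal that may reduce crawling of the non-canonical variants over time.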
Using rel="canonical" to specify which URL is the canonical version of a faceted navigation URL may, over time, decrease the crawl volume of non-canonical versions of those URLs. For example, if you have 3 filtered page types, consider pointing the rel="canonical" to the unfiltered version: https://example.com/items.shtm?products=fish&color=radioactive_green&size=tiny specifies <link rel="canonical" href="https://example.com/items.shtm?products=fish">.
Using rel="nofollow" attributes on anchors pointing to filtered results pages may be beneficial; however, keep in mind that every anchor pointing to a specific URL must have the rel="nofollow" attribute for it to be effective.
If you need your faceted navigation URLs to be potentially crawled and indexed, ensure you’re following these best practices to minimize the negative effects of crawling the large number of potential URLs on your site:
Use the standard URL parameter separator, &. Characters like comma (,), semicolon (;), and brackets ([ and ]) are hard for crawlers to detect as parameter separators (because most often they’re not separators).
If you’re encoding filters in the URL path, such as /products/fish/green/tiny, ensure that the logical order of the filters always stays the same and that no duplicate filters can exist.
Return an HTTP 404 status code when a filter combination doesn’t return results. If there are no green fish in the site’s inventory, users as well as crawlers should receive a “not found” error with the proper HTTP status code (404). This should also be the case if the URL contains duplicate filters or otherwise nonsensical filter combinations, and for nonexistent pagination URLs. Similarly, if a filter combination has no results, don’t redirect to a common “not found” error page. Instead, serve a “not found” error with the 404 HTTP status code under the URL where it was encountered.
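The path-based and 404 recommendations above can be sketched in Python as follows. This is only an illustration under stated assumptions: the inventory, the fixed facet order (product, color, size), and the helper names `parse_filter_path` and `status_for` are all hypothetical, and a repeated value is treated as a duplicate filter.

```python
# Hypothetical facet order and inventory, for illustration only.
FACET_ORDER = ("product", "color", "size")
INVENTORY = {("fish", "blue", "tiny"), ("shrimp", "green", "tiny")}


def parse_filter_path(path: str):
    """Split '/products/fish/blue/tiny' into a tuple of filter values.

    Returns None when the path has more segments than there are facets
    or repeats a value, which this sketch treats as nonsensical.
    """
    segments = [s for s in path.split("/") if s]
    if not segments or segments[0] != "products":
        return None
    values = tuple(segments[1:])
    if len(values) > len(FACET_ORDER) or len(set(values)) != len(values):
        return None
    return values


def status_for(path: str) -> int:
    """Return 200 when the filters match at least one item, else 404."""
    values = parse_filter_path(path)
    if values is None:
        return 404
    # Filters are positional, so a URL matches items sharing its prefix.
    matches = [item for item in INVENTORY if item[: len(values)] == values]
    return 200 if matches else 404
```

Because the status is computed for the requested path itself, an empty result such as `/products/fish/green` is answered with 404 at that URL rather than through a redirect to a shared error page.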