Google’s Updated Crawler Guidance Recommends ETags

Google has updated its crawler documentation with more detail about caching, which should help SEOs and publishers understand how to optimize for Google’s crawlers. Implementing proper HTTP caching headers as the new guidelines describe can improve crawling efficiency and conserve server resources.

Updated Crawler Documentation

The crawler documentation now has a section explaining how Google’s crawlers use HTTP caching mechanisms, which conserve computing resources for both publishers and Google during crawling.

Additions to the documentation significantly expand on the prior version.

Caching Mechanisms

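Google recommends enabling caching by serving the ETag response header and honoring the If-None-Match request header, and optionally by doing the same with Last-Modified and If-Modified-Since, so the server can signal whether content has changed. When a crawler sends back a validator it received earlier and the content is unchanged, the server can answer with a 304 (Not Modified) status and no body instead of resending the page. This reduces unnecessary crawling and saves server resources, which is a win for both publishers and Google’s crawlers.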

The new documentation states:

“Google’s crawling infrastructure supports heuristic HTTP caching as defined by the HTTP caching standard, specifically through the ETag response- and If-None-Match request header, and the Last-Modified response- and If-Modified-Since request header.”
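To make the ETag and If-None-Match exchange concrete, here is a minimal sketch of how a server might handle it. It is not taken from Google’s documentation: the function names (make_etag, etag_matches, respond) are hypothetical, deriving the ETag from a SHA-256 hash of the body is only one possible scheme, and splitting If-None-Match on commas is a simplification of the full header grammar.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from a hash of the body (assumed scheme, not Google's)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def etag_matches(if_none_match: str, etag: str) -> bool:
    """Weakly compare the current ETag against an If-None-Match header value."""
    if if_none_match.strip() == "*":
        return True

    def strip_weak(tag: str) -> str:
        # Weak comparison ignores the W/ prefix on either side.
        return tag[2:] if tag.startswith("W/") else tag

    # Simplification: assumes no ETag value itself contains a comma.
    candidates = {strip_weak(c.strip()) for c in if_none_match.split(",")}
    return strip_weak(etag) in candidates


def respond(body: bytes, request_headers: dict) -> tuple:
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None and etag_matches(if_none_match, etag):
        # Content unchanged since the client's copy: send no body.
        return 304, headers, b""
    return 200, headers, body
```

On a first request, respond(page, {}) returns a 200 along with an ETag. A crawler that stores that value and sends it back in If-None-Match gets a 304 with an empty body for as long as the page bytes stay the same.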

Google’s Preference for ETag

Google recommends using ETag over Last-Modified because ETag is less prone to errors such as date formatting issues and provides more precise content validation. The documentation also explains what happens if both ETag and Last-Modified response headers are served:

“If both ETag and Last-Modified response header fields are present in the HTTP response, Google’s crawlers use the ETag value as required by the HTTP standard.”
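The same precedence applies on the server side of the exchange: under the HTTP standard, a server that receives If-None-Match ignores If-Modified-Since. The sketch below illustrates that ordering and shows where date parsing can go wrong. The function name is_not_modified is hypothetical, and the exact string comparison of ETags is a simplification.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def is_not_modified(request_headers: dict, etag: str, last_modified: datetime) -> bool:
    """Decide whether a 304 is appropriate; ETag validation takes precedence."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match is present, so If-Modified-Since is ignored entirely.
        return if_none_match == etag  # simplified exact comparison

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            # A malformed date cannot validate anything: serve the full page.
            return False
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        return last_modified.replace(microsecond=0) <= since

    return False


# Example Last-Modified value in the HTTP date format (always GMT).
last_modified = datetime(2025, 1, 1, tzinfo=timezone.utc)
last_modified_header = format_datetime(last_modified, usegmt=True)
```

The date branch needs parsing, timezone handling, and a fallback for malformed input, while the ETag branch is a plain comparison. That difference is one way to read Google’s remark that ETag is less error-prone.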

The new documentation also states that other HTTP caching directives are not supported.

Variable Support Across Crawlers

The new documentation explains that support for caching differs among Google’s crawlers. For example, Googlebot supports caching for re-crawling, while Storebot-Google has limited caching support.
