Locale-Aware Sites Get Better Indexing by Google


If your site has an international reach and requires multiple language versions, Google recommends using separate URLs for each language and country so that every version of your site is indexed properly. Some websites, however, are locale-aware instead: they automatically detect the user's language and country settings and serve the corresponding content at the same URLs.

In the past, Googlebot has not been able to index all versions of locale-aware sites because it did not crawl from IP addresses outside of the United States, nor did it send a language preference (the Accept-Language HTTP header) with its requests. If this applies to your site, we have good news for you. Google is in the process of fixing this indexing problem with two new crawling methods for locale-aware sites:

  • Geo-distributed crawling - Googlebot will crawl from IP addresses that appear to be outside of the United States, in addition to the US-based ones, so that it can be served multiple versions of the site for indexing.
  • Language-dependent crawling - Googlebot will start crawling with an Accept-Language HTTP header in the request.
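
To make the language-dependent case concrete, here is a minimal sketch of how a locale-aware site might pick a content version from the Accept-Language header. The function names (parse_accept_language, choose_locale), the list of supported languages, and the "en" fallback are assumptions made for illustration; they do not describe how Google or any particular site implements this.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Return (language_tag, quality) pairs, most preferred first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # q defaults to 1 when it is omitted
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                continue  # skip malformed entries
        prefs.append((tag.strip().lower(), quality))
    # sorted() is stable, so equal q-values keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose_locale(header: str, supported: list[str], default: str = "en") -> str:
    """Pick the best supported language for a request (hypothetical helper)."""
    by_lower = {s.lower(): s for s in supported}
    for tag, quality in parse_accept_language(header):
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        if tag in by_lower:
            return by_lower[tag]
        primary = tag.split("-")[0]  # e.g. "de-ch" falls back to "de"
        if primary in by_lower:
            return by_lower[primary]
    return default


# A request that prefers Swiss German, then German, then English
print(choose_locale("de-CH,de;q=0.9,en;q=0.8", ["en", "de", "fr"]))
```

A request with no Accept-Language header at all falls through to the default here, which means a crawler that sends no header would only ever be served that one version.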

You can dive in for a deeper understanding of structuring a locale-aware website here.
