Hreflang Tags: The Complete Guide to International SEO

Hreflang tags tell search engines which language and regional version of a page to serve. Get the implementation right with this complete guide to international SEO.

What Are Hreflang Tags?

Hreflang is an HTML attribute introduced by Google in 2011 that tells search engines which language and regional variant of a page should be served to users in different locations. The attribute uses the rel="alternate" hreflang="x" pattern to create a mapping between equivalent pages across languages and regions.

Consider an e-commerce site selling shoes. A user in Germany searching for "Laufschuhe" should land on your German-language page with prices in euros, not your English-language page with prices in dollars. Without hreflang, Google has to guess which version to show, and it frequently guesses wrong -- especially when the content is structurally similar across versions.

Hreflang solves three specific problems:

🔑 Key Insight: Hreflang is a signal, not a directive. Google uses it as a strong hint for which version to show in search results, but it can override your hreflang annotations if other signals (like user location, search language settings, or page quality) contradict them. Getting hreflang right removes ambiguity and gives Google the clearest possible signal.

Why Hreflang Matters for SEO

The impact of correct hreflang implementation goes beyond just showing the right language. It directly affects traffic, rankings, and revenue for international sites.

Preventing Cannibalization

Without hreflang, Google may consolidate your regional pages into a single canonical. If you have separate pages for example.com/en-us/shoes and example.com/en-gb/shoes, Google might pick one as the canonical and suppress the other from search results entirely. The suppressed version loses all organic traffic. Hreflang explicitly tells Google these are intentional variants, not duplicates.

Improving Click-Through Rates

When users see search results in their own language with local pricing, they click at significantly higher rates. A study by CSA Research found that 76% of online shoppers prefer to buy products with information in their native language. Serving the wrong language version in search results means losing those clicks to competitors who got their hreflang right.

Reducing Bounce Rates

Landing on a page in the wrong language is one of the fastest ways to drive users away. If a Spanish-speaking user lands on your English page because Google picked the wrong version, that user will bounce immediately. High bounce rates on landing pages send negative engagement signals back to Google, creating a downward spiral for rankings.

| Scenario | Without Hreflang | With Hreflang |
| --- | --- | --- |
| German user searches "Laufschuhe" | May see English page; bounces | Sees German page with EUR pricing; converts |
| Two English variants (US/UK) | Google picks one; other loses traffic | Both rank in respective regions |
| French-Canadian vs. French-France | French-France page dominates both markets | Each version ranks in its target market |
| Same content, different currencies | Flagged as duplicate content | Treated as legitimate regional variants |

Hreflang Syntax and Implementation

There are three ways to implement hreflang: HTML <link> tags in the <head>, HTTP headers, and XML sitemaps. Each method is equally valid in Google's eyes, but they suit different technical situations.

Method 1: HTML Link Tags

The most common approach. Add <link> elements to the <head> section of every page that has alternate language or regional versions. Every page must reference all its alternates, including itself.

<!-- On your English (US) page: example.com/en-us/shoes -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/shoes" />
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/shoes" />
<link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/schuhe" />
<link rel="alternate" hreflang="fr-FR" href="https://example.com/fr-fr/chaussures" />
<link rel="alternate" hreflang="es-ES" href="https://example.com/es-es/zapatos" />
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/shoes" />

The x-default value is special. It tells Google which page to show when none of the specified language/region combinations match the user. Typically this points to your primary English page or to a language-selector landing page.
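Because every variant must list all of its alternates plus itself, the tag block is easiest to generate from a single mapping rather than to write by hand. The sketch below is one minimal way to do that; the function name `hreflang_links` and the `ALTERNATES` mapping are illustrative assumptions, not part of any CMS or library.

```python
from html import escape
from typing import Optional


def hreflang_links(alternates: dict[str, str], x_default: Optional[str] = None) -> list[str]:
    """Return one <link> tag per alternate, plus an optional x-default tag."""
    tags = [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in alternates.items()
    ]
    if x_default is not None:
        tags.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(x_default)}" />'
        )
    return tags


# Hypothetical mapping for one page and its equivalents.
ALTERNATES = {
    "en-US": "https://example.com/en-us/shoes",
    "en-GB": "https://example.com/en-gb/shoes",
    "de-DE": "https://example.com/de-de/schuhe",
}

# The same block goes into the <head> of every variant, so each page
# references itself and all of its alternates.
head_block = "\n".join(hreflang_links(ALTERNATES, x_default=ALTERNATES["en-US"]))
print(head_block)
```

Emitting the identical block on every variant is what keeps the annotations consistent: adding a language means changing the mapping once, not editing each template.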

🔑 Key Insight: The hreflang value uses ISO 639-1 for languages (two-letter codes like en, de, fr) and optionally ISO 3166-1 Alpha-2 for regions (like US, GB, DE). You can specify language only (hreflang="de") or language plus region (hreflang="de-DE"). You cannot specify region without language.

Method 2: HTTP Headers

Use HTTP headers when you cannot modify the HTML <head>, which is common for PDFs, images, or non-HTML resources. The syntax follows RFC 8288 (Web Linking).

# Nginx configuration for hreflang HTTP headers
location /en-us/shoes {
    add_header Link '<https://example.com/en-us/shoes>; rel="alternate"; hreflang="en-US"';
    add_header Link '<https://example.com/en-gb/shoes>; rel="alternate"; hreflang="en-GB"';
    add_header Link '<https://example.com/de-de/schuhe>; rel="alternate"; hreflang="de-DE"';
    add_header Link '<https://example.com/fr-fr/chaussures>; rel="alternate"; hreflang="fr-FR"';
    add_header Link '<https://example.com/en-us/shoes>; rel="alternate"; hreflang="x-default"';
}

# Apache .htaccess equivalent
<Files "shoes.pdf">
    Header add Link '<https://example.com/en-us/shoes.pdf>; rel="alternate"; hreflang="en-US"'
    Header add Link '<https://example.com/de-de/schuhe.pdf>; rel="alternate"; hreflang="de-DE"'
</Files>

HTTP headers work well for non-HTML content but become unwieldy for sites with many language variants. Managing headers for 15 language versions across thousands of URLs is operationally painful compared to generating the tags in your CMS templates.
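If the headers are produced by application code rather than server configuration, the same alternates can be serialized into one comma-separated `Link` header value, which RFC 8288 allows as an alternative to repeating the header. This is a minimal sketch; `link_header` and the PDF URLs are assumptions made for illustration.

```python
def link_header(alternates: dict[str, str]) -> str:
    """Serialize hreflang alternates as a single Link header value."""
    parts = [
        f'<{url}>; rel="alternate"; hreflang="{code}"'
        for code, url in alternates.items()
    ]
    # Multiple link values may share one header when separated by commas.
    return ", ".join(parts)


# Hypothetical alternates for a PDF that exists in two languages.
PDF_ALTERNATES = {
    "en-US": "https://example.com/en-us/shoes.pdf",
    "de-DE": "https://example.com/de-de/schuhe.pdf",
}

print("Link:", link_header(PDF_ALTERNATES))
```

The returned string would be set as the value of the `Link` response header by whatever framework serves the file.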

Method 3: XML Sitemap

The XML sitemap approach is ideal for large sites with many language variants. Instead of adding hreflang to every page's HTML, you declare all the relationships in your sitemap. Google treats all three methods as equivalent, but the sitemap is usually the easiest to maintain once a site has more than a handful of language versions.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">

  <!-- English (US) version -->
  <url>
    <loc>https://example.com/en-us/shoes</loc>
    <xhtml:link rel="alternate" hreflang="en-US"
                href="https://example.com/en-us/shoes" />
    <xhtml:link rel="alternate" hreflang="en-GB"
                href="https://example.com/en-gb/shoes" />
    <xhtml:link rel="alternate" hreflang="de-DE"
                href="https://example.com/de-de/schuhe" />
    <xhtml:link rel="alternate" hreflang="fr-FR"
                href="https://example.com/fr-fr/chaussures" />
    <xhtml:link rel="alternate" hreflang="x-default"
                href="https://example.com/en-us/shoes" />
  </url>

  <!-- German (Germany) version -->
  <url>
    <loc>https://example.com/de-de/schuhe</loc>
    <xhtml:link rel="alternate" hreflang="en-US"
                href="https://example.com/en-us/shoes" />
    <xhtml:link rel="alternate" hreflang="en-GB"
                href="https://example.com/en-gb/shoes" />
    <xhtml:link rel="alternate" hreflang="de-DE"
                href="https://example.com/de-de/schuhe" />
    <xhtml:link rel="alternate" hreflang="fr-FR"
                href="https://example.com/fr-fr/chaussures" />
    <xhtml:link rel="alternate" hreflang="x-default"
                href="https://example.com/en-us/shoes" />
  </url>

  <!-- Repeat for each language version... -->
</urlset>
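Since each `<url>` entry repeats the full set of alternates, the sitemap is normally generated rather than hand-written. The following is a rough sketch using Python's standard `xml.etree.ElementTree`; the function `build_sitemap`, its `groups` input shape, and the `default_code` parameter are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_sitemap(groups: list[dict[str, str]], default_code: str = "en-US") -> str:
    """Build a sitemap where every URL lists all alternates in its group.

    Each group maps hreflang code -> URL for one set of equivalent pages.
    The x-default entry reuses the URL registered under default_code.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for group in groups:
        links = list(group.items())
        if default_code in group:
            links.append(("x-default", group[default_code]))
        for page_url in group.values():
            url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = page_url
            for code, href in links:
                ET.SubElement(
                    url_el,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": code, "href": href},
                )
    # encoding="unicode" returns a str without the XML declaration;
    # prepend the declaration when writing the file to disk.
    return ET.tostring(urlset, encoding="unicode")


SHOES = {
    "en-US": "https://example.com/en-us/shoes",
    "de-DE": "https://example.com/de-de/schuhe",
}
print(build_sitemap([SHOES]))
```

Generating every `<url>` entry from the same group guarantees that the annotations are bidirectional by construction.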

💡 Pro Tip: You can combine methods, but do not declare conflicting hreflang annotations across methods. If your HTML says the German alternate is /de/schuhe but your sitemap says it is /de-de/schuhe, Google will likely ignore both. Pick one method and stick with it, or ensure they are perfectly synchronized.

Common Hreflang Mistakes

Hreflang has a reputation for being difficult to implement correctly. Most of that reputation comes from a handful of mistakes that appear on the majority of international sites. Here are the errors that cause the most damage.

Missing Return Tags (Confirmation Links)

Hreflang annotations must be bidirectional. If Page A says "my German alternate is Page B," then Page B must also say "my English alternate is Page A." If either side of the link is missing, Google ignores the entire annotation for that pair.

<!-- WRONG: English page links to German, but German page
     does not link back to English -->

<!-- On https://example.com/en/shoes (English page): -->
<link rel="alternate" hreflang="en" href="https://example.com/en/shoes" />
<link rel="alternate" hreflang="de" href="https://example.com/de/schuhe" />

<!-- On https://example.com/de/schuhe (German page): -->
<link rel="alternate" hreflang="de" href="https://example.com/de/schuhe" />
<!-- Missing: link back to https://example.com/en/shoes! -->


<!-- CORRECT: Both pages link to each other, using absolute URLs -->

<!-- On https://example.com/en/shoes (English page): -->
<link rel="alternate" hreflang="en" href="https://example.com/en/shoes" />
<link rel="alternate" hreflang="de" href="https://example.com/de/schuhe" />

<!-- On https://example.com/de/schuhe (German page): -->
<link rel="alternate" hreflang="de" href="https://example.com/de/schuhe" />
<link rel="alternate" hreflang="en" href="https://example.com/en/shoes" />

This is the single most common hreflang error. It typically happens when different teams manage different language versions, or when a new language is added and existing pages are not updated to link to the new version.
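Once the annotations of each page have been collected (from a crawl export, for example), missing return tags can be found mechanically. The sketch below assumes a hypothetical `pages` structure that maps each page URL to the hreflang annotations declared on that page; the function name is also an assumption.

```python
def missing_return_links(pages: dict[str, dict[str, str]]) -> list[tuple[str, str]]:
    """Return (source, target) pairs where target does not link back to source.

    pages maps a page URL to the {hreflang: alternate URL} it declares.
    """
    missing = []
    for source, annotations in pages.items():
        for target in annotations.values():
            if target == source:
                continue  # self-reference needs no return link
            declared_on_target = pages.get(target, {})
            if source not in declared_on_target.values():
                missing.append((source, target))
    return missing


# Mirrors the WRONG example: the German page never links back to English.
PAGES = {
    "https://example.com/en/shoes": {
        "en": "https://example.com/en/shoes",
        "de": "https://example.com/de/schuhe",
    },
    "https://example.com/de/schuhe": {
        "de": "https://example.com/de/schuhe",
    },
}

for source, target in missing_return_links(PAGES):
    print(f"{target} does not link back to {source}")
```

A target that was never crawled is treated as having no annotations, so it is reported as missing its return link too.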

Wrong Language and Region Codes

Using incorrect ISO codes silently breaks your hreflang. Google will not show an error; it will simply ignore the annotation.

| Common Mistake | Incorrect Code | Correct Code | Why |
| --- | --- | --- | --- |
| Using country for language | hreflang="uk" | hreflang="en-GB" | "uk" is the ISO 639-1 code for Ukrainian, not a code for the United Kingdom |
| Wrong region code | hreflang="en-UK" | hreflang="en-GB" | The ISO 3166-1 code for United Kingdom is GB, not UK |
| Three-letter language code | hreflang="fra" | hreflang="fr" | Hreflang requires ISO 639-1 (two-letter), not 639-2 |
| Region without language | hreflang="DE" | hreflang="de-DE" | Language code is required; region alone is invalid |
| Wrong separator | hreflang="en_US" | hreflang="en-US" | Use hyphen, not underscore |
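Several of these mistakes can be caught with a simple shape check before deployment. The following is a sketch under stated assumptions: it only tests the form of the value (two-letter language, optional two-letter region) and one well-known wrong region, so it cannot tell whether a well-formed code such as "uk" or "DE" is the one you intended. A production check would compare against the full ISO 639-1 and ISO 3166-1 lists. The function name is hypothetical.

```python
import re
from typing import Optional

# Two-letter language, optionally followed by a hyphen and a two-letter region.
SHAPE = re.compile(r"^(?P<lang>[A-Za-z]{2})(?:-(?P<region>[A-Za-z]{2}))?$")


def hreflang_problem(value: str) -> Optional[str]:
    """Return a description of the problem, or None if the shape looks valid."""
    if value.lower() == "x-default":
        return None
    if "_" in value:
        return "use a hyphen, not an underscore"
    match = SHAPE.match(value)
    if match is None:
        return "expected a two-letter language and an optional two-letter region"
    if (match.group("region") or "").upper() == "UK":
        return "UK is not the ISO 3166-1 code for the United Kingdom; use GB"
    return None


for candidate in ["en-US", "en_US", "en-UK", "fra", "de"]:
    print(candidate, "->", hreflang_problem(candidate) or "ok")
```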

Misusing x-default

The x-default value should point to a single fallback page -- typically a language selector or your primary market's page. Common mistakes include:

⚠️ Warning: Never use hreflang on pages that return non-200 status codes. If your German page returns a 301 redirect to a different URL, the hreflang annotation pointing to that German URL is invalid. All hreflang URLs must resolve to a 200 status and must match the canonical URL of the page.

Hreflang Conflicts with Canonical Tags

This is a subtle but devastating mistake. If your English-US page has a canonical tag pointing to the English-UK page, the hreflang annotation for the US version is ignored. The canonical tag says "this page is a duplicate of the UK page," which directly contradicts the hreflang saying "this is the US alternate." Each language/region variant must be self-canonicalizing (its canonical tag points to itself).

<!-- WRONG: US page canonicalizes to UK page -->
<link rel="canonical" href="https://example.com/en-gb/shoes" />
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/shoes" />

<!-- CORRECT: US page is self-canonical -->
<link rel="canonical" href="https://example.com/en-us/shoes" />
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/shoes" />
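This conflict is easy to detect automatically because both tags live in the same `<head>`. Below is a minimal sketch built on Python's standard `html.parser`; the class `HeadLinks`, the function `canonical_conflict`, and the sample markup are illustrative assumptions.

```python
from html.parser import HTMLParser


class HeadLinks(HTMLParser):
    """Collect the canonical URL and hreflang alternates from <link> tags."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.alternates = {}  # hreflang value -> href

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attributes.get("href")
        elif "alternate" in rel and attributes.get("hreflang"):
            self.alternates[attributes["hreflang"]] = attributes.get("href")


def canonical_conflict(html: str, page_url: str) -> bool:
    """True when the page declares a canonical that is not its own URL."""
    parser = HeadLinks()
    parser.feed(html)
    return parser.canonical is not None and parser.canonical != page_url


US_URL = "https://example.com/en-us/shoes"
WRONG = (
    '<link rel="canonical" href="https://example.com/en-gb/shoes" />'
    '<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/shoes" />'
)
RIGHT = WRONG.replace("en-gb", "en-us")

print(canonical_conflict(WRONG, US_URL))  # True: canonical points at the UK page
print(canonical_conflict(RIGHT, US_URL))  # False: page is self-canonical
```

The comparison here is an exact string match; a stricter check might normalize trailing slashes or letter case first, depending on how your URLs are generated.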

How to Verify Hreflang with Log Analysis

Implementing hreflang is only half the battle. You need to verify that Googlebot is actually crawling and processing your hreflang annotations correctly. Server logs are the most reliable way to do this because they show you exactly what Googlebot sees, unfiltered by the abstractions in Google Search Console.

What to Look for in Your Logs

When Googlebot encounters hreflang annotations, its behavior changes in observable ways:

# Extract Googlebot requests to hreflang-related URLs from access logs
# Group by language directory to see crawl distribution

# Nginx combined log format
grep "Googlebot" access.log \
  | awk '{print $7}' \
  | grep -oP '^/[a-z]{2}(-[a-z]{2})?/' \
  | sort | uniq -c | sort -rn

# Example output:
#   4521 /en-us/
#   3892 /en-gb/
#   2104 /de-de/
#   1987 /fr-fr/
#   1543 /es-es/
#    312 /ja-jp/    <-- Significantly lower; investigate

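# Check for non-200 responses on alternate URLs
# (the path filter runs inside awk, because the printed line starts with
#  the status code and a later '^/' grep would never match)
grep "Googlebot" access.log \
  | awk '$9 != 200 && $7 ~ /^\/[a-z][a-z](-[a-z][a-z])?\// {print $9, $7}' \
  | sort | uniq -c | sort -rn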

💡 Pro Tip: LogBeast makes this analysis trivial. Load your access logs, filter by Googlebot user agent, and instantly see crawl distribution across your language directories. You can spot hreflang problems in seconds -- like a language variant that stopped being crawled three days ago -- without writing a single grep command.

Detecting Hreflang Failures in Logs

There are three red flags to watch for:

  1. Uneven crawl ratios: If your English pages get 10x more Googlebot hits than your equally-sized German section, Google may not be following your hreflang links. A healthy ratio should roughly correspond to the number of pages in each language section
  2. 404s on alternate URLs: Filter your logs for Googlebot requests that return 404 on URLs matching your language path pattern. These are broken hreflang references
  3. Redirect chains on alternates: If Googlebot hits a 301 when following an hreflang link, the annotation is effectively broken. Look for 3xx responses on your language-prefixed paths
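The three checks above can be combined into one pass over the log. The sketch below assumes the Nginx/Apache combined log format and language-prefixed paths such as /de-de/; the function name `crawl_summary` and the sample lines are invented for illustration, and matching on the user-agent string alone does not prove a request really came from Google.

```python
import re
from collections import Counter

# Request line and status code in the combined log format.
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')
# Language directory such as /en-us/ or /de/.
LANG_PREFIX = re.compile(r"^/([a-z]{2}(?:-[a-z]{2})?)/")


def crawl_summary(lines):
    """Count Googlebot hits per language prefix and non-200 hits per (prefix, status)."""
    hits, errors = Counter(), Counter()
    for line in lines:
        if "Googlebot" not in line:
            continue
        request = LOG_LINE.search(line)
        if request is None:
            continue
        prefix = LANG_PREFIX.match(request.group("path"))
        if prefix is None:
            continue
        lang = prefix.group(1)
        hits[lang] += 1
        if request.group("status") != "200":
            errors[(lang, request.group("status"))] += 1
    return hits, errors


BOT = '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
SAMPLE = [
    f'66.249.66.1 - - [10/Oct/2024:13:55:36 +0000] "GET /en-us/shoes HTTP/1.1" 200 5120 "-" {BOT}',
    f'66.249.66.1 - - [10/Oct/2024:13:55:40 +0000] "GET /de-de/schuhe HTTP/1.1" 200 4980 "-" {BOT}',
    f'66.249.66.1 - - [10/Oct/2024:13:55:44 +0000] "GET /de-de/alt HTTP/1.1" 404 312 "-" {BOT}',
    '203.0.113.9 - - [10/Oct/2024:13:56:01 +0000] "GET /en-us/shoes HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

hits, errors = crawl_summary(SAMPLE)
print(hits.most_common())    # crawl distribution per language directory
print(errors.most_common())  # 404s and 3xx responses worth investigating
```

Comparing `hits` against the number of pages in each language section gives the crawl ratio, while `errors` surfaces both broken references (404) and redirected alternates (3xx).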

Hreflang and Crawl Budget

Every language version you add multiplies the number of URLs Google needs to crawl. If you have 10,000 pages and 8 language versions, that is 80,000 URLs. If x-default points to a separate page, such as a language selector, rather than to one of those versions, that adds up to another 10,000, for 90,000. Google still needs to crawl all of them to validate the hreflang relationships, and every URL competes for your site's limited crawl budget.
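The scale is easy to estimate for your own site. The short sketch below reproduces the arithmetic and also counts the `<link>` annotations involved; the function names are assumptions, and the second calculation assumes every variant lists every alternate plus one x-default entry.

```python
def total_urls(pages: int, versions: int, separate_x_default: bool = False) -> int:
    """URLs to crawl: one per page per language version, plus optional x-default pages."""
    return pages * (versions + (1 if separate_x_default else 0))


def total_annotations(pages: int, versions: int, with_x_default: bool = True) -> int:
    """<link> annotations across the site when every variant lists all alternates."""
    per_variant = versions + (1 if with_x_default else 0)
    return pages * versions * per_variant


print(total_urls(10_000, 8))                           # 80000
print(total_urls(10_000, 8, separate_x_default=True))  # 90000
print(total_annotations(10_000, 8))                    # 720000
```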

Strategies to Manage Crawl Budget with Hreflang

🔑 Key Insight: For most sites under 100,000 pages, crawl budget is not a practical concern -- Google will crawl everything. Crawl budget becomes critical for sites with millions of pages across many language versions. If you are running a large international e-commerce site, the combination of hreflang and crawl budget optimization can mean the difference between thousands of pages being indexed or orphaned. Use LogBeast to monitor your crawl budget allocation across language segments.

Testing and Validation Tools

Before deploying hreflang to production, validate your implementation. After deploying, monitor it continuously. Here are the tools that matter.

Google Search Console

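The International Targeting report in GSC has historically shown hreflang errors that Google detected during crawling. Google deprecated that report in 2022, so confirm whether it is still available for your property before relying on it. Common errors reported include: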

GSC is useful but has a significant lag -- errors may take days or weeks to appear. It also does not show all errors, only a sample. For comprehensive validation, you need additional tools.

Crawl-Based Validation

Tools like CrawlBeast, Screaming Frog, and Sitebulb can crawl your site and validate hreflang at scale. A crawl-based audit will catch:

Manual Spot Checks

For quick validation of individual pages, view the page source and search for hreflang. Verify that:

  1. The page references itself with the correct hreflang value
  2. All alternate URLs are absolute (not relative)
  3. All alternate URLs return 200
  4. The language/region codes are valid ISO codes
  5. There is exactly one x-default (if used)

# Quick command-line validation: fetch a page and extract hreflang tags
curl -s https://example.com/en-us/shoes | grep -i hreflang

# Validate that all hreflang URLs return 200:
# isolate the <link> tags that carry hreflang, pull out each href value,
# drop duplicates (x-default usually repeats one alternate), then request each URL
curl -s https://example.com/en-us/shoes \
  | grep -oiP '<link[^>]*hreflang=[^>]*>' \
  | grep -oP 'href="\K[^"]+' \
  | awk '!seen[$0]++' \
  | while read -r url; do
      status=$(curl -o /dev/null -s -w "%{http_code}" "$url")
      echo "$status $url"
    done

# Expected output (all should be 200):
# 200 https://example.com/en-us/shoes
# 200 https://example.com/en-gb/shoes
# 200 https://example.com/de-de/schuhe
# 200 https://example.com/fr-fr/chaussures
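The checks that do not need a network request (self-reference, absolute URLs, a single x-default) can also be scripted against saved page source. This is a rough sketch using only the standard library; `HreflangCollector`, `spot_check`, and the sample markup are hypothetical names introduced here, and the status-code and ISO-code checks from the list are deliberately left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class HreflangCollector(HTMLParser):
    """Collect (hreflang, href) pairs from <link> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("hreflang"):
            self.links.append((attributes["hreflang"], attributes.get("href") or ""))


def spot_check(html: str, page_url: str) -> list[str]:
    """Return a list of problems found in the page's hreflang tags."""
    collector = HreflangCollector()
    collector.feed(html)
    problems = []
    # 1. The page must reference itself under a real language code.
    if not any(
        href == page_url and code.lower() != "x-default"
        for code, href in collector.links
    ):
        problems.append("page does not reference itself")
    # 2. Every alternate URL must be absolute.
    for code, href in collector.links:
        parsed = urlparse(href)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"{code}: URL is not absolute: {href}")
    # 3. At most one x-default.
    if sum(1 for code, _ in collector.links if code.lower() == "x-default") > 1:
        problems.append("more than one x-default")
    return problems


SAMPLE_HTML = """
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/shoes" />
<link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/schuhe" />
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/shoes" />
"""

print(spot_check(SAMPLE_HTML, "https://example.com/en-us/shoes"))  # []
```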

💡 Pro Tip: Set up a weekly automated crawl of your hreflang pages using CrawlBeast. Hreflang errors often appear silently when a page is deleted, a redirect is added, or a new language version is launched without updating all existing pages. Catching these within a week prevents months of lost international traffic.

Real-World Examples: Multi-Region Setup

Theory is useful, but seeing how real multi-region sites structure their hreflang is more instructive. Here are three common setups with their complete hreflang implementations.

Example 1: Subdirectory Structure

The most common approach for mid-size international sites. All language versions live under a single domain with path-based segmentation.

<!-- Site: example.com
     Markets: US (English), UK (English), Germany (German),
              France (French), Spain (Spanish), Japan (Japanese)
     Default: US English -->

<!-- On every page in every language version: -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/products/running-shoes" />
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/products/running-shoes" />
<link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/produkte/laufschuhe" />
<link rel="alternate" hreflang="fr-FR" href="https://example.com/fr-fr/produits/chaussures-de-course" />
<link rel="alternate" hreflang="es-ES" href="https://example.com/es-es/productos/zapatillas-running" />
<link rel="alternate" hreflang="ja-JP" href="https://example.com/ja-jp/products/running-shoes" />
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/products/running-shoes" />

This structure works well because all language versions share the same domain authority, the URL pattern makes the target market obvious, and a single XML sitemap can declare all hreflang relationships.

Example 2: ccTLD Structure

Enterprise sites often use country-code top-level domains for each market. This gives the strongest geo-targeting signal but requires more complex hreflang management because each domain is treated as a separate site by Google.

<!-- Site: example.com (US), example.co.uk (UK),
          example.de (Germany), example.fr (France) -->

<!-- On example.com/products/running-shoes: -->
<link rel="alternate" hreflang="en-US" href="https://example.com/products/running-shoes" />
<link rel="alternate" hreflang="en-GB" href="https://example.co.uk/products/running-shoes" />
<link rel="alternate" hreflang="de-DE" href="https://example.de/produkte/laufschuhe" />
<link rel="alternate" hreflang="fr-FR" href="https://example.fr/produits/chaussures-de-course" />
<link rel="alternate" hreflang="x-default" href="https://example.com/products/running-shoes" />

<!-- On example.de/produkte/laufschuhe (must mirror all links): -->
<link rel="alternate" hreflang="en-US" href="https://example.com/products/running-shoes" />
<link rel="alternate" hreflang="en-GB" href="https://example.co.uk/products/running-shoes" />
<link rel="alternate" hreflang="de-DE" href="https://example.de/produkte/laufschuhe" />
<link rel="alternate" hreflang="fr-FR" href="https://example.fr/produits/chaussures-de-course" />
<link rel="alternate" hreflang="x-default" href="https://example.com/products/running-shoes" />

With ccTLDs, the XML sitemap method becomes especially valuable. Each domain has its own sitemap, but they all need to reference URLs on the other domains. A common pattern is to generate a master hreflang sitemap centrally and deploy it to all domains simultaneously.

Example 3: Handling Language Without Region

Not every site needs regional targeting. A blog that publishes in English, Spanish, and Japanese -- but does not differentiate between regions -- can use language-only hreflang codes.

<!-- Language-only targeting (no regional variants) -->
<link rel="alternate" hreflang="en" href="https://blog.example.com/en/guide" />
<link rel="alternate" hreflang="es" href="https://blog.example.com/es/guia" />
<link rel="alternate" hreflang="ja" href="https://blog.example.com/ja/guide" />
<link rel="alternate" hreflang="x-default" href="https://blog.example.com/en/guide" />

This is simpler to maintain and appropriate when your content does not vary by region within the same language. Use language-plus-region codes only when you actually have region-specific differences (pricing, product availability, local regulations, spelling differences).

🔑 Key Insight: During a site migration, hreflang is often the first thing that breaks and the last thing that gets fixed. If you are migrating an international site, map every old hreflang URL to its new equivalent before the migration. After launch, validate all hreflang pairs within 24 hours using a full-site crawl. Broken hreflang during migration can cause dramatic ranking losses in non-primary markets that take months to recover.
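The mapping step can be sketched as a small function that rewrites each alternate through the migration's URL map and reports anything left unmapped. The names `remap_alternates`, `URL_MAP`, and the example URLs are assumptions for illustration; a real migration would load the map from the redirect plan.

```python
def remap_alternates(
    alternates: dict[str, str], url_map: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Rewrite hreflang URLs through an old-to-new map.

    Returns the remapped alternates and the old URLs that have no new equivalent.
    """
    remapped, unmapped = {}, []
    for code, old_url in alternates.items():
        if old_url in url_map:
            remapped[code] = url_map[old_url]
        else:
            unmapped.append(old_url)
    return remapped, unmapped


# Hypothetical pre-migration annotations and redirect plan.
OLD_ALTERNATES = {
    "en-US": "https://example.com/en-us/shoes",
    "de-DE": "https://example.com/de-de/schuhe",
}
URL_MAP = {
    "https://example.com/en-us/shoes": "https://example.com/us/shoes",
}

new_alternates, unmapped = remap_alternates(OLD_ALTERNATES, URL_MAP)
print(new_alternates)
print("Needs a mapping before launch:", unmapped)
```

Any URL in the unmapped list would end up as an hreflang reference to a redirected or missing page after launch, which is exactly the failure described above.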

Putting It All Together: An Implementation Checklist

  1. Audit your current state: Crawl your site to identify all existing hreflang annotations and errors
  2. Choose one implementation method: HTML link tags for small sites, XML sitemaps for large sites, HTTP headers for non-HTML resources
  3. Use correct ISO codes: Validate every language and region code against ISO 639-1 and ISO 3166-1
  4. Ensure bidirectional links: Every page must reference all its alternates, and every alternate must reference it back
  5. Add x-default: Point it to your primary market page or a language selector that returns 200
  6. Self-canonicalize: Every language variant must have a canonical tag pointing to itself
  7. Validate before launch: Use CrawlBeast to crawl and verify all hreflang pairs
  8. Monitor continuously: Check Google Search Console for hreflang errors weekly, and use LogBeast to monitor Googlebot crawl distribution across language segments
