SEO and Metadata Optimization

1. Essential Meta Tags and Document Info

| Meta Tag | Syntax | Purpose | Required |
| --- | --- | --- | --- |
| charset | <meta charset="UTF-8"> | Character encoding (must be in first 1024 bytes) | ✅ Yes |
| viewport | <meta name="viewport" content="width=device-width, initial-scale=1.0"> | Responsive design, mobile optimization | ✅ Yes (for mobile) |
| description | <meta name="description" content="..."> | Page summary in search results (150-160 chars) | ✅ Highly recommended |
| keywords | <meta name="keywords" content="..."> | Search keywords (DEPRECATED) | ❌ No (ignored by Google) |
| author | <meta name="author" content="..."> | Document author information | ❌ Optional |
| robots | <meta name="robots" content="index, follow"> | Search engine crawling instructions | ❌ Optional |
| theme-color | <meta name="theme-color" content="#ffffff"> | Browser UI color (mobile) | ❌ Optional |

| Robots Values | Meaning | Use Case |
| --- | --- | --- |
| index | Allow indexing page in search results | Default - public content |
| noindex | Prevent indexing (hide from search) | Private pages, duplicates, staging |
| follow | Follow links on page | Default - pass link equity |
| nofollow | Don't follow links | User-generated content, paid links |
| noarchive | Don't cache page | Frequently updated content |
| nosnippet | Don't show text snippet in results | Control preview display |
| noimageindex | Don't index images | Protect image content |

| Link Tag | Syntax | Purpose |
| --- | --- | --- |
| icon/favicon | <link rel="icon" href="favicon.ico"> | Browser tab icon, bookmark icon |
| apple-touch-icon | <link rel="apple-touch-icon" sizes="180x180" href="icon.png"> | iOS home screen icon |
| manifest | <link rel="manifest" href="/manifest.json"> | PWA manifest for app info |
| alternate | <link rel="alternate" hreflang="es" href="..."> | Language/regional variants |

Example: Complete SEO-optimized head section

<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Character encoding (must appear within the first 1024 bytes, so place it first) -->
  <meta charset="UTF-8">
  
  <!-- Viewport for responsive design -->
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  
  <!-- Page title (50-60 chars, unique per page) -->
  <title>Best Practices for HTML SEO | Complete Guide 2025</title>
  
  <!-- Meta description (up to about 160 chars, unique per page) -->
  <meta name="description" content="Learn essential HTML SEO best practices including meta tags, structured data, and optimization techniques to improve search rankings.">
  
  <!-- Author information -->
  <meta name="author" content="John Doe">
  
  <!-- Robots directives (optional - default is index,follow) -->
  <meta name="robots" content="index, follow, max-image-preview:large">
  
  <!-- Specific bot directives -->
  <meta name="googlebot" content="index, follow">
  
  <!-- Theme color for mobile browsers -->
  <meta name="theme-color" content="#ffffff">
  
  <!-- Favicons -->
  <link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png">
  <link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png">
  <link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon.png">
  
  <!-- PWA Manifest -->
  <link rel="manifest" href="/manifest.json">
  
  <!-- Language alternatives -->
  <link rel="alternate" hreflang="en" href="https://example.com/en/">
  <link rel="alternate" hreflang="es" href="https://example.com/es/">
  <link rel="alternate" hreflang="x-default" href="https://example.com/">
  
  <!-- Canonical URL (prevent duplicate content) -->
  <link rel="canonical" href="https://example.com/seo-guide">
</head>
<body>
  <!-- Content here -->
</body>
</html>
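Best Practices: Keep the title under about 60 characters and the description around 150-160 characters, each unique per page; longer text is typically truncated in search results. Use UTF-8 encoding. Include the viewport meta tag for mobile. Skip the keywords meta tag (Google ignores it). Use semantic HTML for better crawling. Of the tags above, the title and description matter most, because search engines commonly use them for the headline and snippet of a result.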
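
The length guidelines above can be checked automatically. The following is a minimal sketch using only Python's standard library; the names `HeadAudit` and `audit_head` are hypothetical, and the 60 and 160 character limits are the guideline values from this section, not limits enforced by any search engine.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collects the <title> text and the meta description from an HTML page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""
        self.description = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.description = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_head(html, max_title=60, max_description=160):
    """Return a list of warnings (empty when the guidelines are met)."""
    parser = HeadAudit()
    parser.feed(html)
    parser.close()
    warnings = []
    title = parser.title.strip()
    if not title:
        warnings.append("missing <title>")
    elif len(title) > max_title:
        warnings.append(f"title is {len(title)} chars (guideline: <= {max_title})")
    if parser.description is None:
        warnings.append("missing meta description")
    elif len(parser.description) > max_description:
        warnings.append(
            f"description is {len(parser.description)} chars "
            f"(guideline: <= {max_description})"
        )
    return warnings
```

Calling `audit_head(page_source)` on each template during a build is one way to catch overlong or missing titles before they ship.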

2. Open Graph Protocol Implementation

| Property | Required | Example | Purpose |
| --- | --- | --- | --- |
| og:title | ✅ Yes | "The Best HTML Guide" | Title when shared (can differ from <title>) |
| og:type | ✅ Yes | "article", "website", "video" | Content type classification |
| og:url | ✅ Yes | "https://example.com/page" | Canonical URL for this page |
| og:image | ✅ Yes | "https://example.com/image.jpg" | Preview image (1200x630px recommended) |
| og:description | ❌ Optional | "Article summary..." | Description when shared |
| og:site_name | ❌ Optional | "My Website" | Name of overall site |
| og:locale | ❌ Optional | "en_US" | Content language/region |

| Article Properties | Syntax | Purpose |
| --- | --- | --- |
| article:published_time | <meta property="article:published_time" content="2025-12-22T10:00:00Z"> | Publication date (ISO 8601) |
| article:modified_time | <meta property="article:modified_time" content="2025-12-23T15:30:00Z"> | Last modified date |
| article:author | <meta property="article:author" content="https://facebook.com/author"> | Author profile URL |
| article:section | <meta property="article:section" content="Technology"> | Content category/section |
| article:tag | <meta property="article:tag" content="HTML"> | Keywords/tags (multiple allowed) |

| Image Properties | Purpose | Recommended |
| --- | --- | --- |
| og:image:width | Image width in pixels | 1200px |
| og:image:height | Image height in pixels | 630px |
| og:image:alt | Alt text for image | Descriptive text |
| og:image:type | MIME type | image/jpeg, image/png |

Example: Complete Open Graph implementation

<head>
  <!-- Basic Open Graph (required) -->
  <meta property="og:title" content="Complete Guide to HTML SEO in 2025">
  <meta property="og:type" content="article">
  <meta property="og:url" content="https://example.com/html-seo-guide">
  <meta property="og:image" content="https://example.com/images/seo-guide-share.jpg">
  
  <!-- Optional but recommended -->
  <meta property="og:description" content="Learn everything about HTML SEO including meta tags, structured data, and optimization techniques.">
  <meta property="og:site_name" content="Web Development Hub">
  <meta property="og:locale" content="en_US">
  <meta property="og:locale:alternate" content="es_ES">
  
  <!-- Image metadata -->
  <meta property="og:image:width" content="1200">
  <meta property="og:image:height" content="630">
  <meta property="og:image:alt" content="HTML SEO Guide Cover Image">
  <meta property="og:image:type" content="image/jpeg">
  
  <!-- Article-specific metadata -->
  <meta property="article:published_time" content="2025-12-22T10:00:00Z">
  <meta property="article:modified_time" content="2025-12-23T15:30:00Z">
  <meta property="article:author" content="https://facebook.com/johndoe">
  <meta property="article:section" content="Web Development">
  <meta property="article:tag" content="HTML">
  <meta property="article:tag" content="SEO">
  <meta property="article:tag" content="Web Development">
  
  <!-- Facebook App ID (for insights) -->
  <meta property="fb:app_id" content="123456789">
</head>

Example: Different content types

<!-- Video content -->
<meta property="og:type" content="video.movie">
<meta property="og:video" content="https://example.com/video.mp4">
<meta property="og:video:type" content="video/mp4">
<meta property="og:video:width" content="1280">
<meta property="og:video:height" content="720">

<!-- Music content -->
<meta property="og:type" content="music.song">
<meta property="music:duration" content="240">
<meta property="music:musician" content="https://example.com/artist">

<!-- Product -->
<meta property="og:type" content="product">
<meta property="product:price:amount" content="29.99">
<meta property="product:price:currency" content="USD">

<!-- Profile/Person -->
<meta property="og:type" content="profile">
<meta property="profile:first_name" content="John">
<meta property="profile:last_name" content="Doe">
<meta property="profile:username" content="johndoe">
Testing: Use the Facebook Sharing Debugger (developers.facebook.com/tools/debug/) to test Open Graph tags. LinkedIn also reads OG tags. The recommended image size is 1200x630px (1.91:1 ratio). URLs must be absolute, not relative. Open Graph tags put the name in the property attribute, not the name attribute.
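
The rules above (four required properties, absolute URLs, `property` rather than `name`) can be checked locally before using a debugger. The following is a minimal sketch using Python's standard library; `OpenGraphCollector` and `check_open_graph` are hypothetical names, and the check covers only the rules stated in this section.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

REQUIRED_OG = ("og:title", "og:type", "og:url", "og:image")


class OpenGraphCollector(HTMLParser):
    """Collects og:* values from <meta property="..."> tags."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property")  # tags that use name="og:..." are not counted
        if prop and prop.startswith("og:"):
            # Keep the first occurrence of each property.
            self.tags.setdefault(prop, attr.get("content") or "")


def check_open_graph(html):
    """Return a list of problems (empty when the basic rules are met)."""
    collector = OpenGraphCollector()
    collector.feed(html)
    collector.close()
    problems = [f"missing {prop}" for prop in REQUIRED_OG if prop not in collector.tags]
    for prop in ("og:url", "og:image"):
        value = collector.tags.get(prop)
        if value is not None:
            parts = urlparse(value)
            if parts.scheme not in ("http", "https") or not parts.netloc:
                problems.append(f"{prop} is not an absolute URL: {value}")
    return problems
```

A tag written as `<meta name="og:type" ...>` is reported as missing, which mirrors the attribute rule above.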

3. Twitter Card Metadata

| Twitter Card Type | Use Case | Required Tags |
| --- | --- | --- |
| summary | Default card with small image | title, description, image (144x144 min, 1:1 ratio) |
| summary_large_image | Large image card (most common) | title, description, image (300x157 min, 2:1 ratio) |
| app | Mobile app promotion | app name, ID, country |
| player | Video/audio player | player URL, width, height |

| Twitter Meta Tag | Syntax | Purpose |
| --- | --- | --- |
| twitter:card | <meta name="twitter:card" content="summary_large_image"> | Card type selection |
| twitter:site | <meta name="twitter:site" content="@username"> | Site's Twitter handle |
| twitter:creator | <meta name="twitter:creator" content="@author"> | Content author's Twitter handle |
| twitter:title | <meta name="twitter:title" content="..."> | Title (falls back to og:title) |
| twitter:description | <meta name="twitter:description" content="..."> | Description (falls back to og:description) |
| twitter:image | <meta name="twitter:image" content="..."> | Image URL (falls back to og:image) |
| twitter:image:alt | <meta name="twitter:image:alt" content="..."> | Image alt text for accessibility |

Example: Twitter Card with fallback to Open Graph

<head>
  <!-- Open Graph tags (used by Twitter as fallback) -->
  <meta property="og:title" content="Complete HTML SEO Guide 2025">
  <meta property="og:description" content="Master HTML SEO with meta tags, structured data, and best practices.">
  <meta property="og:image" content="https://example.com/images/share-image.jpg">
  <meta property="og:url" content="https://example.com/seo-guide">
  <meta property="og:type" content="article">
  
  <!-- Twitter-specific tags -->
  <meta name="twitter:card" content="summary_large_image">
  <meta name="twitter:site" content="@mywebsite">
  <meta name="twitter:creator" content="@johndoe">
  
  <!-- Optional: Override OG values for Twitter -->
  <meta name="twitter:title" content="HTML SEO Guide - Twitter Optimized">
  <meta name="twitter:description" content="Everything you need to know about HTML SEO in this comprehensive guide.">
  <meta name="twitter:image" content="https://example.com/images/twitter-card.jpg">
  <meta name="twitter:image:alt" content="Colorful diagram showing HTML SEO concepts">
</head>

Example: Twitter Player Card for video

<head>
  <!-- Player card for video content -->
  <meta name="twitter:card" content="player">
  <meta name="twitter:site" content="@myvideosite">
  <meta name="twitter:title" content="HTML Tutorial - Part 1">
  <meta name="twitter:description" content="Introduction to HTML basics">
  
  <!-- Player configuration -->
  <meta name="twitter:player" content="https://example.com/player.html">
  <meta name="twitter:player:width" content="1280">
  <meta name="twitter:player:height" content="720">
  
  <!-- Player stream (optional) -->
  <meta name="twitter:player:stream" content="https://example.com/video.mp4">
  <meta name="twitter:player:stream:content_type" content="video/mp4">
  
  <!-- Fallback image -->
  <meta name="twitter:image" content="https://example.com/thumbnail.jpg">
</head>

Example: Twitter App Card

<head>
  <!-- App card for mobile app promotion -->
  <meta name="twitter:card" content="app">
  <meta name="twitter:site" content="@myapp">
  <meta name="twitter:description" content="Download our amazing app!">
  
  <!-- iOS app info -->
  <meta name="twitter:app:name:iphone" content="My App">
  <meta name="twitter:app:id:iphone" content="123456789">
  <meta name="twitter:app:url:iphone" content="myapp://action">
  
  <!-- Android app info -->
  <meta name="twitter:app:name:googleplay" content="My App">
  <meta name="twitter:app:id:googleplay" content="com.example.myapp">
  <meta name="twitter:app:url:googleplay" content="myapp://action">
  
  <!-- iPad app info (optional) -->
  <meta name="twitter:app:name:ipad" content="My App HD">
  <meta name="twitter:app:id:ipad" content="987654321">
  <meta name="twitter:app:url:ipad" content="myapp://action">
</head>
Testing & Best Practices: Use the Twitter Card Validator (cards-dev.twitter.com/validator) to test. Twitter falls back to Open Graph for the title, description, and image when the Twitter tags are missing. summary_large_image is the most popular card type (2:1 ratio, min 300x157px). Include twitter:image:alt for accessibility. Twitter tags use the name attribute, not property.
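
The fallback behavior described above can be modeled in a few lines. This is an illustrative sketch only: `FALLBACKS` and `resolve_card_fields` are hypothetical names, and the mapping covers just the three fallbacks listed in the table in this section, not Twitter's full resolution logic.

```python
# Twitter tag -> Open Graph tag used when the Twitter tag is absent.
FALLBACKS = {
    "twitter:title": "og:title",
    "twitter:description": "og:description",
    "twitter:image": "og:image",
}


def resolve_card_fields(meta):
    """meta maps tag names/properties to content values; returns the effective card fields."""
    resolved = {}
    for twitter_key, og_key in FALLBACKS.items():
        # Prefer the Twitter-specific value, otherwise use the Open Graph one.
        value = meta.get(twitter_key) or meta.get(og_key)
        if value:
            resolved[twitter_key] = value
    return resolved
```

In practice this means a page with complete Open Graph tags needs only `twitter:card` (plus optional `twitter:site` and `twitter:creator`) unless it wants different text or images on Twitter.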

4. Structured Data (JSON-LD, Microdata)

| Format | Syntax | Placement | Recommendation |
| --- | --- | --- | --- |
| JSON-LD | <script type="application/ld+json"> | In <head> or <body> | ✅ Google's preferred format |
| Microdata | itemscope, itemprop attributes | Inline with HTML elements | ⚠️ Valid but verbose |
| RDFa | vocab, property attributes | Inline with HTML elements | ❌ Less common |

| Schema.org Type | Use Case | Key Properties |
| --- | --- | --- |
| Article | Blog posts, news articles | headline, author, datePublished, image |
| Product | E-commerce items | name, image, offers, aggregateRating |
| Organization | Company information | name, logo, url, contactPoint, sameAs |
| Person | Individual profiles | name, image, jobTitle, worksFor, sameAs |
| BreadcrumbList | Navigation breadcrumbs | itemListElement, position, name, item |
| Recipe | Cooking recipes | name, recipeIngredient, recipeInstructions |
| Event | Concerts, conferences | name, startDate, location, offers |
| FAQPage | FAQ sections | mainEntity (Question/Answer pairs) |
| LocalBusiness | Physical businesses | name, address, telephone, openingHours |

Example: Article structured data (JSON-LD)

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Complete Guide to HTML SEO in 2025",
  "alternativeHeadline": "Master HTML SEO Best Practices",
  "image": [
    "https://example.com/images/article-1x1.jpg",
    "https://example.com/images/article-4x3.jpg",
    "https://example.com/images/article-16x9.jpg"
  ],
  "datePublished": "2025-12-22T08:00:00+00:00",
  "dateModified": "2025-12-23T09:30:00+00:00",
  "author": {
    "@type": "Person",
    "name": "John Doe",
    "url": "https://example.com/author/john-doe",
    "image": "https://example.com/images/john-doe.jpg"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Web Dev Hub",
    "logo": {
      "@type": "ImageObject",
      "url": "https://example.com/logo.png",
      "width": 600,
      "height": 60
    }
  },
  "description": "Learn everything about HTML SEO including meta tags, structured data, and optimization techniques.",
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://example.com/html-seo-guide"
  },
  "articleSection": "Web Development",
  "keywords": ["HTML", "SEO", "Web Development", "Meta Tags"],
  "wordCount": 2500,
  "inLanguage": "en-US"
}
</script>

Example: Product with reviews (JSON-LD)

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Premium Wireless Headphones",
  "image": [
    "https://example.com/products/headphones-front.jpg",
    "https://example.com/products/headphones-side.jpg"
  ],
  "description": "High-quality wireless headphones with noise cancellation",
  "sku": "WH-12345",
  "mpn": "925872",
  "brand": {
    "@type": "Brand",
    "name": "AudioPro"
  },
  "offers": {
    "@type": "Offer",
    "url": "https://example.com/products/headphones",
    "priceCurrency": "USD",
    "price": "299.99",
    "priceValidUntil": "2025-12-31",
    "availability": "https://schema.org/InStock",
    "seller": {
      "@type": "Organization",
      "name": "Example Store"
    },
    "itemCondition": "https://schema.org/NewCondition"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.7",
    "reviewCount": "234",
    "bestRating": "5",
    "worstRating": "1"
  },
  "review": [
    {
      "@type": "Review",
      "reviewRating": {
        "@type": "Rating",
        "ratingValue": "5",
        "bestRating": "5"
      },
      "author": {
        "@type": "Person",
        "name": "Jane Smith"
      },
      "datePublished": "2025-12-15",
      "reviewBody": "Excellent sound quality and comfortable to wear!"
    }
  ]
}
</script>

Example: Breadcrumb navigation (JSON-LD)

<!-- The last ListItem (the current page) may omit "item" -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "name": "Home",
      "item": "https://example.com"
    },
    {
      "@type": "ListItem",
      "position": 2,
      "name": "Electronics",
      "item": "https://example.com/electronics"
    },
    {
      "@type": "ListItem",
      "position": 3,
      "name": "Headphones",
      "item": "https://example.com/electronics/headphones"
    },
    {
      "@type": "ListItem",
      "position": 4,
      "name": "Wireless Headphones"
    }
  ]
}
</script>

Example: FAQ page (JSON-LD)

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is HTML?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "HTML (HyperText Markup Language) is the standard markup language for creating web pages. It describes the structure of a web page semantically."
      }
    },
    {
      "@type": "Question",
      "name": "How do I add meta tags to HTML?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "<p>Meta tags are added in the <code>&lt;head&gt;</code> section of your HTML document:</p><pre>&lt;meta name=&quot;description&quot; content=&quot;Page description&quot;&gt;</pre>"
      }
    },
    {
      "@type": "Question",
      "name": "Why is SEO important?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "SEO helps your website rank higher in search engine results, increasing visibility and organic traffic to your site."
      }
    }
  ]
}
</script>

Example: Microdata format (alternative to JSON-LD)

<!-- Article using Microdata (inline) -->
<article itemscope itemtype="https://schema.org/Article">
  <h1 itemprop="headline">Complete Guide to HTML SEO</h1>
  
  <img itemprop="image" src="article-image.jpg" alt="SEO Guide">
  
  <p>By <span itemprop="author" itemscope itemtype="https://schema.org/Person">
    <span itemprop="name">John Doe</span>
  </span></p>
  
  <p>Published: <time itemprop="datePublished" datetime="2025-12-22T08:00:00+00:00">December 22, 2025</time></p>
  
  <div itemprop="articleBody">
    <p>Article content goes here...</p>
  </div>
  
  <div itemprop="publisher" itemscope itemtype="https://schema.org/Organization">
    <meta itemprop="name" content="Web Dev Hub">
    <div itemprop="logo" itemscope itemtype="https://schema.org/ImageObject">
      <meta itemprop="url" content="https://example.com/logo.png">
    </div>
  </div>
</article>
Testing: Use Google's Rich Results Test (search.google.com/test/rich-results) and Schema Markup Validator (validator.schema.org). JSON-LD is easier to maintain than Microdata. Multiple schema types can coexist on one page. Keep structured data in sync with visible content.
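
Because JSON-LD must be strictly valid JSON (no comments or trailing commas), generating it from data is often safer than writing it by hand. The following is a minimal sketch that builds a BreadcrumbList like the earlier example; `breadcrumb_jsonld` and its `trail` parameter are hypothetical names, and omitting `item` for an entry without a URL follows the convention used in that example.

```python
import json


def breadcrumb_jsonld(trail):
    """trail: list of (name, url) pairs, root first; url may be None for the current page."""
    items = []
    for position, (name, url) in enumerate(trail, start=1):
        item = {"@type": "ListItem", "position": position, "name": name}
        if url:
            item["item"] = url
        items.append(item)
    data = {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": items,
    }
    # Escape "<" so a value containing "</script>" cannot close the element early.
    payload = json.dumps(data, indent=2).replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

For example, `breadcrumb_jsonld([("Home", "https://example.com"), ("Headphones", None)])` returns a script element whose second list item has no `item` property.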

5. Canonical URLs and Duplicate Content

| Tag/Method | Syntax | Purpose |
| --- | --- | --- |
| Canonical Link | <link rel="canonical" href="https://example.com/page"> | Specify preferred version of page |
| Self-referencing | Canonical points to itself | Prevent parameter-based duplicates |
| Cross-domain | Canonical points to different domain | Syndicated content attribution |

| Duplicate Content Scenario | Solution | Example |
| --- | --- | --- |
| URL Parameters | Use canonical to main version | ?sort=price, ?ref=twitter → canonical to base URL |
| WWW vs non-WWW | Canonical + 301 redirect | www.example.com → example.com |
| HTTP vs HTTPS | Canonical to HTTPS + redirect | http:// → https:// |
| Trailing Slash | Choose one version | /page/ → /page or vice versa |
| Paginated Content | Self-referencing canonical on each page (Google no longer uses rel="next"/"prev" for indexing) | Each page canonical to itself |
| Print/Mobile Versions | Canonical to desktop version | /article?print=1 → /article |
| Syndicated Content | Canonical to original source | Republished article → original publisher |

Example: Canonical URL implementations

<!-- Self-referencing canonical (best practice for all pages) -->
<link rel="canonical" href="https://example.com/products/shoes">

<!-- URL with parameters points to clean version -->
<!-- On page: https://example.com/products?category=shoes&sort=price -->
<link rel="canonical" href="https://example.com/products/shoes">

<!-- Paginated content (each page canonical to itself) -->
<!-- On page: https://example.com/blog?page=2 -->
<link rel="canonical" href="https://example.com/blog?page=2">
<!-- rel="prev"/"next" are optional: Google no longer uses them for indexing, though other consumers may -->
<link rel="prev" href="https://example.com/blog?page=1">
<link rel="next" href="https://example.com/blog?page=3">

<!-- Mobile version points to desktop -->
<!-- On page: https://m.example.com/article -->
<link rel="canonical" href="https://example.com/article">

<!-- Cross-domain canonical (syndicated content) -->
<!-- On syndicated site -->
<link rel="canonical" href="https://originalpublisher.com/article">

<!-- AMP version points to canonical -->
<!-- On AMP page: https://example.com/article/amp -->
<link rel="canonical" href="https://example.com/article">

Example: Complete duplicate content strategy

<head>
  <!-- Always use absolute URLs -->
  <link rel="canonical" href="https://example.com/seo-guide">
  
  <!-- Consistent protocol (HTTPS) -->
  <!-- NOT: http://example.com/seo-guide -->
  
  <!-- Consistent domain -->
  <!-- NOT: https://www.example.com/seo-guide -->
  
  <!-- Consistent trailing slash policy -->
  <!-- Choose /seo-guide OR /seo-guide/ (be consistent) -->
  
  <!-- Consistent case -->
  <!-- NOT: https://example.com/SEO-Guide -->
  
  <!-- Remove unnecessary parameters -->
  <!-- NOT: https://example.com/seo-guide?ref=twitter -->
</head>
Important Rules: Use an absolute canonical URL (including https://). Use only one canonical tag per page. A canonical is a hint, not a directive, so Google may ignore it. Point a canonical only at the same or very similar content, never at different content. Combine with 301 redirects for moved content.
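
The consistency rules above (one protocol, one host form, one trailing-slash policy, one case, no tracking parameters) can be applied in one place when generating canonical tags. The following is a minimal sketch using Python's standard library. `canonical_url` and `TRACKING_PARAMS` are hypothetical names, the parameter list is an illustrative assumption, and lowercasing the path assumes the site serves lowercase URLs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative list of parameters that do not change page content (an assumption).
TRACKING_PARAMS = {"ref", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url, strip_www=True, trailing_slash=False, lowercase_path=True):
    """Normalize a URL to one consistent canonical form (port, if any, is dropped)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if strip_www and host.startswith("www."):
        host = host[4:]
    path = parts.path or "/"
    if path != "/":
        # Apply a single trailing-slash policy everywhere.
        path = path.rstrip("/") + ("/" if trailing_slash else "")
    if lowercase_path:
        path = path.lower()
    # Keep meaningful parameters (such as page=2) and drop tracking ones.
    query = urlencode(
        [
            (key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in TRACKING_PARAMS
        ]
    )
    # Always emit https and drop any fragment.
    return urlunsplit(("https", host, path, query, ""))
```

With these defaults, `canonical_url("http://www.example.com/SEO-Guide/?ref=twitter")` yields `https://example.com/seo-guide`, while a pagination parameter such as `?page=2` is preserved so each page stays canonical to itself.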

6. Sitemap and Robot Instructions

| File | Purpose | Location | Format |
| --- | --- | --- | --- |
| sitemap.xml | List all pages for search engines | /sitemap.xml (root or submitted via Search Console) | XML format |
| robots.txt | Crawler instructions (allow/disallow) | /robots.txt (must be in root) | Plain text |
| sitemap index | Multiple sitemaps container | /sitemap_index.xml | XML format |

| Sitemap Element | Required | Description |
| --- | --- | --- |
| <loc> | ✅ Yes | Page URL (absolute, max 2048 chars) |
| <lastmod> | ❌ Optional | Last modification date (YYYY-MM-DD or ISO 8601) |
| <changefreq> | ❌ Optional | Update frequency (always, hourly, daily, weekly, monthly, yearly, never); ignored by Google |
| <priority> | ❌ Optional | Relative importance (0.0 to 1.0, default 0.5); ignored by Google |

| robots.txt Directive | Syntax | Purpose |
| --- | --- | --- |
| User-agent | User-agent: * | Target specific bots (* = all) |
| Disallow | Disallow: /admin/ | Block crawling of path |
| Allow | Allow: /public/ | Override disallow (for subdirectories) |
| Sitemap | Sitemap: https://example.com/sitemap.xml | Sitemap location |
| Crawl-delay | Crawl-delay: 10 | Seconds between requests (not supported by Google) |

Example: XML Sitemap (sitemap.xml)

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  
  <!-- Homepage -->
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2025-12-22</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  
  <!-- Blog post -->
  <url>
    <loc>https://example.com/blog/html-seo-guide</loc>
    <lastmod>2025-12-23T10:30:00+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
    
    <!-- Image extension (Google has deprecated image:title and image:caption, so only image:loc is listed) -->
    <image:image>
      <image:loc>https://example.com/images/seo-guide.jpg</image:loc>
    </image:image>
  </url>
  
  <!-- Product page -->
  <url>
    <loc>https://example.com/products/headphones</loc>
    <lastmod>2025-12-20</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.7</priority>
  </url>
  
  <!-- Static page -->
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2025-11-15</lastmod>
    <changefreq>yearly</changefreq>
    <priority>0.5</priority>
  </url>
  
</urlset>

Example: Sitemap Index for multiple sitemaps

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2025-12-23</lastmod>
  </sitemap>
  
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2025-12-22</lastmod>
  </sitemap>
  
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
    <lastmod>2025-12-20</lastmod>
  </sitemap>
  
</sitemapindex>

Example: robots.txt file

# Rules for all bots that have no more specific group below
# (anything not disallowed is crawlable by default)
User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /temp/
Disallow: /staging/
Disallow: /cgi-bin/

# Block specific file types
# (a /*.xml$ rule is left out because it would also block the sitemaps listed below)
Disallow: /*.json$

# Allow public files in otherwise blocked directory
Allow: /admin/public/

# Google-specific group
# Note: a bot that matches a specific group follows only that group, not the * rules.
# Crawl-delay is omitted here because Google does not support it.
User-agent: Googlebot
Disallow: /search
Allow: /search/about

# Block bad bots
User-agent: BadBot
Disallow: /

# Sitemap locations (apply regardless of user-agent group)
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-images.xml
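
Example: Optional sitemap link in HTML

<head>
  <!-- Optional and non-standard: search engines find sitemaps through robots.txt or webmaster tools, not this link -->
  <link rel="sitemap" type="application/xml"
        title="Sitemap" href="/sitemap.xml">
</head>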
Best Practices: Limit each sitemap file to 50,000 URLs and 50MB uncompressed; use a sitemap index for larger sites. Update the sitemap when content changes. Submit the sitemap to Google Search Console and Bing Webmaster Tools. robots.txt blocks crawling but doesn't prevent indexing (use a noindex meta tag for that, and leave the page crawlable so the tag can be read). A sitemap helps discovery but doesn't guarantee indexing.
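
The 50,000-URL limit above means large sites need their URL list split across several files. The following is a minimal sketch using Python's standard library; `build_sitemaps` is a hypothetical name, and the sketch writes only `<loc>` and `<lastmod>`, leaving out the byte-size check and the sitemap index file.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file limit from the sitemap protocol


def build_sitemaps(entries, max_urls=MAX_URLS):
    """entries: list of (loc, lastmod) pairs; lastmod may be None.

    Returns a list of XML strings, one per sitemap file.
    """
    files = []
    for start in range(0, len(entries), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for loc, lastmod in entries[start:start + max_urls]:
            url = ET.SubElement(urlset, "url")
            # ElementTree escapes characters such as "&" in URLs.
            ET.SubElement(url, "loc").text = loc
            if lastmod:
                ET.SubElement(url, "lastmod").text = lastmod
        body = ET.tostring(urlset, encoding="unicode")
        files.append('<?xml version="1.0" encoding="UTF-8"?>\n' + body)
    return files
```

Each returned string can be written to its own file (for example `sitemap-1.xml`, `sitemap-2.xml`, names chosen here for illustration) and referenced from a sitemap index like the one shown earlier.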

Section 13 Key Takeaways

  • Title (50-60 chars) and meta description (150-160 chars) are most important for SEO; unique per page
  • Use UTF-8 charset and viewport meta for mobile; robots meta controls indexing (index/noindex, follow/nofollow)
  • Open Graph (og:) tags control social media sharing; required: title, type, url, image (1200x630px)
  • Twitter Cards use the name attribute (not property) and fall back to Open Graph if Twitter tags are missing
  • JSON-LD is Google's preferred structured data format; use schema.org types (Article, Product, Organization)
  • Test structured data with Google Rich Results Test; keep data in sync with visible content
  • Canonical URLs prevent duplicate content issues; always use absolute URLs (https://)
  • Self-referencing canonical on all pages; use for URL parameters, pagination, mobile versions
  • sitemap.xml lists all pages (max 50k URLs per file); robots.txt controls crawler access (must be in root)
  • Submit sitemaps to Search Console; robots.txt blocks crawling, noindex meta prevents indexing