Setting Up Your AI Subdomain — LLM-LD

What is the AI subdomain?

The AI subdomain (ai.yoursite.com) is an optional but powerful part of Layer 1. It's a parallel version of your website that's optimized specifically for AI crawlers and agents — stripped of visual design, navigation clutter, and JavaScript complexity, leaving just pure structured content.

Think of it as "reader mode" for your website — but built for machines instead of humans.

When an AI crawler hits ai.yoursite.com, it finds:

  • Clean HTML with semantic structure
  • Rich Schema.org JSON-LD on every page
  • No ads, popups, cookie banners, or navigation menus
  • Fast load times (no heavy JavaScript frameworks)
  • Consistent, predictable page structure
💡 Is this required? No. You can achieve Layer 1 compliance by adding Schema.org markup to your main site. The AI subdomain is for those who want maximum AI readability without changing their main site's design or architecture.

Main site vs. AI subdomain

Here's what the same page might look like on your main site versus your AI subdomain:

yoursite.com/about

<!DOCTYPE html>
<html>
<head>
  <title>About Us | Acme Dental</title>
  <link rel="stylesheet" href="/css/main.css">
  <script src="/js/analytics.js"></script>
  <script src="/js/chat-widget.js"></script>
</head>
<body>
  <nav>...50 lines of navigation...</nav>
  <div class="hero-banner">...</div>
  <main>
    <h1>About Acme Dental</h1>
    <p>Content buried here...</p>
  </main>
  <footer>...100 lines of footer...</footer>
  <div class="cookie-banner">...</div>
</body>
</html>

ai.yoursite.com/about

<!DOCTYPE html>
<html>
<head>
  <title>About Us | Acme Dental</title>
  <script type="application/ld+json">
    { "@context": "https://schema.org",
      "@type": "AboutPage",
      "mainEntity": {
        "@type": "Dentist",
        "name": "Acme Dental",
        ...complete structured data...
      }
    }
  </script>
</head>
<body>
  <main>
    <h1>About Acme Dental</h1>
    <p>Clear, structured content...</p>
  </main>
</body>
</html>

Same content, dramatically different signal-to-noise ratio. An AI system can parse the subdomain version instantly.


Architecture options

There are several ways to implement an AI subdomain. Choose based on your technical setup and resources:

🔄 Reverse Proxy

Use Nginx, Cloudflare Workers, or similar to intercept requests to ai.yoursite.com, fetch content from your main site, strip unnecessary elements, and inject Schema.org.
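For instance, with Nginx the proxy skeleton might look like the sketch below. Hostnames are placeholders, the injected JSON-LD is a stub, and `sub_filter` requires Nginx built with `ngx_http_sub_module`; stripping navigation and ads would typically happen in an upstream transform rather than in Nginx itself.

```nginx
server {
    listen 443 ssl;
    server_name ai.yoursite.com;

    location / {
        # Fetch the original page from the main site
        proxy_pass https://yoursite.com;
        proxy_set_header Host yoursite.com;
        # sub_filter only works on uncompressed upstream responses
        proxy_set_header Accept-Encoding "";

        # Inject JSON-LD just before </head> (stub payload)
        sub_filter '</head>'
            '<script type="application/ld+json">{ ... }</script></head>';
        sub_filter_once on;
    }
}
```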

✓ No duplicate content to maintain
⚠ Requires server/edge configuration
📄 Static Generation

Build a static site generator that pulls content from your CMS/main site and generates AI-optimized HTML files. Deploy to any static host.

✓ Fast, cheap hosting (Netlify, Vercel, S3)
⚠ Needs rebuild when content changes
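As a concrete illustration of the static-generation approach, here is a minimal build script in Python. Everything in it — the `build_site` function, the inline template, and the sample page — is a sketch invented for this example; a real build would pull pages and Schema.org data from your CMS or main site.

```python
import json
from pathlib import Path

# Minimal inline template; a real build would use a full page template
TEMPLATE = """<!DOCTYPE html>
<html lang="en">
<head>
  <title>{title}</title>
  <script type="application/ld+json">{schema}</script>
</head>
<body><main><article><h1>{title}</h1>{content}</article></main></body>
</html>"""

def build_site(pages, out_dir):
    """Write one AI-optimized HTML file per page.

    Each page dict needs: path (e.g. "/about/"), title,
    content (an HTML string), and schema (a dict serialized as JSON-LD).
    """
    out = Path(out_dir)
    for page in pages:
        # Mirror the main site's URL structure: /about/ -> about/index.html
        rel = page["path"].strip("/") or "."
        target = out / rel / "index.html"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(
            TEMPLATE.format(
                title=page["title"],
                schema=json.dumps(page["schema"], indent=2),
                content=page["content"],
            ),
            encoding="utf-8",
        )

build_site(
    [{"path": "/about/", "title": "About Us", "content": "<p>Our story.</p>",
      "schema": {"@context": "https://schema.org", "@type": "AboutPage"}}],
    "ai-site",
)
```

Hooking this script to a CMS webhook gives you automatic rebuilds when content changes.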
🔌 CMS Plugin/Theme

If you use WordPress, Webflow, or similar — create a separate theme or use a plugin that renders AI-optimized versions at the ai. subdomain.

✓ Integrated with existing workflow
⚠ Platform-specific implementation
🖥️ Separate Application

Build a lightweight app that shares your database/API but renders AI-optimized templates. Good for complex sites with custom backends.

✓ Full control over output
⚠ Most development effort

For most sites, the reverse proxy or static generation approaches offer the best balance of effort and results.

🤝 Need help? Setting up an AI subdomain can be technical. If you'd rather have experts handle it, check out our certified partners who offer LLM-LD implementation services.


Step-by-step setup

Here's how to set up an AI subdomain using the static generation approach — the most portable method that works with any hosting setup.

Step 1: Configure DNS

Add a CNAME or A record for the ai subdomain pointing to your hosting provider.

DNS Records
# If using same server as main site:
ai.yoursite.com.    CNAME    yoursite.com.

# If using separate hosting (e.g., Netlify):
ai.yoursite.com.    CNAME    your-ai-site.netlify.app.

DNS changes can take up to 48 hours to propagate, though usually it's much faster.

Step 2: Create the page template

Design a minimal HTML template that will be used for all AI subdomain pages. Focus on semantic HTML and Schema.org placement.

template.html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>{{page.title}} — {{site.name}}</title>
  <meta name="description" content="{{page.description}}">

  <!-- Canonical points to main site -->
  <link rel="canonical" href="https://yoursite.com{{page.path}}">

  <!-- Schema.org JSON-LD -->
  <script type="application/ld+json">
    {{page.schema_json}}
  </script>

  <!-- Minimal styling for readability -->
  <style>
    body {
      font-family: system-ui, sans-serif;
      max-width: 800px;
      margin: 0 auto;
      padding: 20px;
      line-height: 1.6;
    }
    h1 { margin-bottom: 0.5em; }
  </style>
</head>
<body>
  <main>
    <article>
      <h1>{{page.title}}</h1>
      {{page.content}}
    </article>
  </main>

  <!-- Link back to main site -->
  <footer>
    <p>View this page on our main site:
      <a href="https://yoursite.com{{page.path}}">yoursite.com{{page.path}}</a>
    </p>
  </footer>
</body>
</html>

Key elements:

  • Canonical tag — Points to the main site to avoid duplicate content issues
  • Schema.org JSON-LD — The structured data, injected per page
  • Minimal styling — Just enough CSS for human readability if someone visits directly
  • Semantic HTML — <main>, <article>, proper headings
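The `{{…}}` placeholders can be resolved by a very small renderer. This sketch shows the idea; the `render` helper and its dotted-key lookup are assumptions for illustration, not a prescribed LLM-LD API.

```python
import re

def render(template: str, context: dict) -> str:
    """Replace {{dotted.path}} placeholders with values from nested dicts."""
    def lookup(match):
        value = context
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError on unknown placeholders
        return str(value)
    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)

# Example: fill the title line of the template
print(render("<title>{{page.title}} | {{site.name}}</title>",
             {"page": {"title": "About Us"}, "site": {"name": "Acme Dental"}}))
# -> <title>About Us | Acme Dental</title>
```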
Step 3: Generate Schema.org for each page

Each page needs appropriate Schema.org markup based on its content type. Here are templates for common page types:

Homepage

Homepage Schema
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",  // or Organization, etc.
  "name": "Your Business Name",
  "description": "What you do",
  "url": "https://yoursite.com",
  "telephone": "+1-555-123-4567",
  "address": { ... },
  "openingHoursSpecification": [ ... ],
  "potentialAction": {
    "@type": "ReadAction",
    "target": "https://yoursite.com/ai-discovery"
  }
}

Service Page

Service Page Schema
{
  "@context": "https://schema.org",
  "@type": "Service",
  "name": "Teeth Whitening",
  "description": "Professional teeth whitening service...",
  "provider": {
    "@type": "LocalBusiness",
    "name": "Acme Dental"
  },
  "areaServed": "Springfield, IL",
  "offers": {
    "@type": "Offer",
    "price": "299",
    "priceCurrency": "USD"
  }
}

Team/Person Page

Person Page Schema
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Dr. Jane Smith",
  "jobTitle": "Lead Dentist",
  "description": "Pediatric dentistry specialist with 15 years...",
  "worksFor": {
    "@type": "LocalBusiness",
    "name": "Acme Dental"
  },
  "hasCredential": [
    {
      "@type": "EducationalOccupationalCredential",
      "name": "DDS"
    }
  ]
}
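Hand-writing these JSON-LD blocks invites drift across pages; a small generator per page type keeps them consistent. A Python sketch for service pages — the function name, parameters, and USD default are assumptions for illustration:

```python
import json

def service_schema(name, description, provider, price=None, currency="USD"):
    """Build Schema.org JSON-LD for a service page."""
    schema = {
        "@context": "https://schema.org",
        "@type": "Service",
        "name": name,
        "description": description,
        "provider": {"@type": "LocalBusiness", "name": provider},
    }
    # Offers are optional: only include them when a price is known
    if price is not None:
        schema["offers"] = {
            "@type": "Offer",
            "price": str(price),
            "priceCurrency": currency,
        }
    return schema

# The serialized form drops straight into the template's JSON-LD slot
print(json.dumps(
    service_schema("Teeth Whitening", "Professional teeth whitening service",
                   "Acme Dental", price=299),
    indent=2,
))
```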
Step 4: Build your site structure

Mirror your main site's URL structure so crawlers can easily map between them:

ai.yoursite.com/
├── index.html              # Homepage
├── about/
│   └── index.html          # About page
├── services/
│   ├── index.html          # Services listing
│   ├── whitening.html      # Individual service
│   └── implants.html
├── team/
│   ├── index.html          # Team listing
│   └── dr-smith.html
├── contact.html
├── robots.txt              # AI-friendly robots.txt
├── .well-known/
│   └── llm-index.json      # If using Layer 3
└── sitemap.xml

Note: Your AI Discovery Page (/ai-discovery) lives on your main site, not the AI subdomain. It's the bridge that connects your human-facing site to the LLM Disco Network — AI crawlers blocked from your main site are still allowed to access it.

Step 5: Configure robots.txt for both sites

This is where the magic happens. Your main site blocks AI crawlers and directs them to the AI subdomain. Your AI subdomain welcomes AI crawlers but blocks traditional search engines (to avoid duplicate content issues).

Main site robots.txt

yoursite.com/robots.txt
# Traditional search engines - welcome
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

# AI crawlers - go to the AI subdomain instead
# (the AI Discovery Page stays accessible to them)
User-agent: GPTBot
Allow: /ai-discovery
Disallow: /

User-agent: ChatGPT-User
Allow: /ai-discovery
Disallow: /

User-agent: Claude-Web
Allow: /ai-discovery
Disallow: /

User-agent: Anthropic
Allow: /ai-discovery
Disallow: /

User-agent: PerplexityBot
Allow: /ai-discovery
Disallow: /

User-agent: Cohere-ai
Allow: /ai-discovery
Disallow: /

User-agent: *
Allow: /ai-discovery

Sitemap: https://yoursite.com/sitemap.xml

AI subdomain robots.txt

ai.yoursite.com/robots.txt
# AI crawlers - welcome!
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: Anthropic
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Cohere-ai
Allow: /

# Traditional search engines - please use main site
# (canonical tags handle this, but belt and suspenders)
User-agent: Googlebot
Disallow: /

User-agent: Bingbot
Disallow: /

User-agent: *
Disallow: /

Sitemap: https://ai.yoursite.com/sitemap.xml

This creates a clean separation: humans and traditional search engines use your main site, AI systems and agents use your AI subdomain. The AI Discovery Page on your main site serves as the bridge — it's allowed for all crawlers and links to both the AI subdomain and the LLM Disco Network.
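Because the two files are easy to swap by accident, it pays to test them. Python's standard-library `urllib.robotparser` can check both policies; the robots.txt strings below are simplified versions of the files above:

```python
from urllib.robotparser import RobotFileParser

def can_fetch(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if robots_txt allows `agent` to fetch `path`."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, path)

MAIN_SITE = """\
User-agent: GPTBot
Allow: /ai-discovery
Disallow: /

User-agent: *
Allow: /
"""

AI_SUBDOMAIN = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /
"""

# Main site: AI crawlers blocked, except the AI Discovery Page
assert not can_fetch(MAIN_SITE, "GPTBot", "/about")
assert can_fetch(MAIN_SITE, "GPTBot", "/ai-discovery")
assert can_fetch(MAIN_SITE, "Googlebot", "/about")

# AI subdomain: AI crawlers welcome, everyone else blocked
assert can_fetch(AI_SUBDOMAIN, "GPTBot", "/about")
assert not can_fetch(AI_SUBDOMAIN, "Googlebot", "/about")
```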

Step 6: Deploy and verify

Deploy your AI subdomain and verify everything works:

  • Visit ai.yoursite.com in a browser — pages should load with minimal styling
  • Check the page source — Schema.org JSON-LD should be present in the <head>
  • Use Google's Rich Results Test or the Schema.org validator (validator.schema.org) to verify your markup
  • Test that canonical tags point correctly to your main site
  • Verify yoursite.com/robots.txt blocks AI crawlers
  • Verify ai.yoursite.com/robots.txt welcomes AI crawlers
  • Ensure your AI Discovery Page (on main site) links to llmdisco.com
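The Schema.org check in this list can be automated with the Python standard library alone. The class and function names here are illustrative; point it at saved page source (or fetch pages with `urllib.request`) and assert that every page yields at least one parseable JSON-LD block:

```python
import json
from html.parser import HTMLParser

class JSONLDExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""
    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_jsonld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self.in_jsonld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_jsonld = False

def extract_jsonld(html: str):
    """Return each JSON-LD payload as parsed JSON (raises if invalid)."""
    parser = JSONLDExtractor()
    parser.feed(html)
    return [json.loads(block) for block in parser.blocks]

page = ('<html><head><script type="application/ld+json">'
        '{"@context": "https://schema.org", "@type": "AboutPage"}'
        '</script></head><body></body></html>')
assert extract_jsonld(page)[0]["@type"] == "AboutPage"
```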

Maintenance

Your AI subdomain needs to stay in sync with your main site. Depending on your architecture:

  • Reverse proxy: Automatic — changes flow through in real-time
  • Static generation: Set up a build trigger when content changes (webhook from CMS, scheduled rebuild, etc.)
  • CMS plugin: Usually automatic with the same content database
🔄 Tip: Since traditional search engines are blocked from your AI subdomain, you won't see it in Google Search Console. That's expected. Monitor your main site's Search Console for any issues with the AI Discovery Page.

Common mistakes to avoid

  • Forgetting the canonical tag: Without it, search engines may see duplicate content
  • Inconsistent URLs: Keep URL structure identical between main site and AI subdomain
  • Missing pages: If a page exists on the main site, it should exist on the AI subdomain
  • Stale content: Set up automated syncing so AI subdomain doesn't fall behind
  • Swapped robots.txt: Double-check that main site blocks AI and AI subdomain welcomes AI — easy to mix up
  • No AI Discovery Page: The ADP is the bridge that connects everything — don't skip it

Ready for more?

The AI subdomain is part of Layer 1. Continue building with Layer 2 (entities) and Layer 3 (llm-index.json).