As a technical founder, when I discovered our company had zero visibility in ChatGPT, I did what any developer would do: I went deep on the technical implementation.
Over six weeks, I evaluated 47 agencies claiming to offer "GEO" (Generative Engine Optimization) services. I asked for their technical architecture, reviewed their codebase approaches, and tested their methodologies.
Spoiler: Most were selling rebranded SEO with zero understanding of how LLMs actually work.
But about 8 of them had legitimate technical chops. Here's what I learned about the actual tech stack behind effective AI search optimization.
The Technical Foundation: What Actually Matters
- Structured Data Implementation (Critical) This is where most agencies failed the technical test. The Question I Asked: "Walk me through your schema.org implementation strategy." Bad Answers (31 agencies): javascript// What they actually did { "@context": "<a href="https://schema.org">https://schema.org</a>", "@type": "Organization", "name": "Company Name" } That's it. Bare minimum Organization schema with no depth. Good Answers (8 agencies): javascript// What actually works for GEO { "@context": "<a href="https://schema.org">https://schema.org</a>", "@type": "Organization", "name": "Company Name", "url": "<a href="https://example.com">https://example.com</a>", "logo": "<a href="https://example.com/logo.png">https://example.com/logo.png</a>", "sameAs": [ "<a href="https://twitter.com/company">https://twitter.com/company</a>", "<a href="https://linkedin.com/company/company">https://linkedin.com/company/company</a>", "<a href="https://github.com/company">https://github.com/company</a>" ], "contactPoint": { "@type": "ContactPoint", "telephone": "+1-XXX-XXX-XXXX", "contactType": "customer service" }, "address": { "@type": "PostalAddress", "streetAddress": "123 Main St", "addressLocality": "City", "addressRegion": "State", "postalCode": "12345", "addressCountry": "US" } }
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "What is your primary service?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Detailed answer with entities and context..."
}
}
// 50-100 more FAQs
]
}
The Technical Difference:
Comprehensive entity relationships (sameAs for cross-platform validation)
Nested structured data (ContactPoint, PostalAddress)
FAQPage schema with extensive Q&A coverage
Product/Service schema with detailed attributes
Review schema with aggregate ratings
Validation Stack:
bash# Tools that actually matter
- Google Rich Results Test
- Schema.org Validator
- JSON-LD Playground
- Structured Data Linter (custom build)
- The llms.txt File (Emerging Standard) Only 3 out of 47 agencies even knew what this was. What it is: A file at your root domain that tells AI crawlers about your site structure. txt# llms.txt # https://yoursite.com/llms.txt
Company Information
Organization: Company Name
Industry: B2B SaaS
Founded: 2020
Location: San Francisco, CA
Primary Services
- Service 1: Description with entities
- Service 2: Description with entities
- Service 3: Description with entities
Key Content URLs
Main Site: https://yoursite.com
Documentation: https://docs.yoursite.com
Blog: https://yoursite.com/blog
Case Studies: https://yoursite.com/case-studies
Entity Relationships
Wikipedia: https://en.wikipedia.org/wiki/Company_Name
Crunchbase: https://crunchbase.com/company
LinkedIn: https://linkedin.com/company/company-name
Structured Data Endpoints
Schema: https://yoursite.com/schema.json
Sitemap: https://yoursite.com/sitemap.xml
Implementation:
javascript// Express.js middleware
app.get('/llms.txt', (req, res) => {
res.type('text/plain');
res.sendFile(__dirname + '/public/llms.txt');
});
Impact: Early data suggests 15-20% better citation accuracy from LLMs that support this standard.
- Entity Consolidation Architecture The Technical Challenge: AI platforms need to understand that: yourcompany.com === @yourcompany === Your Company Inc. === "Your Company" Bad Approach (Most Agencies): Hope for the best, no systematic consolidation. Good Approach (8 Agencies): javascript// Systematic NAP (Name, Address, Phone) consistency const entityData = { name: "Exact Company Name Inc.", // Never varies address: "123 Main Street, Suite 100, San Francisco, CA 94102", phone: "+1-415-555-0123", email: "contact@company.com", socialHandles: { twitter: "@exacthandle", linkedin: "company/exact-name", github: "exact-org-name" } };
// Used consistently across:
// - Schema.org markup
// - robots.txt
// - llms.txt
// - All social profiles
// - Directory listings
// - Press releases
Validation Script:
python# entity_consistency_checker.py
import requests
from bs4 import BeautifulSoup
import json
def check_entity_consistency(urls):
entities = []
for url in urls:
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
# Extract schema.org data
scripts = soup.find_all('script', type='application/ld+json')
for script in scripts:
data = json.loads(script.string)
if '@type' in data and data['@type'] == 'Organization':
entities.append({
'source': url,
'name': data.get('name'),
'url': data.get('url'),
'address': data.get('address')
})
Check for inconsistencies
names = set(e['name'] for e in entities if 'name' in e)
if len(names) > 1:
print(f"⚠️ Inconsistent names found: {names}")
else:
print(f"✅ Entity name consistent: {names.pop()}")
Usage
urls = [
'https://yoursite.com',
'https://yoursite.com/about',
'https://yoursite.com/contact'
]
check_entity_consistency(urls)
- Semantic HTML Structure
LLMs parse HTML better than humans. Structure matters.
Bad HTML (What Most Sites Have):
html
What is your service?
We provide XYZ service.
Good HTML (What Works for GEO):
html
What is your service?
We provide XYZ service, which helps entities achieve specific outcomes through methodologies.
Key Technical Principles:
Semantic HTML5 tags (, , )
Microdata attributes (itemprop, itemscope, itemtype)
Proper heading hierarchy (H1 → H2 → H3, no skipping)
Descriptive class names (.faq-question vs .q)
Meaningful alt text on images (not keyword stuffing)
- API-First Content Architecture The Problem: Static content ages poorly for AI search (especially DeepSeek, which heavily favors recency). The Solution: Headless CMS with dynamic content injection. javascript// Next.js example with dynamic content import { useState, useEffect } from 'react';
export default function FAQPage() {
const [faqs, setFaqs] = useState([]);
const [lastUpdated, setLastUpdated] = useState(null);
useEffect(() => {
// Fetch from headless CMS
fetch('/api/faqs')
.then(res => res.json())
.then(data => {
setFaqs(data.faqs);
setLastUpdated(data.lastUpdated);
});
}, []);
return (
{faqs.map(faq => (
))}
);
}
Benefits:
Easy content updates (no redeployment)
Automatic "Last Modified" timestamps
A/B testing content for AI optimization
Dynamic schema generation
- Sitemap Optimization for AI Crawlers Standard XML sitemaps aren't enough anymore. Enhanced Sitemap Strategy: xml<?xml version="1.0" encoding="UTF-8"?> https://yoursite.com/important-page 2026-02-06T10:00:00+00:00 weekly 1.0 <!-- AI-specific metadata --> news:news news:publication_date2026-02-06T10:00:00Z/news:publication_date news:titleExact Page Title/news:title /news:news Plus, separate sitemaps:
/sitemap-articles.xml (blog content)
/sitemap-faqs.xml (FAQ pages - critical for GEO)
/sitemap-products.xml (product/service pages)
/sitemap-images.xml (image optimization)
- Performance Metrics That Actually Correlate with AI Citations After analyzing our data and the 8 successful agencies, here are the technical metrics that correlate with AI visibility: javascript// Metrics that matter for GEO const geoMetrics = { // Critical schemaValidationScore: 100, // Must be perfect faqPageCount: 50, // Minimum for meaningful coverage entityConsistency: 100, // Across all platforms
// Important
firstContentfulPaint: 1.2, // seconds (< 1.5s target)
timeToInteractive: 2.8, // seconds (< 3.0s target)
cumulativeLayoutShift: 0.05, // (< 0.1 target)
// Nice to have
structuredDataCoverage: 85, // % of pages with schema
internalLinkDensity: 3.2, // links per 1000 words
semanticKeywordDensity: 2.1 // % (entity-focused)
};
Monitoring Stack:
bash# Technical monitoring for GEO
- Lighthouse CI (automated performance testing)
- Schema.org Validator (automated checking)
- Custom AI query testing (ChatGPT API + Selenium)
- Entity consistency monitoring (custom Python script)
- Structured data change detection (git diff + alerts)
- The Testing Framework Nobody Uses (But Should) Here's how I tested agencies' technical competency: python# ai_visibility_tester.py import openai from anthropic import Anthropic import google.generativeai as genai
class AIVisibilityTester:
def init(self, company_name, test_queries):
self.company_name = company_name
self.test_queries = test_queries
self.results = {
'chatgpt': [],
'claude': [],
'gemini': []
}
def test_chatgpt(self, query):
response = openai.ChatCompletion.create(
model="gpt-4",
messages=[{"role": "user", "content": query}]
)
return self.company_name.lower() in response.choices[0].message.content.lower()
def test_claude(self, query):
anthropic = Anthropic()
response = anthropic.messages.create(
model="claude-3-5-sonnet-20241022",
messages=[{"role": "user", "content": query}]
)
return self.company_name.lower() in response.content[0].text.lower()
def run_full_test(self):
for query in self.test_queries:
self.results['chatgpt'].append(self.test_chatgpt(query))
self.results['claude'].append(self.test_claude(query))
# Calculate citation rates
citation_rate = {
'chatgpt': sum(self.results['chatgpt']) / len(self.results['chatgpt']) * 100,
'claude': sum(self.results['claude']) / len(self.results['claude']) * 100
}
return citation_rate
Usage
tester = AIVisibilityTester(
company_name="YourCompany",
test_queries=[
"best CRM for real estate",
"top project management tools for startups",
"which accounting software should I use"
]
)
results = tester.run_full_test()
print(f"ChatGPT citation rate: {results['chatgpt']}%")
print(f"Claude citation rate: {results['claude']}%")
Run this monthly to track actual progress, not vanity metrics.
The Technical Stack That Actually Worked
After implementing learnings from the best 8 agencies, here's our production stack:
yaml# Frontend
Framework: Next.js 14 (App Router)
CMS: Contentful (headless)
Styling: Tailwind CSS
Deployment: Vercel
Schema Management
Generator: Custom React component
Validation: Automated via GitHub Actions
Storage: Git-tracked JSON files
Monitoring
Performance: Lighthouse CI
Schema: Custom validator (Python)
AI Testing: Weekly automated queries
Uptime: UptimeRobot
Content Pipeline
Writing: Human + AI-assisted
Editing: Human review
Schema: Auto-generated from content
Deployment: Continuous (via git push)
Analytics
Traditional: Google Analytics 4
AI-specific: Custom dashboard (Retool)
Citation tracking: Weekly manual + automated tests
The Results (Technical Proof)
Before Optimization:
bash$ python ai_visibility_tester.py
ChatGPT citation rate: 0%
Claude citation rate: 0%
Gemini citation rate: 0%
After 4 Months:
bash$ python ai_visibility_tester.py
ChatGPT citation rate: 47%
Claude citation rate: 38%
Gemini citation rate: 63%
Perplexity citation rate: 73%
Technical Improvements:
Schema validation score: 45% → 100%
FAQ page count: 3 → 87
Structured data coverage: 12% → 94%
Entity consistency: 67% → 100%
Core Web Vitals: Failed → Passed (all metrics)
Business Impact:
AI-attributed traffic: +340%
Qualified leads from AI: 83 in 4 months
Revenue from AI sources: $340K+
What Most Agencies Get Wrong (Technical Edition)
- They Bolt Schema Onto Existing Sites Wrong Approach: javascript// Adding schema as an afterthought // Hardcoded JSON-LD Right Approach: javascript// Schema as first-class citizen in component architecture export default function ProductPage({ product }) { const schema = generateProductSchema(product);
return (
<>
type="application/ld+json"<br>
dangerouslySetInnerHTML={{ __html: JSON.stringify(schema) }}<br>
/><br>
</Head><br>
<ProductDetails product={product} /><br>
&lt;/&gt;<br>
);<br>
}</p>
<ol>
<li>They Ignore Performance
LLMs favor fast sites. Period.
The Data:</li>
</ol>
<p>Sites <1.5s FCP: 3.2x higher citation rate<br>
Sites >3.0s FCP: 40% lower citation rate</p>
<p>Fix:<br>
javascript// Image optimization example<br>
import Image from 'next/image';</p>
<p>// Before (wrong)<br>
<img src="/hero.jpg" alt="Hero" /></p>
<p>// After (right)<br>
<Image<br>
src="/hero.jpg"<br>
alt="Descriptive, entity-rich alt text"<br>
width={1200}<br>
height={600}<br>
priority<br>
placeholder="blur"<br>
/></p>
<ol>
<li>They Use Generic Content
AI platforms favor specificity, entities, and data.
Generic (doesn't work):
markdownWe offer great services to help businesses grow.
Specific (works):
markdownOur B2B SaaS platform helps mid-market companies ($10M-$100M revenue)
in the healthcare vertical reduce customer acquisition costs by an
average of 23% through AI-driven lead scoring, automated nurture
campaigns, and predictive churn analysis.
Open Source Tools I Built
Since most agencies had inadequate tooling, I built my own:</li>
<li>GEO Schema Validator
bashnpm install -g geo-schema-validator
geo-validate <a href="https://yoursite.com">https://yoursite.com</a></li>
<li>AI Citation Tracker
bashpip install ai-citation-tracker
ai-track --site yoursite.com --queries queries.txt
Both available on GitHub
Recommendations for Developers
If you're implementing GEO yourself:</li>
</ol>
<p>Start with Schema.org coverage - 80%+ of your pages need it<br>
Build FAQ content systematically - Target 50-100 question/answer pairs<br>
<a href="https://digimsm.com/marketing-automation/">Automate entity</a> consistency checking - Don't do this manually<br>
Set up automated AI testing - Weekly queries across platforms<br>
Optimize for performance - Core Web Vitals matter for AI<br>
Use semantic HTML - It's not 2010 anymore, divs aren't enough</p>
<p>If you're hiring an agency:<br>
Ask to see their:</p>
<p>Schema implementation approach (code samples)<br>
Testing methodology (scripts, automation)<br>
Entity consolidation process (technical documentation)<br>
Performance optimization stack (tools, metrics)</p>
<p>If they can't provide these, they're not technically competent enough for GEO.<br>
Full Technical Breakdown<br>
I've documented the complete technical architecture, including code samples, configuration files, and testing frameworks in my <a href="https://medium.com/@msmyaqoob55/finding-the-right-geo-agency-what-i-learned-after-vetting-47-ai-optimization-companies-6c424b8064db">detailed Medium article</a>.<br>
Questions?<br>
Drop them in the comments. I'm actively monitoring and happy to share specific code samples, configuration files, or architectural decisions.</p>
Top comments (0)