GEO glossary
Plain-English definitions of the terms you'll keep running into when working on generative engine optimisation. Linkable, structured, and AI-readable.
Core concepts
- GEO
- Generative Engine Optimisation. The practice of helping AI search and answer engines crawl, understand, trust and cite a website.
- AEO
- Answer Engine Optimisation. Often used interchangeably with GEO; emphasises optimising for systems that return synthesised answers rather than link lists.
- Generative engine
- Any system that produces a synthesised, natural-language answer from web content – ChatGPT, Claude, Perplexity, Gemini, Google AI Overviews, Copilot.
- Answer engine
- A search interface that returns answers rather than ranked links. Often built on top of a generative engine.
- Citation readiness
- How easily a generative engine can attribute a specific piece of information to your website. Driven by schema, clarity, authorship and accessible content.
- Entity
- A distinct thing the web describes – a company, person, product, place. Generative engines reason about entities, not just keywords.
AI crawlers
- GPTBot
- OpenAI's web crawler, used to collect content that may be used to train its models. Controlled with a User-agent: GPTBot group in robots.txt.
- ChatGPT-User
- OpenAI's live-fetch agent. Activates when a ChatGPT user requests a specific URL during a conversation.
- OAI-SearchBot
- OpenAI's crawler powering ChatGPT's search experience.
- ClaudeBot
- Anthropic's training crawler for Claude.
- Claude-Web
- An older Anthropic user-agent token associated with fetching pages on behalf of a Claude user. Anthropic's current documentation lists Claude-User for that role.
- PerplexityBot
- Perplexity's index crawler.
- Perplexity-User
- Perplexity's live fetch agent for user-initiated queries.
- Google-Extended
- A robots.txt token Google honours specifically to opt out of generative training, distinct from regular Googlebot.
- Applebot-Extended
- Apple's equivalent opt-out token for Apple Intelligence training.
- CCBot
- Common Crawl's bot. Common Crawl is widely used as a training source by many AI providers.
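As a rough illustration of how these tokens interact with robots.txt, the sketch below uses Python's standard urllib.robotparser to report which of the agents above may fetch a given URL. The robots.txt content, the example.com URL and the crawl_permissions helper are all hypothetical, and the standard-library parser only approximates how each vendor's crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of three training-related tokens,
# allow every other agent.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

# Tokens taken from the glossary entries above.
AI_TOKENS = [
    "GPTBot",
    "ChatGPT-User",
    "OAI-SearchBot",
    "ClaudeBot",
    "PerplexityBot",
    "Google-Extended",
    "Applebot-Extended",
    "CCBot",
]


def crawl_permissions(robots_txt: str, url: str) -> dict[str, bool]:
    """Return, for each token, whether the rules allow fetching `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {token: parser.can_fetch(token, url) for token in AI_TOKENS}


permissions = crawl_permissions(ROBOTS_TXT, "https://example.com/pricing")
for token, allowed in permissions.items():
    print(f"{token}: {'allowed' if allowed else 'blocked'}")
```

Running the same check against a site's real robots.txt text is a quick way to confirm that an opt-out blocks only the agents it was meant to.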
Files and standards
- robots.txt
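- A plain-text file at the site root that tells crawlers which paths they may or may not access. Compliance is voluntary: the major AI crawlers state that they honour it, though agents that fetch a page at a user's request may not.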
- sitemap.xml
- An XML index of canonical URLs that helps crawlers discover content.
- llms.txt
- A proposed Markdown file at the site root summarising key pages for LLM crawlers. Adoption is emerging.
- llms-full.txt
- An optional companion to llms.txt that includes the full text of the listed pages.
- ai.txt
- A separate proposed convention for declaring AI training permissions. Less widely adopted than robots.txt.
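To make the llms.txt entry more concrete, here is a minimal sketch that assembles a file in the shape the proposal describes: a title heading, a short blockquote summary, then sections of annotated links. The build_llms_txt helper, the site name and every URL are hypothetical, and because the convention is still a proposal the layout may change.

```python
def build_llms_txt(site_name: str, summary: str,
                   sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Assemble llms.txt content from (title, url, note) link tuples."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            # One Markdown link per key page, followed by a short note.
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site content, for illustration only.
llms_txt = build_llms_txt(
    "Example Co",
    "Example Co sells accounting software for small businesses.",
    {
        "Docs": [
            ("Pricing", "https://example.com/pricing", "Plans and prices"),
            ("API guide", "https://example.com/docs/api", "REST API overview"),
        ],
    },
)
print(llms_txt)
```

The resulting text would be served at /llms.txt alongside robots.txt and sitemap.xml.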
Structured data
- Structured data
- Machine-readable markup that explicitly describes entities, relationships and content types.
- JSON-LD
- JavaScript Object Notation for Linked Data. The format Google recommends for embedding schema.org markup in web pages.
- Schema markup
- Vocabulary from schema.org used to describe Organisations, Articles, Products, FAQs and more.
- Organization schema
- JSON-LD describing a business as an entity – name, url, logo, sameAs, contactPoint.
- FAQPage schema
- JSON-LD describing question-and-answer pairs. Each pair is self-contained, which makes it easy for AI engines to extract.
- Product schema
- JSON-LD describing a product – price, availability, brand, ratings – essential for ecommerce GEO.
- Article schema
- JSON-LD describing editorial content – headline, datePublished, author, mainEntityOfPage.
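The schema types above can be sketched as plain dictionaries serialised into a JSON-LD script tag. The property names (name, url, logo, sameAs, contactPoint, mainEntity, acceptedAnswer) come from schema.org; the json_ld_script helper, the company details and the URLs are hypothetical placeholders.

```python
import json


def json_ld_script(data: dict) -> str:
    """Serialise a schema.org object into a JSON-LD script tag."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical organisation, for illustration only.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://example.com",
    "logo": "https://example.com/logo.png",
    "sameAs": ["https://www.linkedin.com/company/example-co"],
    "contactPoint": {
        "@type": "ContactPoint",
        "contactType": "customer support",
        "email": "support@example.com",
    },
}

faq_page = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What is GEO?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "Generative Engine Optimisation.",
            },
        }
    ],
}

org_tag = json_ld_script(organization)
faq_tag = json_ld_script(faq_page)
print(org_tag)
```

Each tag would be placed in the HTML the server sends, so the markup is available to crawlers that do not run JavaScript.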
Trust and content
- E-E-A-T
- Experience, Expertise, Authoritativeness, Trustworthiness. Google's framework for evaluating content quality; useful as a GEO heuristic too.
- sameAs
- A schema.org property used to link an entity to its verified profiles elsewhere – LinkedIn, Wikipedia, Crunchbase, Companies House.
- Server-side rendering
- Producing meaningful HTML on the server before it reaches the browser. Most AI crawlers read the HTML they are served and do not execute JavaScript, so content that only appears after client-side rendering may never be seen.
- Hallucination
- When a generative engine fabricates a fact. Strong structured data and clear authorship reduce the chance of being misquoted.
- Ground truth
- Verifiable, source-of-truth content an engine can cite. Your job in GEO is to be that source.
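The server-side rendering entry can be illustrated by checking what text is present in the served HTML when no JavaScript runs. This is a simplified sketch using Python's standard html.parser; the ServedText class, the text_without_js helper and both sample pages are hypothetical, and real crawlers differ in how they process markup.

```python
from html.parser import HTMLParser


class ServedText(HTMLParser):
    """Collect text from HTML, ignoring script and style contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside script/style elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def text_without_js(html: str) -> str:
    """Return the text a non-JavaScript client could read from `html`."""
    parser = ServedText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical pages: one rendered on the server, one client-side shell.
SSR_PAGE = "<html><body><h1>Pricing</h1><p>Plans start at 10 GBP.</p></body></html>"
CSR_SHELL = (
    '<html><body><div id="root"></div>'
    '<script>render("Plans start at 10 GBP.")</script></body></html>'
)

ssr_text = text_without_js(SSR_PAGE)
csr_text = text_without_js(CSR_SHELL)
print(repr(ssr_text))
print(repr(csr_text))
```

In this sketch the server-rendered page exposes its pricing sentence, while the client-side shell exposes no text at all until a browser runs the script.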
Products and surfaces
- AI Overviews
- Google's generative answer feature that summarises content above traditional search results.
- ChatGPT search
- OpenAI's web-aware search experience inside ChatGPT, powered partly by OAI-SearchBot.
- Perplexity
- An answer engine that synthesises responses with inline citations. Heavy emphasis on source attribution.
- Copilot
- Microsoft's generative assistant, integrated with Bing search results.
