
GEO glossary

Plain-English definitions of the terms you'll keep running into when working on generative engine optimisation. Linkable, structured, and AI-readable.

Core concepts

GEO
Generative Engine Optimisation. The practice of helping AI search and answer engines crawl, understand, trust and cite a website.
AEO
Answer Engine Optimisation. Often used interchangeably with GEO; emphasises optimising for systems that return synthesised answers rather than link lists.
Generative engine
Any system that produces a synthesised, natural-language answer from web content – ChatGPT, Claude, Perplexity, Gemini, Google AI Overviews, Copilot.
Answer engine
A search interface that returns answers rather than ranked links. Often built on top of a generative engine.
Citation readiness
How easily a generative engine can attribute a specific piece of information to your website. Driven by schema, clarity, authorship and accessible content.
Entity
A distinct thing the web describes – a company, person, product, place. Generative engines reason about entities, not just keywords.

AI crawlers

GPTBot
OpenAI's crawler for collecting web content that may be used to train its models. Controlled in robots.txt with a User-agent: GPTBot group.
ChatGPT-User
OpenAI's live-fetch agent. Activates when a ChatGPT user requests a specific URL during a conversation.
OAI-SearchBot
OpenAI's crawler powering ChatGPT's search experience.
ClaudeBot
Anthropic's training crawler for Claude.
Claude-Web
A token associated with Anthropic's live retrieval, which fetches pages on behalf of a Claude user. Anthropic's current crawler documentation names this agent Claude-User.
PerplexityBot
Perplexity's index crawler.
Perplexity-User
Perplexity's live fetch agent for user-initiated queries.
Google-Extended
A robots.txt token Google honours specifically to opt out of generative AI training. It is a control token rather than a separate crawler, and blocking it does not stop regular Googlebot from crawling.
Applebot-Extended
Apple's equivalent opt-out token for Apple Intelligence training.
CCBot
Common Crawl's bot. Common Crawl's open web dataset is widely used as a training source by AI providers.
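
To see how these user-agent tokens interact with robots.txt rules, the following is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content, the example.com URL and the access_report helper are all hypothetical, chosen only for illustration; real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block training crawlers, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

# Tokens taken from the glossary entries above.
AI_AGENTS = [
    "GPTBot",
    "ChatGPT-User",
    "OAI-SearchBot",
    "ClaudeBot",
    "PerplexityBot",
    "Google-Extended",
    "CCBot",
]


def access_report(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return, for each agent token, whether the rules permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network call
    return {agent: parser.can_fetch(agent, url) for agent in agents}


report = access_report(ROBOTS_TXT, "https://example.com/pricing", AI_AGENTS)
for agent, allowed in report.items():
    print(f"{agent:16} {'allowed' if allowed else 'blocked'}")
```

Under these assumed rules the training-oriented tokens (GPTBot, CCBot, Google-Extended) are blocked while the live-fetch and search agents fall through to the wildcard group and remain allowed.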

Files and standards

robots.txt
A plain-text file at the site root that tells crawlers which paths they may or may not access. Compliance is voluntary: most major AI crawlers state that they honour it, but nothing enforces it.
sitemap.xml
An XML index of canonical URLs that helps crawlers discover content.
llms.txt
A proposed Markdown file at the site root summarising key pages for LLM crawlers. Adoption is emerging.
llms-full.txt
An optional companion to llms.txt that includes the full text of the listed pages.
ai.txt
A separate proposed convention for declaring AI training permissions. Less widely adopted than robots.txt.
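
As a rough illustration of what an llms.txt file might contain, here is a minimal sketch that assembles one in Python. It assumes the layout described in the public llms.txt proposal (a top-level title, a short blockquote summary, then sections of Markdown links); the PageLink type, the build_llms_txt function and the example.com pages are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class PageLink:
    """One page to list in llms.txt (hypothetical helper type)."""
    title: str
    url: str
    note: str = ""


def build_llms_txt(site_name: str, summary: str,
                   sections: dict[str, list[PageLink]]) -> str:
    """Assemble Markdown: title, blockquote summary, then link sections."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    # Drop trailing blank lines and end with a single newline.
    return "\n".join(lines).rstrip() + "\n"


text = build_llms_txt(
    "Example Co",
    "Hypothetical company used for illustration.",
    {
        "Key pages": [
            PageLink("Pricing", "https://example.com/pricing", "Plans and costs"),
            PageLink("Docs", "https://example.com/docs"),
        ]
    },
)
print(text)
```

The resulting string would be served at the site root as /llms.txt. Because the convention is still a proposal, the exact structure an individual engine expects, if any, should be checked against that engine's own guidance.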

Structured data

Structured data
Machine-readable markup that explicitly describes entities, relationships and content types.
JSON-LD
JavaScript Object Notation for Linked Data. The format Google recommends for embedding schema.org markup in web pages.
Schema markup
Vocabulary from schema.org used to describe Organisations, Articles, Products, FAQs and more.
Organization schema
JSON-LD describing a business as an entity – name, url, logo, sameAs, contactPoint.
FAQPage schema
JSON-LD describing question-and-answer pairs. The explicit question and answer structure makes each pair easy for AI engines to extract.
Product schema
JSON-LD describing a product – price, availability, brand, ratings – essential for ecommerce GEO.
Article schema
JSON-LD describing editorial content – headline, datePublished, author, mainEntityOfPage.
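
The schema types above share one shape: a JSON object with an @context, an @type and type-specific properties, embedded in the page inside a script element. The sketch below builds Organization and FAQPage markup with Python's json module. The company details and function names are hypothetical; the property names (name, url, logo, sameAs, mainEntity, acceptedAnswer) come from schema.org.

```python
import json


def organization_jsonld(name: str, url: str, logo: str, same_as: list[str]) -> dict:
    """Describe a business as a schema.org Organization entity."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,
        "sameAs": same_as,
    }


def faq_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Describe question-and-answer pairs as a schema.org FAQPage."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data: dict) -> str:
    """Serialise to JSON and wrap in the script element used for JSON-LD."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical example data.
org = organization_jsonld(
    "Example Co",
    "https://example.com",
    "https://example.com/logo.png",
    ["https://www.linkedin.com/company/example-co"],
)
faq = faq_jsonld([("What is GEO?", "Generative Engine Optimisation.")])

print(as_script_tag(org))
print(as_script_tag(faq))
```

Generating the markup from the same data that renders the visible page helps keep the two consistent, which matters because markup that contradicts the page content is a weaker signal than markup that matches it.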

Trust and content

E-E-A-T
Experience, Expertise, Authoritativeness, Trustworthiness. A framework from Google's Search Quality Rater Guidelines for evaluating content quality; useful as a GEO heuristic too.
Authorship signals
Bylines, author bios, Person schema and consistent attribution that help engines tie content to a real human.
sameAs
A schema.org property that links an entity to pages that unambiguously identify it elsewhere – LinkedIn, Wikipedia, Crunchbase, Companies House.
Server-side rendering
Producing meaningful HTML on the server before it reaches the browser. Many AI crawlers do not execute JavaScript, so content that only appears after client-side rendering may never be seen.
Hallucination
When a generative engine fabricates a fact. Strong structured data and clear authorship reduce the chance of being misquoted.
Ground truth
Verifiable, source-of-truth content an engine can cite. Your job in GEO is to be that source.
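
One way to reason about server-side rendering is to look at a page the way a crawler that does not run JavaScript would: only the text present in the served HTML counts. The sketch below approximates that with Python's standard html.parser. The two HTML snippets and the served_text helper are hypothetical, and real crawlers will differ in the details of what they extract.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text from raw HTML, ignoring script, style and template content."""

    SKIP = {"script", "style", "template"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside skipped elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def served_text(html: str) -> str:
    """Return the text a non-JavaScript client could read from html."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical client-rendered page: the content only exists inside a script.
CSR_HTML = (
    '<html><body><div id="root"></div>'
    '<script>render("Acme pricing starts at 10 GBP")</script></body></html>'
)
# Hypothetical server-rendered page: the content is in the HTML itself.
SSR_HTML = (
    "<html><body><main><h1>Pricing</h1>"
    "<p>Acme pricing starts at 10 GBP</p></main></body></html>"
)

print(repr(served_text(CSR_HTML)))
print(repr(served_text(SSR_HTML)))
```

In this sketch the client-rendered page yields no readable text at all, while the server-rendered page exposes the pricing statement directly, which is the difference the Server-side rendering entry describes.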

Products and surfaces

AI Overviews
Google's generative answer feature that summarises content above traditional search results.
Perplexity
An answer engine that synthesises responses with inline citations. Heavy emphasis on source attribution.
Copilot
Microsoft's generative assistant, integrated with Bing search results.