This is the first Optimize AEO research protocol: a repeatable study for tracking which page types answer engines cite when users ask AEO-related questions.
The question is simple: when answer engines produce sourced answers about answer-engine optimization, AI crawler controls, llms.txt, citation tracking, and AEO tools, what kinds of pages get cited?
Research question
Which page types get cited by answer engines for AEO prompts: glossary pages, comparison pages, tool pages, guide pages, methodology pages, official documentation, forums, or competitor blog posts?
Hypotheses
- Comparison prompts will favor comparison pages and strong explanatory guides.
- Crawler-policy prompts will favor official documentation and pages that distinguish crawler purposes.
- Tool prompts will favor pages with a usable tool, not just a listicle.
- Definition prompts may cite glossary, hub, or guide pages, depending on how directly the page defines the term.
- Pages with evidence, tables, and clean headings should be easier to cite than pages built from broad prose alone.
Why this matters
Most AEO advice still treats citation visibility as a black box. But if answer engines consistently cite certain page types for certain prompt families, we can build better source clusters. For example, comparison prompts may prefer comparison pages, crawler prompts may prefer official docs, and tool prompts may prefer utility pages with clear descriptions.
Prompt families
| Family | Example prompt | Expected source type |
|---|---|---|
| Definition | What is answer engine optimization? | Glossary, guide, methodology |
| Comparison | AEO vs SEO: what is the difference? | Comparison page |
| Crawler policy | Should I allow GPTBot or OAI-SearchBot? | Official docs, crawler comparison |
| Tool search | Free local tools for AEO audits | Tool pages |
| Implementation | How do I make a page more likely to get cited? | Guide, checklist, methodology |
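If the expected-source column is encoded as data, later observations can be scored against the hypotheses automatically. A minimal sketch in Python; the family keys and page-type labels (e.g. "official_docs") are illustrative choices, not a fixed taxonomy:

```python
# Expected source types per prompt family, transcribed from the table above.
# The labels are illustrative; use whatever page-type vocabulary the study
# settles on when classifying cited pages.
EXPECTED_SOURCES: dict[str, set[str]] = {
    "definition": {"glossary", "guide", "methodology"},
    "comparison": {"comparison"},
    "crawler_policy": {"official_docs", "crawler_comparison"},
    "tool_search": {"tool"},
    "implementation": {"guide", "checklist", "methodology"},
}

def matches_hypothesis(family: str, observed_page_type: str) -> bool:
    """True if a cited page's type matches the expected types for its family."""
    return observed_page_type in EXPECTED_SOURCES.get(family, set())
```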
Engines to test
- ChatGPT search or source-enabled ChatGPT experiences
- Perplexity
- Google AI Overviews or AI Mode when available
- Claude search-enabled experiences when available
- Microsoft Copilot
Prompt set v1
| Intent | Prompt |
|---|---|
| Definition | What is answer engine optimization? |
| Comparison | AEO vs SEO: what changes for content teams? |
| Crawler policy | What is the difference between GPTBot and OAI-SearchBot? |
| Source map | What is the difference between llms.txt and robots.txt? |
| Tool | What free local tools can help with AEO? |
| Implementation | How do I make a page more likely to get cited by answer engines? |
| Measurement | How should I track AI citations? |
| Passage retrieval | Why do clear sections matter for answer-engine retrieval? |
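Fixing the panel as data keeps the wording identical across engines and runs. A minimal sketch; the intent keys are labels of convenience:

```python
# Prompt set v1, transcribed from the table above. Running the panel means
# issuing each prompt verbatim to every engine under test.
PROMPT_SET_V1: list[tuple[str, str]] = [
    ("definition", "What is answer engine optimization?"),
    ("comparison", "AEO vs SEO: what changes for content teams?"),
    ("crawler_policy", "What is the difference between GPTBot and OAI-SearchBot?"),
    ("source_map", "What is the difference between llms.txt and robots.txt?"),
    ("tool", "What free local tools can help with AEO?"),
    ("implementation", "How do I make a page more likely to get cited by answer engines?"),
    ("measurement", "How should I track AI citations?"),
    ("passage_retrieval", "Why do clear sections matter for answer-engine retrieval?"),
]
```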
Fields to log
Each observation should record engine, prompt, answer summary, cited URL, citation surface, result type, competing source, date, and notes. The AI Citation Tracker uses this same structure so observations can be exported as CSV.
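As an illustration, the log can be a flat record type with a CSV export. The field names below follow the list above; the exact column names the AI Citation Tracker uses are an assumption:

```python
import csv
from dataclasses import asdict, dataclass, fields

@dataclass
class Observation:
    engine: str            # e.g. "perplexity"
    prompt: str            # exact prompt text as issued
    answer_summary: str    # one-line summary of the answer
    cited_url: str         # URL the engine attributed; empty if none shown
    citation_surface: str  # e.g. "inline link", "footnote", "source panel"
    result_type: str       # page type: glossary, comparison, tool, ...
    competing_source: str  # strongest other domain cited, if any
    date: str              # run date in ISO format, YYYY-MM-DD
    notes: str = ""

def export_csv(observations: list[Observation], path: str) -> None:
    """Write one header row plus one row per observation."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=[fld.name for fld in fields(Observation)])
        writer.writeheader()
        writer.writerows(asdict(obs) for obs in observations)
```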
Expected outputs
- A table of cited domains and URLs.
- A breakdown by page type.
- A list of source patterns that appear repeatedly.
- Examples of exact citations, wrong-page citations, and competitor citations.
- Recommendations for which Optimize AEO pages should be strengthened next.
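The first two outputs fall out of the log directly. A minimal sketch, assuming a CSV with the column names from the export sketch above:

```python
import csv
from collections import Counter
from urllib.parse import urlparse

def breakdown(csv_path: str) -> tuple[Counter, Counter]:
    """Tally observations with a visible citation by page type and by cited domain."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = [row for row in csv.DictReader(f) if row["cited_url"]]
    by_type = Counter(row["result_type"] for row in rows)
    by_domain = Counter(urlparse(row["cited_url"]).netloc for row in rows)
    return by_type, by_domain

# Usage:
# by_type, by_domain = breakdown("observations.csv")
# print(by_type.most_common(), by_domain.most_common())
```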
Decision rules
- If a prompt family repeatedly cites official documentation, we should build pages that explain the official source rather than trying to outrank it with unsupported claims.
- If a prompt family cites competitors, we should inspect what source pattern the competitor page has that ours lacks.
- If no visible citations appear, we should record that as a surface behavior instead of forcing a conclusion.
A result becomes actionable when it tells us what to build next: a deeper guide, a stronger comparison page, a better tool landing page, a methodology note, a crawler audit, or a new glossary cluster.
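These rules are mechanical enough to sketch as code. A hypothetical version; treating "repeatedly" as half of a family's cited observations is a placeholder threshold, not part of the protocol:

```python
def next_action(family_rows: list[dict]) -> str:
    """Apply the decision rules to one prompt family's logged observations.

    Expects rows shaped like the observation log (cited_url, result_type,
    competing_source). The one-half threshold for "repeatedly" is arbitrary.
    """
    cited = [row for row in family_rows if row.get("cited_url")]
    if not cited:
        return "no visible citations: record as a surface behavior"
    types = [row["result_type"] for row in cited]
    if types.count("official_docs") >= len(types) / 2:
        return "build a page that explains the official source"
    if any(row.get("competing_source") for row in cited):
        return "inspect the competitor page's source pattern"
    return "strengthen the matching Optimize AEO page"
```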
Limitations
This study will not prove universal ranking rules. Answer engines change, availability differs by user and region, and source surfaces vary by product. The study is designed to find practical patterns, not permanent laws.
Status
The protocol is live. The next step is to run the first prompt panel, export observations with the citation tracker, and publish a results page with the dataset summary.