Improve docs UX for AI agents (llms.txt, on-domain markdown, robots, meta)#176
Merged
… meta)

- Generate llms.txt and llms-full.txt from the default (English) locale via hooks/emit_markdown.py (the mkdocs-llmstxt plugin is incompatible with mkdocs-static-i18n).
- Emit clean per-page Markdown as <page>/index.md for every language, served on-domain. Repoint the page toolbar's View Markdown / Copy / Ask actions at these instead of GitHub raw (which was broken for all 7 translated locales).
- Add robots.txt explicitly allowing AI crawlers and referencing the sitemap.
- Add site_description plus per-page descriptions so every page now has a <meta name="description"> (was none).
- Fix duplicate page-toolbar.js include and the favicon typo (faviron).
- Remove repo cruft (get-pip.py, "hold from setup", .DS_Store) and ignore .DS_Store / __pycache__.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Every nav page now has a description: front matter, so each gets a unique <meta name="description"> and a description in llms.txt instead of falling back to the generic site_description.
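The fallback described in this commit can be sketched in a few lines of Python. This is an illustration only, not the theme's actual template logic: the function name `meta_description` and the dict-shaped `page_meta` argument are assumptions made for the example.

```python
from html import escape


def meta_description(page_meta: dict, site_description: str) -> str:
    """Return a <meta name="description"> tag for one page.

    page_meta is the page's parsed front matter (hypothetical shape: a plain
    dict). If it has no non-empty "description", the site-wide value is used.
    """
    description = (page_meta.get("description") or "").strip()
    if not description:
        description = site_description
    # Escape quotes and angle brackets so the value is safe in an attribute.
    return f'<meta name="description" content="{escape(description, quote=True)}">'
```

A page whose front matter sets `description` gets its own tag; any other page inherits `site_description`.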
This was referenced Jun 9, 2026
Reviewed and built locally — all claims verified (llms.txt/llms-full.txt/robots.txt/sitemap at root, 209/209 pages with meta descriptions, on-domain .md translated and front-matter-stripped, toolbar included once, cruft removed). Pushed a follow-up commit adding Filed two non-blocking follow-ups:
Also noted (out of scope):
armcconnell
approved these changes
Jun 9, 2026
juan-malbeclabs
added a commit
that referenced
this pull request
Jun 9, 2026
Re-apply the Content Signals declaration (search=yes, ai-input=yes, ai-train=yes) per contentsignals.org / draft-romm-aipref-contentsignals. It was dropped when main (which already added robots.txt via #176) was merged into this branch. Values mirror the existing AI-crawler allowlist.
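For illustration, a robots.txt carrying this declaration could be generated as below. This is a sketch under assumptions, not the file from the commit: the placement of the `Content-Signal` line inside the `User-agent: *` group follows my reading of contentsignals.org, the crawler list is only the subset named in #176, and the sitemap URL in the usage line is a placeholder.

```python
# Subset of crawlers named in the PR description; the real allowlist is longer.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def build_robots_txt(sitemap_url: str, signals: dict) -> str:
    """Build robots.txt text with a Content-Signal line and an AI-crawler allowlist."""
    signal_line = ", ".join(f"{key}={value}" for key, value in signals.items())
    lines = [
        "User-agent: *",
        f"Content-Signal: {signal_line}",
        "Allow: /",
        "",
    ]
    # One explicit allow group per named AI crawler.
    for bot in AI_CRAWLERS:
        lines += [f"User-agent: {bot}", "Allow: /", ""]
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"
```

Calling `build_robots_txt("https://docs.example.com/sitemap.xml", {"search": "yes", "ai-input": "yes", "ai-train": "yes"})` yields the three signal values from the commit message on a single `Content-Signal:` line.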
juan-malbeclabs
added a commit
that referenced
this pull request
Jun 10, 2026
…179)

* docs: declare AI content usage preferences via Content Signals in robots.txt

Add docs/robots.txt with a Content-Signal directive (search=yes, ai-input=yes, ai-train=yes) per contentsignals.org / draft-romm-aipref-contentsignals, declaring that this public protocol documentation may be indexed, used to ground AI answers, and used for training. Values mirror the site's existing stance of explicitly allowing all AI crawlers (GPTBot, ClaudeBot, Google-Extended, etc.). Also includes the basic allow rules, AI-crawler allowlist, and sitemap reference so robots.txt is complete on its own (main had none).

* docs: add Content-Signal directive to robots.txt

Re-apply the Content Signals declaration (search=yes, ai-input=yes, ai-train=yes) per contentsignals.org / draft-romm-aipref-contentsignals. It was dropped when main (which already added robots.txt via #176) was merged into this branch. Values mirror the existing AI-crawler allowlist.

* docs: publish .well-known agent-discovery files (MCP card + Agent Skills)

Add hooks/emit_well_known.py to copy well-known/ into the built site's .well-known/ dir, publishing an MCP Server Card (SEP-1649) and an Agent Skills discovery index (v0.2.0) whose digests are computed at build time.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
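Computing digests at build time could look roughly like the following. This is a hedged sketch, not the contents of hooks/emit_well_known.py: the `sha256:` prefix, the `skills`/`path`/`digest` field names, and both function names are assumptions for illustration and are not taken from the Agent Skills discovery format.

```python
import hashlib
from pathlib import Path


def sha256_digest(path: Path) -> str:
    """Hash a file's bytes; the "sha256:" prefix is an assumed convention."""
    return "sha256:" + hashlib.sha256(path.read_bytes()).hexdigest()


def build_index(skills_dir: Path) -> dict:
    """Build a hypothetical discovery index listing every file with its digest."""
    entries = []
    # Sort for a stable, reproducible index across builds.
    for path in sorted(skills_dir.rglob("*")):
        if path.is_file():
            entries.append(
                {
                    "path": path.relative_to(skills_dir).as_posix(),
                    "digest": sha256_digest(path),
                }
            )
    return {"skills": entries}
```

Hashing during the build means the published digests always match the files that were actually deployed, rather than values checked in by hand.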
Why

The docs site is solid for humans but didn't meet current conventions for consumption by AI agents / LLMs. Analysis of the built site/ surfaced concrete gaps (no llms.txt, no on-domain raw Markdown, broken "View Markdown" for translated pages, zero <meta name="description">, no robots.txt, a duplicated toolbar).

What changed

- /llms.txt (curated, sectioned index with descriptions) and /llms-full.txt (full concatenated content), generated from the default (English) locale.
- Clean per-page Markdown emitted as <page>/index.md (all 8 languages), so agents can fetch source Markdown without GitHub.
- The page toolbar's View Markdown / Copy / Ask actions now point at the on-domain .md. The old GitHub-raw link was broken for all 7 translated locales (/es/setup/ → requested docs/es/setup.md, a 404; real source is docs/setup.es.md). Removed the duplicate page-toolbar.js include that rendered the toolbar twice.
- docs/robots.txt explicitly allowing AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, …) and referencing the sitemap.
- site_description + per-page description front matter. Every page now has a <meta name="description"> (was 0).
- Fixed the faviron → favicon typo; removed get-pip.py (2 MB), hold from setup, .DS_Store; ignore .DS_Store / __pycache__.

Implementation note

The mkdocs-llmstxt plugin was the obvious choice but is incompatible with mkdocs-static-i18n: it can't resolve localized page URIs and skips every page. So llms.txt / llms-full.txt and the per-page Markdown are produced by a single MkDocs hook (hooks/emit_markdown.py) that filters to the default locale. No new CI dependency.

Verification

Built locally with mkdocs build:

- /llms.txt, /llms-full.txt, /robots.txt present at site root.
- site/setup/index.md and site/es/setup/index.md exist with clean (front-matter-stripped) Markdown, correctly translated.
- Every page has a <meta name="description"> (pages without their own description fall back to site_description).
- page-toolbar.js included exactly once per page (was twice).
- mkdocs serve: toolbar appears once; "View Markdown" opens the on-domain .md and works on translated pages.

🤖 Generated with Claude Code
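The core of what such a hook has to do (strip front matter, keep only default-locale pages, write an llms.txt index) can be sketched as pure functions. This is not the code in hooks/emit_markdown.py: every function name, the dict-shaped page records, the single "Docs" section, and the locale-prefix check are assumptions chosen to keep the example self-contained.

```python
import re

# A leading YAML front-matter block: "---", any content, then a closing "---".
FRONT_MATTER = re.compile(r"\A---\s*\n.*?\n---\s*\n", re.DOTALL)


def strip_front_matter(text: str) -> str:
    """Remove a leading front-matter block, leaving the Markdown body."""
    return FRONT_MATTER.sub("", text, count=1)


def is_default_locale(url: str, other_locales: set) -> bool:
    """Treat a page as default-locale unless its first path segment is a locale code."""
    first_segment = url.strip("/").split("/", 1)[0]
    return first_segment not in other_locales


def build_llms_txt(site_name: str, site_description: str, pages: list, other_locales: set) -> str:
    """Render an llms.txt index: H1 title, blockquote summary, then a link list."""
    lines = [f"# {site_name}", "", f"> {site_description}", "", "## Docs", ""]
    for page in pages:
        if not is_default_locale(page["url"], other_locales):
            continue  # translated pages are left out of the index
        lines.append(f"- [{page['title']}]({page['url']}index.md): {page['description']}")
    return "\n".join(lines) + "\n"
```

With pages at `/setup/` and `/es/setup/` and `other_locales={"es"}`, only `/setup/index.md` appears in the index, matching the "default (English) locale" filter described in the implementation note. In a real MkDocs hook these helpers would presumably be called from build events rather than directly.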