Markdown Endpoints for AI Agents — WordPress Content Access
Why This Matters
AI crawlers parse HTML poorly. Navigation menus, sidebars, footers, and cookie banners are all noise that dilutes the actual content. When an AI agent requests a page, it may have to strip away as much as 80% of the markup just to find your words.
Markdown solves this. Citelayer® gives every published post and page a clean Markdown endpoint. AI agents get structured content instantly — no scraping, no parsing, no guesswork. Your content arrives as readable text with proper headings, links, and formatting intact.
How It Works
Citelayer® makes your content available as Markdown through three access methods. Use whichever fits your workflow.
1. The .md Extension
Append .md to any post or page URL:
https://yoursite.com/getting-started/ → HTML (normal page)
https://yoursite.com/getting-started.md → Markdown output
This works for any published post, page, or custom post type within scope.
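For agents that build the Markdown URL themselves, the mapping can be sketched in Python as below. The helper name markdown_url is hypothetical, and how the site root (which has no slug) is handled is an assumption, since the text above only covers post and page URLs.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the .md variant of a post or page URL (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Drop the trailing slash so "/getting-started/" becomes "/getting-started".
    path = parts.path.rstrip("/")
    if not path:
        # Assumption: the site root has no slug, so there is nothing to append to.
        raise ValueError("URL has no slug to append .md to")
    if not path.endswith(".md"):
        path += ".md"
    # Query string and fragment are dropped; the .md form addresses the document itself.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```

Calling markdown_url("https://yoursite.com/getting-started/") yields the second URL shown above.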
2. Query Parameter
Add ?format=markdown to any post URL:
https://yoursite.com/getting-started/?format=markdown
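When the URL already carries other query parameters, the format parameter has to be merged in rather than appended as a raw string. A minimal Python sketch follows; the helper name with_markdown_format is hypothetical, and the assumption is that other parameters should be preserved unchanged.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_markdown_format(page_url: str) -> str:
    """Add format=markdown to a URL, keeping its other query parameters."""
    parts = urlsplit(page_url)
    # Keep existing parameters, but drop any earlier "format" value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "format"
    ]
    params.append(("format", "markdown"))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment)
    )
```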
3. Content Negotiation
Send an Accept: text/markdown HTTP header. Citelayer® detects the header and responds with Markdown instead of HTML:
curl -H "Accept: text/markdown" https://yoursite.com/getting-started/
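The same request can be prepared from Python's standard library. This is a sketch only: the function names are hypothetical, and checking the returned Content-Type is a defensive step added on the assumption that a server without Markdown support would simply answer with HTML.

```python
import urllib.request


def build_markdown_request(page_url: str) -> urllib.request.Request:
    """Prepare a GET request that asks for Markdown via the Accept header."""
    return urllib.request.Request(page_url, headers={"Accept": "text/markdown"})


def is_markdown_response(content_type: str) -> bool:
    """Return True if a Content-Type header value names text/markdown."""
    # Ignore parameters such as "; charset=utf-8" and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/markdown"
```

Passing the request to urllib.request.urlopen performs the fetch. The Content-Type of the response can then be checked with is_markdown_response before the body is treated as Markdown.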
Discovery via HTML Head
Citelayer® adds a <link> tag to the HTML <head> of every in-scope page, pointing AI agents to the Markdown version:
<link rel="alternate" type="text/markdown" href="https://yoursite.com/getting-started.md" />
AI agents that check for alternate content types find the Markdown endpoint automatically.
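On the agent side, discovery amounts to scanning the page head for that link tag. The following Python sketch uses only the standard library; the class and function names are hypothetical, and it returns the first matching link it finds.

```python
from html.parser import HTMLParser
from typing import Optional


class MarkdownLinkFinder(HTMLParser):
    """Collect the href of the first <link rel="alternate" type="text/markdown">."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<link ... />) are also routed here by HTMLParser.
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        media_type = (attr.get("type") or "").lower()
        if "alternate" in rel_tokens and media_type == "text/markdown":
            self.href = attr.get("href")


def find_markdown_alternate(html: str) -> Optional[str]:
    """Return the Markdown alternate URL declared in an HTML document, if any."""
    finder = MarkdownLinkFinder()
    finder.feed(html)
    return finder.href
```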
Conversion Engine
Citelayer® uses the league/html-to-markdown library (bundled with the plugin) to convert your rendered HTML content into clean Markdown. If the library encounters an edge case, a fallback regex converter handles the conversion. You don’t need to install anything — it works out of the box.
Markdown Output Structure
Every Markdown response follows this format:
---
title: Getting Started with Citelayer — Your Site Name
url: https://yoursite.com/getting-started/
date: 2026-02-24
---
# Getting Started with Citelayer
Your post content converted to clean Markdown. Headings, links, lists,
bold text, and code blocks are preserved. Navigation, sidebars, footers,
and other theme elements are stripped out.
## Subheadings Stay Intact
Content under subheadings flows naturally...
- List items convert properly
- Links stay clickable: [Related Post](https://yoursite.com/related/)
> Blockquotes are preserved too.
The YAML frontmatter gives AI agents structured metadata — the page title, canonical URL, and publication date — without parsing the content body.
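A consumer can separate the metadata from the body with a few lines of Python. This sketch assumes the frontmatter is always flat key: value pairs, as in the sample above; a full YAML parser would be the safer choice if values can be quoted or nested. The function name split_frontmatter is hypothetical.

```python
from typing import Dict, Tuple


def split_frontmatter(markdown: str) -> Tuple[Dict[str, str], str]:
    """Split a Markdown response into (metadata, body).

    Assumes flat "key: value" frontmatter between two --- lines.
    Returns ({}, original text) if no complete frontmatter block is found.
    """
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, markdown
    meta: Dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:])
            return meta, body.lstrip("\n")
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    # No closing delimiter: treat the whole text as body.
    return {}, markdown
```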
WooCommerce Support
If WooCommerce is active, product pages include a structured details table in the Markdown output:
---
title: Premium Widget — Your Store
url: https://yoursite.com/product/premium-widget/
date: 2026-02-20
---
# Premium Widget
| Detail | Value |
|----------------|----------------|
| Price | $49.99 |
| Regular Price | $69.99 |
| Availability | In Stock |
| SKU | WDG-PRE-001 |
| Categories | Widgets, Tools |
Full product description in clean Markdown...
The details table only appears for WooCommerce products and includes sale pricing when applicable.
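Because the table has a fixed two-column shape, an agent can turn it into a dictionary without a Markdown library. The Python sketch below assumes the header row is labeled Detail, as in the sample, and that cell values contain no pipe characters. The function name parse_details_table is hypothetical.

```python
from typing import Dict


def parse_details_table(markdown: str) -> Dict[str, str]:
    """Extract rows of a two-column Markdown table into a dict."""
    details: Dict[str, str] = {}
    for line in markdown.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2:
            continue
        key, value = cells
        # Skip the header row and the |----|----| separator row.
        if key == "Detail" or set(key) <= set("-: "):
            continue
        details[key] = value
    return details
```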
HTTP Headers
Citelayer® sends specific headers with every Markdown response:
- Content-Type: text/markdown; charset=utf-8 - Identifies the response as Markdown.
- Vary: Accept - Tells caches that the response varies by Accept header. Essential for correct CDN and proxy behavior.
- X-Markdown-Tokens: <count> - Approximate token count for the content (calculated at ~4 characters per token). Useful for AI agents managing context windows.
- X-Citelayer-Version: 0.3.1 - The Citelayer® plugin version that generated the response.
- Content-Signal: ai-train=no, search=yes, ai-input=yes - Signals how the content may be used. Values are configurable in Citelayer® settings.
The X-Markdown-Tokens header lets AI agents estimate content size before processing the body — helpful when assembling context from multiple pages.
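As an illustration of that use, the Python sketch below applies the ~4 characters per token rule and then picks pages that fit a token budget from their header values alone. Rounding up is an assumption (the plugin's exact rounding is not stated here), and both function names are hypothetical.

```python
import math
from typing import Iterable, List, Mapping, Tuple


def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Approximate a token count at ~4 characters per token (rounding up is assumed)."""
    return math.ceil(len(text) / chars_per_token)


def select_within_budget(
    pages: Iterable[Tuple[str, Mapping[str, str]]], budget: int
) -> List[str]:
    """Pick page URLs, in order, whose X-Markdown-Tokens values fit the budget.

    Each item is (url, response headers). Header names are looked up exactly
    as written here; real HTTP header names are case-insensitive.
    """
    chosen: List[str] = []
    remaining = budget
    for url, headers in pages:
        raw = headers.get("X-Markdown-Tokens")
        if raw is None or not raw.isdigit():
            continue  # header missing or malformed: skip this page
        tokens = int(raw)
        if tokens <= remaining:
            chosen.append(url)
            remaining -= tokens
    return chosen
```

An agent could issue HEAD requests to gather the headers first and fetch bodies only for the selected URLs.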
Configuration
Navigate to Settings → Citelayer → Markdown in your WordPress admin.
Scope Control
Citelayer® offers two modes that determine which content gets Markdown endpoints:
- Sitemap mode (default when an SEO plugin is active): Respects your existing SEO settings. If a post is marked noindex via Rank Math or Yoast, it won’t get a Markdown endpoint. Post types excluded from your SEO plugin’s sitemap are also excluded from Markdown.
- Manual mode: Select exactly which post types receive Markdown endpoints and exclude specific posts by ID.
Sitemap Mode Details
In sitemap mode, Citelayer® checks two things for each piece of content:
- Per-post noindex: If Rank Math, Yoast, or another supported SEO plugin marks a specific post as noindex, Citelayer® excludes it from Markdown output.
- Post-type sitemap configuration: If your SEO plugin excludes an entire post type from the sitemap (e.g., “Testimonials” excluded in Rank Math), Citelayer® mirrors that setting.
This keeps your Markdown scope consistent with your SEO strategy — content you hide from search engines stays hidden from AI agents too.
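The two checks can be modeled as a small decision function. This Python sketch illustrates the rule as described, not the plugin's actual code. The Post record and its field names are hypothetical stand-ins for whatever the SEO plugin reports.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class Post:
    post_id: int
    post_type: str
    noindex: bool  # per-post noindex flag, as set in the SEO plugin


def in_sitemap_scope(post: Post, sitemap_excluded_types: Set[str]) -> bool:
    """Return True if the post should get a Markdown endpoint in sitemap mode."""
    # Check 1: a per-post noindex flag excludes the post.
    if post.noindex:
        return False
    # Check 2: a post type excluded from the sitemap is excluded entirely.
    if post.post_type in sitemap_excluded_types:
        return False
    return True
```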
Manual Mode Details
In manual mode, you control scope directly:
- Post types: Select which post types (posts, pages, products, custom types) get Markdown endpoints.
- Exclude list: Enter comma-separated post IDs to exclude specific content. Useful for landing pages or gated content you want to keep HTML-only.
Excluded Post IDs: 142, 587, 1203
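Manual mode reduces to a post-type allowlist plus an ID blocklist. The Python sketch below parses the comma-separated field shown above and applies both rules. The function names are hypothetical and the logic is an illustration of the described behavior rather than the plugin's implementation.

```python
from typing import Set


def parse_excluded_ids(raw: str) -> Set[int]:
    """Parse a comma-separated list of post IDs such as "142, 587, 1203"."""
    # Empty segments (for example from a trailing comma) are ignored.
    return {int(part) for part in raw.split(",") if part.strip()}


def in_manual_scope(
    post_id: int, post_type: str, enabled_types: Set[str], excluded_ids: Set[int]
) -> bool:
    """Return True if the post type is enabled and the post is not excluded by ID."""
    return post_type in enabled_types and post_id not in excluded_ids
```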
Verify Your Setup
After activating Citelayer®, test the Markdown endpoints:
- Pick any published post and append .md to its URL. Visit it in your browser.
- Check the frontmatter. You should see YAML metadata (title, URL, date) between --- delimiters at the top.
- Verify the content. The body should contain clean Markdown — no HTML tags, no navigation elements, no sidebar content.
- Inspect response headers. Open browser dev tools (Network tab), reload the .md URL, and confirm X-Markdown-Tokens is present in the response headers.
You can also verify with curl to see headers and content together:
curl -i https://yoursite.com/any-published-post.md
Expected response headers:
HTTP/1.1 200 OK
Content-Type: text/markdown; charset=utf-8
Vary: Accept
X-Markdown-Tokens: 847
X-Citelayer-Version: 0.3.1
Content-Signal: ai-train=no, search=yes, ai-input=yes
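The header check can also be automated. The Python sketch below takes a mapping of response headers, however they were fetched, and reports anything that deviates from the list above. The function name and the wording of the messages are hypothetical.

```python
from typing import List, Mapping

EXPECTED_HEADERS = (
    "Content-Type",
    "Vary",
    "X-Markdown-Tokens",
    "X-Citelayer-Version",
    "Content-Signal",
)


def check_markdown_headers(headers: Mapping[str, str]) -> List[str]:
    """Return a list of problems found in a Markdown response's headers."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    normalized = {name.lower(): value for name, value in headers.items()}
    problems = [
        f"missing header: {name}"
        for name in EXPECTED_HEADERS
        if name.lower() not in normalized
    ]
    content_type = normalized.get("content-type", "")
    if content_type and not content_type.lower().startswith("text/markdown"):
        problems.append(f"unexpected Content-Type: {content_type}")
    vary = normalized.get("vary", "")
    if vary and "accept" not in [item.strip().lower() for item in vary.split(",")]:
        problems.append("Vary does not list Accept")
    return problems
```

An empty list means all five headers are present and the two value checks passed.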
Technical Details
Caching
Citelayer® caches each Markdown conversion as a WordPress transient:
- Transient key format: citelayer_markdown_{post_id}_{modified_timestamp}
- TTL: 24 hours
- Auto-invalidation: Because the post’s last-modified timestamp is part of the cache key, editing a post automatically creates a new cache entry. The old transient expires naturally after 24 hours.
This design means Citelayer® never serves stale content. The moment you update a post, the next Markdown request generates a fresh conversion.
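The effect of putting the timestamp in the key can be shown with a small in-memory model. This Python sketch is not the plugin's code: the dictionary stands in for WordPress transients, TTL expiry is left out, the timestamp is assumed to be an integer such as Unix seconds, and the class name is hypothetical.

```python
from typing import Dict


def transient_key(post_id: int, modified_timestamp: int) -> str:
    """Build the cache key in the format citelayer_markdown_{post_id}_{modified_timestamp}."""
    return f"citelayer_markdown_{post_id}_{modified_timestamp}"


class MarkdownCache:
    """In-memory stand-in for the transient store (TTL expiry not modeled)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.conversions = 0  # counts how often a conversion actually ran

    def get(self, post_id: int, modified_timestamp: int, html: str) -> str:
        key = transient_key(post_id, modified_timestamp)
        if key not in self._store:
            # Cache miss: run the (placeholder) conversion and store the result.
            self.conversions += 1
            self._store[key] = f"converted:{html}"
        return self._store[key]
```

Requesting the same post twice with an unchanged timestamp runs one conversion. After an edit the timestamp changes, the key changes with it, and the next request misses the cache and converts again.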
Rewrite Rule
The .md extension is handled by a WordPress rewrite rule:
(.+)\.md$ → index.php?pagename=$matches[1]&citelayer_format=markdown
This rule is registered on plugin activation. If .md URLs return 404 after installation, visit Settings → Permalinks and click Save Changes to flush rewrite rules.
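To see what the rule captures, the pattern can be exercised with Python's re module. This sketch only mimics the matching step. It assumes the rule is applied to the request path without its leading slash, and it writes the dot before md as an escaped literal. The function name rewrite is hypothetical.

```python
import re
from typing import Optional

# The dot is escaped so that only a literal ".md" suffix matches.
MD_RULE = re.compile(r"(.+)\.md$")


def rewrite(path: str) -> Optional[str]:
    """Map a request path to the internal query string, or None if the rule does not apply."""
    match = MD_RULE.match(path)
    if match is None:
        return None
    return f"index.php?pagename={match.group(1)}&citelayer_format=markdown"
```

With an unescaped dot, a path such as readmd would also match, because the dot would then stand for any character.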
How citelayer.ai Uses This
The Citelayer® website at citelayer.ai uses Citelayer® itself. Every documentation page, blog post, and feature page is available as Markdown. Try it: append .md to any page on citelayer.ai to see the output firsthand.
Related Documentation
- llms.txt — The discovery file that links to your Markdown index
- UCP Discovery — Machine-readable capability declarations for your site
- Schema REST API — Structured data endpoints that complement Markdown content