Markdown Endpoints for AI Agents — WordPress Content Access


Why This Matters

AI crawlers parse HTML poorly. Navigation menus, sidebars, footers, and cookie banners are all noise that dilutes the actual content. When an AI agent requests a page, it has to strip away most of the markup just to find your words.

Markdown solves this. Citelayer® gives every published post and page a clean Markdown endpoint. AI agents get structured content instantly — no scraping, no parsing, no guesswork. Your content arrives as readable text with proper headings, links, and formatting intact.

How It Works

Citelayer® makes your content available as Markdown through three access methods. Use whichever fits your workflow.

1. The .md Extension

Append .md to any post or page URL:

https://yoursite.com/getting-started/     → HTML (normal page)
https://yoursite.com/getting-started.md    → Markdown output

This works for any published post, page, or custom post type within scope.
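For agents that build these URLs programmatically, the mapping can be sketched as below. This is a minimal client-side illustration, not plugin code: the helper name markdown_url is hypothetical, and refusing the site root is an assumption made only because this page does not say how the home page is addressed.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the .md variant of a permalink (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Drop the trailing slash so "/getting-started/" becomes "/getting-started".
    path = parts.path.rstrip("/")
    if not path:
        # Assumption: the root URL has no slug to append ".md" to.
        raise ValueError("the site root has no slug to append .md to")
    # Query string and fragment are dropped; only the path changes.
    return urlunsplit((parts.scheme, parts.netloc, path + ".md", "", ""))


print(markdown_url("https://yoursite.com/getting-started/"))
```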

2. Query Parameter

Add ?format=markdown to any post or page URL:

https://yoursite.com/getting-started/?format=markdown
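When the page URL may already carry query parameters, a small helper avoids producing a malformed URL. The sketch below is a client-side illustration; the function name with_markdown_format is hypothetical, and keeping the other parameters untouched is an assumption about what a caller would want.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_markdown_format(page_url: str) -> str:
    """Return page_url with format=markdown set, keeping other parameters."""
    parts = urlsplit(page_url)
    # Remove any existing "format" value so the parameter appears exactly once.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "format"
    ]
    query.append(("format", "markdown"))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


print(with_markdown_format("https://yoursite.com/getting-started/"))
```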

3. Content Negotiation

Send an Accept: text/markdown HTTP header. Citelayer® detects the header and responds with Markdown instead of HTML:

curl -H "Accept: text/markdown" https://yoursite.com/getting-started/
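The same request can be made from Python using only the standard library. This is a sketch rather than an official client: the function names are hypothetical, and rejecting any response whose Content-Type is not text/markdown is a defensive assumption in case the page is out of scope and HTML comes back instead.

```python
from urllib.request import Request, urlopen


def build_markdown_request(page_url: str) -> Request:
    """Build a GET request that asks for Markdown via content negotiation."""
    return Request(page_url, headers={"Accept": "text/markdown"})


def fetch_markdown(page_url: str, timeout: float = 10.0) -> str:
    """Fetch page_url and return the body, insisting on a Markdown response."""
    with urlopen(build_markdown_request(page_url), timeout=timeout) as response:
        content_type = response.headers.get_content_type()
        if content_type != "text/markdown":
            raise ValueError(f"expected text/markdown, got {content_type}")
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Calling fetch_markdown("https://yoursite.com/getting-started/") would then return the Markdown body as a string.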

Discovery via HTML Head

Citelayer® adds a <link> tag to the HTML <head> of every in-scope page, pointing AI agents to the Markdown version:

<link rel="alternate" type="text/markdown" href="https://yoursite.com/getting-started.md" />

AI agents that check for alternate content types find the Markdown endpoint automatically.
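On the agent side, discovery amounts to scanning the page head for that tag. The following sketch uses Python's built-in HTML parser; the class and function names are hypothetical, and it simply returns the first matching href.

```python
from html.parser import HTMLParser
from typing import Optional


class MarkdownLinkFinder(HTMLParser):
    """Record the href of the first <link rel="alternate" type="text/markdown">."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attributes = {name: (value or "") for name, value in attrs}
        # rel may hold several space-separated tokens.
        rels = attributes.get("rel", "").lower().split()
        if "alternate" in rels and attributes.get("type", "").lower() == "text/markdown":
            self.href = attributes.get("href") or None


def find_markdown_alternate(html: str) -> Optional[str]:
    """Return the advertised Markdown URL, or None if the page has none."""
    finder = MarkdownLinkFinder()
    finder.feed(html)
    return finder.href


page = (
    '<html><head><link rel="alternate" type="text/markdown" '
    'href="https://yoursite.com/getting-started.md" /></head><body></body></html>'
)
print(find_markdown_alternate(page))
```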

Conversion Engine

Citelayer® uses the league/html-to-markdown library (bundled with the plugin) to convert your rendered HTML content into clean Markdown. If the library encounters an edge case, a fallback regex converter handles the conversion. You don’t need to install anything — it works out of the box.
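The primary-then-fallback pattern can be illustrated in a few lines. The sketch below is in Python purely for illustration and is not the plugin's code: the plugin itself runs in PHP, how it detects an edge case is not documented here (an exception is assumed), and the regex rules shown are a deliberately tiny, hypothetical subset.

```python
import re
from typing import Callable


def regex_fallback(html: str) -> str:
    """Very small regex-based HTML-to-Markdown conversion (illustrative only)."""
    flags = re.S | re.I
    # <h1>..<h6> become "#".."######" headings.
    text = re.sub(
        r"<h([1-6])[^>]*>(.*?)</h\1>",
        lambda m: "#" * int(m.group(1)) + " " + m.group(2) + "\n\n",
        html,
        flags=flags,
    )
    text = re.sub(r"<(strong|b)>(.*?)</\1>", r"**\2**", text, flags=flags)
    text = re.sub(r'<a\s[^>]*href="([^"]*)"[^>]*>(.*?)</a>', r"[\2](\1)", text, flags=flags)
    text = re.sub(r"</p>", "\n\n", text, flags=re.I)
    # Strip whatever tags remain.
    text = re.sub(r"<[^>]+>", "", text)
    return text.strip()


def convert(html: str, primary: Callable[[str], str],
            fallback: Callable[[str], str] = regex_fallback) -> str:
    """Try the primary converter; assume failure surfaces as an exception."""
    try:
        return primary(html)
    except Exception:
        return fallback(html)
```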

Markdown Output Structure

Every Markdown response follows this format:

---
title: Getting Started with Citelayer — Your Site Name
url: https://yoursite.com/getting-started/
date: 2026-02-24
---

# Getting Started with Citelayer

Your post content converted to clean Markdown. Headings, links, lists,
bold text, and code blocks are preserved. Navigation, sidebars, footers,
and other theme elements are stripped out.

## Subheadings Stay Intact

Content under subheadings flows naturally...

- List items convert properly
- Links stay clickable: [Related Post](https://yoursite.com/related/)

> Blockquotes are preserved too.

The YAML frontmatter gives AI agents structured metadata — the page title, canonical URL, and publication date — without parsing the content body.
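An agent can read this metadata without a YAML library, assuming the frontmatter stays as flat key: value pairs like those in the sample above. The sketch below rests on that assumption and is not a general YAML parser; the function name split_frontmatter is hypothetical, and the sample title is shortened.

```python
from typing import Dict, Tuple


def split_frontmatter(document: str) -> Tuple[Dict[str, str], str]:
    """Split a Markdown response into (metadata, body)."""
    lines = document.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, document
    # Find the closing delimiter; give up if there is none.
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            break
    else:
        return {}, document
    metadata: Dict[str, str] = {}
    for line in lines[1:index]:
        # Split on the first colon only, so URLs keep their "https:" prefix.
        key, separator, value = line.partition(":")
        if separator:
            metadata[key.strip()] = value.strip()
    body = "\n".join(lines[index + 1:]).lstrip("\n")
    return metadata, body


sample = """---
title: Getting Started with Citelayer
url: https://yoursite.com/getting-started/
date: 2026-02-24
---

# Getting Started with Citelayer
"""
metadata, body = split_frontmatter(sample)
print(metadata["url"])
```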

WooCommerce Support

If WooCommerce is active, product pages include a structured details table in the Markdown output:

---
title: Premium Widget — Your Store
url: https://yoursite.com/product/premium-widget/
date: 2026-02-20
---

# Premium Widget

| Detail         | Value          |
|----------------|----------------|
| Price           | $49.99         |
| Regular Price   | $69.99         |
| Availability    | In Stock       |
| SKU             | WDG-PRE-001    |
| Categories      | Widgets, Tools |

Full product description in clean Markdown...

The details table only appears for WooCommerce products and includes sale pricing when applicable.
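Because the details table has a fixed two-column shape, a consumer can turn it into a dictionary with a few lines of parsing. This sketch assumes the layout shown above (a Detail/Value header and no literal pipe characters inside values); parse_details_table is a hypothetical name.

```python
from typing import Dict


def parse_details_table(markdown: str) -> Dict[str, str]:
    """Collect the rows of a two-column Markdown table into a dict."""
    details: Dict[str, str] = {}
    for line in markdown.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2:
            continue
        key, value = cells
        # Skip the header row and the |---|---| separator row.
        if key == "Detail" or set(key) <= {"-", ":"}:
            continue
        details[key] = value
    return details


table = """
| Detail         | Value          |
|----------------|----------------|
| Price           | $49.99         |
| Regular Price   | $69.99         |
| Availability    | In Stock       |
| SKU             | WDG-PRE-001    |
| Categories      | Widgets, Tools |
"""
print(parse_details_table(table)["Price"])
```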

HTTP Headers

Citelayer® sends specific headers with every Markdown response:

  • Content-Type: text/markdown; charset=utf-8 — Identifies the response as Markdown.
  • Vary: Accept — Tells caches that the response varies by Accept header. Essential for correct CDN and proxy behavior.
  • X-Markdown-Tokens: <count> — Approximate token count for the content (calculated at ~4 characters per token). Useful for AI agents managing context windows.
  • X-Citelayer-Version: 0.3.1 — The Citelayer® plugin version that generated the response.
  • Content-Signal: ai-train=no, search=yes, ai-input=yes — Signals how the content may be used. Values are configurable in Citelayer® settings.

The X-Markdown-Tokens header lets AI agents estimate content size before processing the body — helpful when assembling context from multiple pages.
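As a rough illustration of how an agent might use the count, the sketch below estimates tokens at four characters each and greedily picks pages that fit a budget. Rounding up and the greedy selection are assumptions made for this example; the plugin's exact rounding is not stated here, and both function names are hypothetical.

```python
import math
from typing import Iterable, List, Tuple


def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Approximate a token count at ~4 characters per token (rounded up)."""
    return math.ceil(len(text) / chars_per_token)


def pick_pages_within_budget(pages: Iterable[Tuple[str, int]], budget: int) -> List[str]:
    """Keep pages, in order, while their X-Markdown-Tokens values fit the budget."""
    chosen: List[str] = []
    remaining = budget
    for url, tokens in pages:
        if tokens <= remaining:
            chosen.append(url)
            remaining -= tokens
    return chosen


candidates = [
    ("https://yoursite.com/getting-started.md", 847),
    ("https://yoursite.com/long-guide.md", 3000),
    ("https://yoursite.com/faq.md", 500),
]
print(pick_pages_within_budget(candidates, budget=1500))
```

With a budget of 1500 tokens, the long guide is skipped and the other two pages are kept.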

Configuration

Navigate to Settings → Citelayer → Markdown in your WordPress admin.

Scope Control

Citelayer® offers two modes that determine which content gets Markdown endpoints:

  • Sitemap mode (default when an SEO plugin is active) — Respects your existing SEO settings. If a post is marked noindex via Rank Math or Yoast, it won’t get a Markdown endpoint. Post types excluded from your SEO plugin’s sitemap are also excluded from Markdown.
  • Manual mode — Select exactly which post types receive Markdown endpoints and exclude specific posts by ID.

Sitemap Mode Details

In sitemap mode, Citelayer® checks two things for each piece of content:

  1. Per-post noindex — If Rank Math, Yoast, or another supported SEO plugin marks a specific post as noindex, Citelayer® excludes it from Markdown output.
  2. Post-type sitemap configuration — If your SEO plugin excludes an entire post type from the sitemap (e.g., “Testimonials” excluded in Rank Math), Citelayer® mirrors that setting.

This keeps your Markdown scope consistent with your SEO strategy — content you hide from search engines stays hidden from AI agents too.
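The two checks combine into a single yes/no decision per post. The sketch below models that decision in Python for illustration only; the SeoState type and its field names are hypothetical, and reading the real values from Rank Math or Yoast is outside its scope.

```python
from dataclasses import dataclass


@dataclass
class SeoState:
    """What the SEO plugin says about one post (hypothetical structure)."""
    noindex: bool               # per-post noindex flag
    post_type_in_sitemap: bool  # whether the post type is in the sitemap


def in_sitemap_scope(state: SeoState) -> bool:
    """A post gets a Markdown endpoint only if both checks pass."""
    return not state.noindex and state.post_type_in_sitemap


print(in_sitemap_scope(SeoState(noindex=False, post_type_in_sitemap=True)))
```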

Manual Mode Details

In manual mode, you control scope directly:

  • Post types: Select which post types (posts, pages, products, custom types) get Markdown endpoints.
  • Exclude list: Enter comma-separated post IDs to exclude specific content. Useful for landing pages or gated content you want to keep HTML-only. For example:

Excluded Post IDs: 142, 587, 1203
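The manual-mode rules can be modeled as a set membership test. The Python sketch below is illustrative only: the function names are hypothetical, and the post type slugs ("post", "page", "product") are assumed to be the usual WordPress and WooCommerce ones.

```python
from typing import Set


def parse_excluded_ids(raw: str) -> Set[int]:
    """Parse a comma-separated ID field such as "142, 587, 1203"."""
    return {int(part) for part in raw.split(",") if part.strip()}


def in_manual_scope(post_id: int, post_type: str,
                    enabled_types: Set[str], excluded_ids: Set[int]) -> bool:
    """In scope when the post type is selected and the ID is not excluded."""
    return post_type in enabled_types and post_id not in excluded_ids


excluded = parse_excluded_ids("142, 587, 1203")
enabled = {"post", "page", "product"}
print(in_manual_scope(587, "page", enabled, excluded))
```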

Verify Your Setup

After activating Citelayer®, test the Markdown endpoints:

  1. Pick any published post and append .md to its URL. Visit it in your browser.
  2. Check the frontmatter. You should see YAML metadata (title, URL, date) between --- delimiters at the top.
  3. Verify the content. The body should contain clean Markdown — no HTML tags, no navigation elements, no sidebar content.
  4. Inspect response headers. Open browser dev tools (Network tab), reload the .md URL, and confirm X-Markdown-Tokens is present in the response headers.

You can also verify with curl to see headers and content together:

curl -i https://yoursite.com/any-published-post.md

Expected response headers:

HTTP/1.1 200 OK
Content-Type: text/markdown; charset=utf-8
Vary: Accept
X-Markdown-Tokens: 847
X-Citelayer-Version: 0.3.1
Content-Signal: ai-train=no, search=yes, ai-input=yes
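To automate this check, for example in a deployment smoke test, the headers can be validated with a short script. The sketch below works on a plain mapping of header names to values, so it can be fed from any HTTP client; the function names and the specific checks are assumptions chosen for illustration.

```python
from typing import Dict, List, Mapping


def parse_content_signal(value: str) -> Dict[str, str]:
    """Turn "ai-train=no, search=yes" into {"ai-train": "no", "search": "yes"}."""
    signals: Dict[str, str] = {}
    for item in value.split(","):
        key, separator, setting = item.strip().partition("=")
        if separator:
            signals[key.strip()] = setting.strip()
    return signals


def check_markdown_headers(headers: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the headers look right."""
    lowered = {name.lower(): value for name, value in headers.items()}
    problems: List[str] = []
    if not lowered.get("content-type", "").lower().startswith("text/markdown"):
        problems.append("Content-Type is not text/markdown")
    vary = [token.strip().lower() for token in lowered.get("vary", "").split(",")]
    if "accept" not in vary:
        problems.append("Vary does not include Accept")
    if not lowered.get("x-markdown-tokens", "").isdigit():
        problems.append("X-Markdown-Tokens is missing or not a number")
    return problems


expected = {
    "Content-Type": "text/markdown; charset=utf-8",
    "Vary": "Accept",
    "X-Markdown-Tokens": "847",
    "Content-Signal": "ai-train=no, search=yes, ai-input=yes",
}
print(check_markdown_headers(expected))
print(parse_content_signal(expected["Content-Signal"]))
```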

Technical Details

Caching

Citelayer® caches each Markdown conversion as a WordPress transient:

  • Transient key format: citelayer_markdown_{post_id}_{modified_timestamp}
  • TTL: 24 hours
  • Auto-invalidation: Because the post’s last-modified timestamp is part of the cache key, editing a post automatically creates a new cache entry. The old transient expires naturally after 24 hours.

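This design means an edit is never masked by an old cache entry: the moment you update a post, the next Markdown request generates a fresh conversion. Changes that do not update the post's last-modified timestamp can still be served from cache until the 24-hour TTL expires.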
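The effect of putting the timestamp in the key can be seen in a small model. This Python sketch stands in for the transient store with a dictionary and omits the 24-hour TTL; the class name is hypothetical, and representing the modified timestamp as an integer is an assumption.

```python
from typing import Callable, Dict


def transient_key(post_id: int, modified_timestamp: int) -> str:
    """Build the key in the documented citelayer_markdown_{id}_{timestamp} form."""
    return f"citelayer_markdown_{post_id}_{modified_timestamp}"


class MarkdownCache:
    """In-memory stand-in for the transient store (TTL omitted for brevity)."""

    def __init__(self, convert: Callable[[str], str]) -> None:
        self._convert = convert
        self._store: Dict[str, str] = {}

    def get(self, post_id: int, modified_timestamp: int, html: str) -> str:
        key = transient_key(post_id, modified_timestamp)
        if key not in self._store:
            # Cache miss: a new or edited post triggers a fresh conversion.
            self._store[key] = self._convert(html)
        return self._store[key]


print(transient_key(42, 1771891200))
```

Editing the post changes the timestamp, so the lookup uses a different key and the old entry is simply never read again.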

Rewrite Rule

The .md extension is handled by a WordPress rewrite rule:

(.+)\.md$ → index.php?pagename=$matches[1]&citelayer_format=markdown

This rule is registered on plugin activation. If .md URLs return 404 after installation, visit Settings → Permalinks and click Save Changes to flush rewrite rules.
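To see what the rule captures, the pattern can be tried out in Python, with the dot escaped so it matches a literal period. This only mirrors the matching step for illustration; the function name is hypothetical, and it assumes the request path is matched without its leading slash, as WordPress does for rewrite rules.

```python
import re
from typing import Optional

MD_RULE = re.compile(r"(.+)\.md$")


def rewrite(request_path: str) -> Optional[str]:
    """Map a request path to the internal query string, or None if no match."""
    match = MD_RULE.match(request_path)
    if match is None:
        return None
    return f"index.php?pagename={match.group(1)}&citelayer_format=markdown"


print(rewrite("getting-started.md"))
```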

How citelayer.ai Uses This

The Citelayer® website at citelayer.ai uses Citelayer® itself. Every documentation page, blog post, and feature page is available as Markdown. Try it: append .md to any page on citelayer.ai to see the output firsthand.

  • llms.txt — The discovery file that links to your Markdown index
  • UCP Discovery — Machine-readable capability declarations for your site
  • Schema REST API — Structured data endpoints that complement Markdown content