Markdown API

Convert any webpage to clean, structured Markdown. The API strips navigation, ads, and boilerplate and returns only the content, which suits LLM ingestion, AI pipelines, and content processing.

API Version

v3.0 turns on auto_proxy, auto_render, and retry by default, so a proxy and rendering strategy is selected automatically for each target domain.

Endpoint

GET https://opengraph.io/api/3.0/markdown/{encoded_url}?app_id=YOUR_APP_ID

Parameters

Path Parameters

Parameter	Type	Description
encoded_url	string	Required. URL-encoded target URL
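
As a minimal sketch of the path-parameter encoding, the following Python builds the request URL with the standard library. The helper name `build_markdown_url` is hypothetical, and percent-encoding every reserved character (`safe=""`) is an assumption chosen to match the form of the example request in this document.

```python
from urllib.parse import quote, urlencode

# Base path taken from the Endpoint section of this document.
BASE = "https://opengraph.io/api/3.0/markdown/"


def build_markdown_url(target_url: str, app_id: str) -> str:
    """Return the request URL for a target page (hypothetical helper)."""
    # safe="" also percent-encodes ":" and "/", so the target URL
    # occupies a single path segment.
    encoded = quote(target_url, safe="")
    return f"{BASE}{encoded}?{urlencode({'app_id': app_id})}"


print(build_markdown_url("https://en.wikipedia.org/wiki/Web_scraping", "YOUR_APP_ID"))
```

Encoding the target with `safe=""` keeps any `?` or `&` inside the target URL from being read as part of the API's own query string.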

Query Parameters

Parameter	Type	Default	Description
app_id	string	-	Required. Your API key
only_main_content	boolean	true	Strip navigation, sidebars, and footers — return only the primary content
include_tags	string	-	Comma-separated HTML tags to include (e.g., `article,main,section`)
exclude_tags	string	-	Comma-separated HTML tags to exclude (e.g., `nav,footer,aside`)
cache_ok	boolean	true	Allow cached results
accept_lang	string	en-US	Language header for localized content
auto_proxy	boolean	true	Automatically select optimal proxy
auto_render	boolean	true	Automatically enable JS rendering when needed
retry	boolean	true	Auto-retry failed requests with proxy escalation
format	string	`markdown`	Response format. `markdown` returns raw text with `Content-Type: text/markdown`. `json` returns a structured JSON envelope with metadata, token estimate, truncation info, and AI safety output.
max_chars	integer	-	Optional character cap on the returned Markdown. Clamped to 500–1,000,000. When the output is truncated, raw mode sets `X-Content-Truncated: true` and `X-Content-Max-Chars`; JSON mode populates `usage.truncated` and `usage.max_characters`.
ai_sanitize	boolean	false	Run an AI safety pass before conversion to help detect and handle common prompt-injection patterns. Available on paid plans only. Free-tier requests with this flag set return a 402 response.
ai_sanitize_mode	string	`sanitize`	Controls what happens when suspicious content is detected. `sanitize` — return cleaned Markdown and a safety summary. `warn` — return original Markdown with risk details in headers or JSON. `block` — return a 422 error for high-risk pages.
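
The query parameters above could be assembled as in the sketch below. The helper names `clamp_max_chars` and `build_query` are hypothetical. Sending booleans as lowercase `true`/`false` is an assumption based on the `ai_sanitize=true` form used elsewhere in this document, and the client-side clamp only mirrors the documented 500–1,000,000 range so the caller can predict the effective cap; the server is described as applying it regardless.

```python
from urllib.parse import urlencode

MIN_CHARS, MAX_CHARS = 500, 1_000_000  # clamp range stated in the table


def clamp_max_chars(value: int) -> int:
    """Mirror the documented max_chars clamp on the client side."""
    return max(MIN_CHARS, min(MAX_CHARS, value))


def build_query(app_id: str, **options) -> str:
    """Encode query parameters (hypothetical helper)."""
    params = {"app_id": app_id}
    for name, value in options.items():
        if value is None:
            continue  # omit unset options so the server defaults apply
        if name == "max_chars":
            value = clamp_max_chars(int(value))
        if isinstance(value, bool):
            # Assumed wire format for booleans: lowercase true/false.
            value = "true" if value else "false"
        params[name] = value
    return urlencode(params)


print(build_query("YOUR_APP_ID", format="json", max_chars=100, ai_sanitize=True))
```

Options left as `None` are dropped from the query string, so the defaults listed in the table stay in effect.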

Example Request

curl "https://opengraph.io/api/3.0/markdown/https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FWeb_scraping?app_id=YOUR_APP_ID"

Raw Markdown Response (default)

When format=markdown (default), the response is plain text with Content-Type: text/markdown.

Response headers

Content-Type: text/markdown
X-BILLING-REQUESTS: 1
# Only present when max_chars was requested:
X-Content-Truncated: false
# Only present when truncation fired:
X-Content-Max-Chars: 5000
# Only present when ai_sanitize=true:
X-AI-Safety-Enabled: true
X-AI-Safety-Mode: sanitize
X-AI-Safety-Action: sanitized
X-AI-Safety-Risk-Score: 0.12
X-AI-Safety-Risk-Level: low
X-AI-Safety-Content-Modified: false
X-AI-Safety-Signals:
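
In raw mode, the truncation and safety details arrive only in headers, so a client may want to collect them into one structure. The sketch below does that for a plain mapping of header names to values. The function name `summarize_headers` is hypothetical, and treating `X-AI-Safety-Signals` as a comma-separated list is an assumption, since the example above shows only an empty value.

```python
from typing import Mapping


def summarize_headers(headers: Mapping[str, str]) -> dict:
    """Collect truncation and AI-safety details from raw-mode response headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {k.lower(): v.strip() for k, v in headers.items()}
    score = h.get("x-ai-safety-risk-score")
    signals = h.get("x-ai-safety-signals", "")
    return {
        "truncated": h.get("x-content-truncated") == "true",
        "max_chars": int(h["x-content-max-chars"]) if "x-content-max-chars" in h else None,
        "safety_enabled": h.get("x-ai-safety-enabled") == "true",
        "safety_action": h.get("x-ai-safety-action"),
        "risk_score": float(score) if score else None,
        "risk_level": h.get("x-ai-safety-risk-level"),
        # Assumed format: comma-separated signal names.
        "signals": [s.strip() for s in signals.split(",") if s.strip()],
    }


example = {
    "Content-Type": "text/markdown",
    "X-Content-Truncated": "false",
    "X-AI-Safety-Enabled": "true",
    "X-AI-Safety-Risk-Score": "0.12",
    "X-AI-Safety-Risk-Level": "low",
    "X-AI-Safety-Signals": "",
}
print(summarize_headers(example))
```

Headers that are absent map to `None`, `False`, or an empty list, which matches the "only present when" notes in the header listing.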

JSON Response (format=json)

{
  "success": true,
  "url": "https://example.com/article",
  "final_url": "https://example.com/article",
  "title": "Example Article Title",
  "description": "A short description from the page meta tags.",
  "canonical_url": "https://example.com/article",
  "metadata": {
    "image": "https://example.com/og-image.jpg",
    "site_name": "Example Site",
    "language": "en"
  },
  "markdown": "# Example Article\n\nMain content here...",
  "usage": {
    "characters": 4820,
    "tokens_estimated": 1205,
    "truncated": false
  },
  "extraction": {
    "only_main_content": true,
    "full_render": false,
    "auto_render": true,
    "auto_render_used": null,
    "proxy_used": false,
    "cache_hit": false
  },
  "ai_safety": {
    "enabled": false,
    "mode": null,
    "action": "not_run",
    "risk_score": null,
    "risk_level": null,
    "content_modified": false,
    "signals": [],
    "removed_elements_count": 0,
    "recommendation": null
  }
}
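
A consumer of the JSON envelope typically needs the `markdown` string plus a few fields from `usage` and `ai_safety`. The following is a minimal sketch of that extraction, using only field names shown in the response above; the function name `extract_markdown` and the choice of which fields to surface are illustrative assumptions. It requires Python 3.9 or later for the `tuple[...]` annotation.

```python
import json


def extract_markdown(body: str) -> tuple[str, dict]:
    """Return (markdown, info) from a format=json response body."""
    data = json.loads(body)
    if not data.get("success"):
        raise ValueError(f"request failed: {data.get('error')}")
    usage = data.get("usage", {})
    safety = data.get("ai_safety", {})
    info = {
        "title": data.get("title"),
        "final_url": data.get("final_url"),
        "tokens_estimated": usage.get("tokens_estimated"),
        "truncated": bool(usage.get("truncated")),
        "content_modified": bool(safety.get("content_modified")),
    }
    return data["markdown"], info


# Trimmed copy of the example response above.
sample = json.dumps({
    "success": True,
    "final_url": "https://example.com/article",
    "title": "Example Article Title",
    "markdown": "# Example Article\n\nMain content here...",
    "usage": {"characters": 4820, "tokens_estimated": 1205, "truncated": False},
    "ai_safety": {"enabled": False, "content_modified": False},
})
markdown, info = extract_markdown(sample)
print(info["tokens_estimated"], info["truncated"])
```

Checking `truncated` before passing the text to a model lets a pipeline decide whether to re-request with a larger `max_chars`.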

Plan Gate Error (402)

When ai_sanitize=true is sent by a free-tier caller, the endpoint returns a 402 before any fetching or conversion occurs:

402 Response

{
  "success": false,
  "feature": "ai_sanitize",
  "error": {
    "code": "ai_sanitize_plan_required",
    "message": "AI safety sanitization is available on paid plans."
  },
  "upgrade_required": true
}

The feature field is at the top level so client code can branch on it without parsing the error code string. Existing clients that only handle 400/401/403/422 should add 402 handling.
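
One way to add the 402 handling is sketched below: a small check that turns the plan-gate body into a typed exception carrying the top-level `feature` value. The names `PlanUpgradeRequired` and `check_plan_gate` are hypothetical, and the fallback strings used when a field is missing are assumptions, not documented behavior.

```python
import json


class PlanUpgradeRequired(Exception):
    """Raised when a response reports a plan-gated feature (hypothetical name)."""

    def __init__(self, feature: str, message: str):
        super().__init__(message)
        self.feature = feature


def check_plan_gate(status: int, body: str) -> None:
    """Raise PlanUpgradeRequired for a 402 response; otherwise do nothing."""
    if status != 402:
        return
    payload = json.loads(body)
    error = payload.get("error") or {}
    raise PlanUpgradeRequired(
        feature=payload.get("feature", "unknown"),
        message=error.get("message", "Payment required"),
    )


sample_402 = json.dumps({
    "success": False,
    "feature": "ai_sanitize",
    "error": {
        "code": "ai_sanitize_plan_required",
        "message": "AI safety sanitization is available on paid plans.",
    },
    "upgrade_required": True,
})
try:
    check_plan_gate(402, sample_402)
except PlanUpgradeRequired as exc:
    # Branch on the feature name rather than parsing the error code.
    print(exc.feature)
```

A caller that catches the exception with `feature == "ai_sanitize"` could, for example, retry the request without the flag or surface an upgrade prompt.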

Use Cases

LLM and AI content ingestion pipelines
RAG (Retrieval-Augmented Generation) data preparation
Content migration between platforms
Documentation scraping and archival
Clean text extraction for NLP processing
Safe ingestion of untrusted external pages with AI safety sanitization