OpenGraph.io

Content Extraction API

Extract specific HTML elements (titles, headings, paragraphs) in a structured, LLM-ready format. Feed clean web data directly into RAG pipelines, content analysis tools, or AI applications without running your own scraper infrastructure.

Endpoint

GET https://opengraph.io/api/1.1/extract/{encoded_url}?app_id=YOUR_APP_ID

Parameters

Path Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| encoded_url | string | Required. URL-encoded target URL |

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| app_id | string | - | Required. Your API key |
| html_elements | string | title,h1,h2,h3,h4,h5,p | Comma-separated list of HTML elements to extract |
| full_render | boolean | false | Enable JavaScript rendering |
| cache_ok | boolean | true | Allow cached results |
| use_proxy | boolean | false | Use standard proxy for protected sites |
| use_premium | boolean | false | Use residential proxy |
| use_superior | boolean | false | Use mobile proxy (highest success rate) |
| accept_lang | string | en-US | Language header for localized content |
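
As a rough illustration of how the path and query parameters fit together, the Python sketch below assembles a request URL. The helper name `build_extract_url` is hypothetical, and rendering booleans as lowercase `true`/`false` is an assumption based on the defaults listed in the table, not something this page states explicitly.

```python
from urllib.parse import quote, urlencode

BASE = "https://opengraph.io/api/1.1/extract"


def build_extract_url(target_url, app_id, **options):
    """Build an extract URL (hypothetical helper, not part of any official SDK)."""
    # The target URL is a path segment, so every reserved character is escaped.
    encoded_target = quote(target_url, safe="")

    params = {"app_id": app_id}
    for name, value in options.items():
        if isinstance(value, bool):
            # Assumption: the API accepts lowercase "true"/"false".
            value = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            # html_elements is a comma-separated list.
            value = ",".join(value)
        params[name] = value

    # Keep commas literal so html_elements stays readable.
    return f"{BASE}/{encoded_target}?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    print(build_extract_url(
        "https://example.com",
        "YOUR_APP_ID",
        html_elements=["title", "h1", "p"],
        full_render=True,
    ))
```

Passing options as keyword arguments keeps the helper independent of the exact parameter set, so a parameter added to the table later needs no code change.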

Supported HTML Elements

| Element | Description |
| --- | --- |
| title | Page title tag |
| h1, h2, h3, h4, h5, h6 | Heading elements |
| p | Paragraphs |
| a | Links |
| li | List items |
| span | Span elements |
| div | Div elements |

Example Request

curl "https://opengraph.io/api/1.1/extract/https%3A%2F%2Fexample.com?app_id=YOUR_APP_ID&html_elements=title,h1,p"
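
The same request can be made from Python using only the standard library. This is a minimal sketch rather than an official client: the function names `make_request` and `fetch_extract` and the 30-second timeout are illustrative choices, and error handling is left out.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def make_request(target_url, app_id, html_elements="title,h1,p"):
    """Prepare the GET request without sending it (hypothetical helper)."""
    url = (
        "https://opengraph.io/api/1.1/extract/"
        + quote(target_url, safe="")  # escape ':' and '/' in the target URL
        + f"?app_id={app_id}&html_elements={html_elements}"
    )
    return Request(url, method="GET")


def fetch_extract(target_url, app_id, timeout=30.0):
    """Send the request and decode the JSON body. Performs a network call."""
    with urlopen(make_request(target_url, app_id), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Replace YOUR_APP_ID with a real key before running.
    data = fetch_extract("https://example.com", "YOUR_APP_ID")
    print(data["concatenatedText"])
```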

Example Response

{
  "tags": [
    {
      "tag": "title",
      "innerText": "Example Domain",
      "position": 0
    },
    {
      "tag": "h1",
      "innerText": "Example Domain",
      "position": 1
    },
    {
      "tag": "p",
      "innerText": "This domain is for use in illustrative examples in documents.",
      "position": 2
    },
    {
      "tag": "p",
      "innerText": "You may use this domain in literature without prior coordination.",
      "position": 3
    }
  ],
  "concatenatedText": "Example Domain Example Domain This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination.",
  "requestInfo": {
    "host": "example.com",
    "responseCode": 200
  }
}

Response Fields

| Field | Description |
| --- | --- |
| tags | Array of extracted elements with tag name, text content, and position |
| concatenatedText | All extracted text joined together (useful for LLM summarization) |
| requestInfo | Metadata about the request |
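
To show how these fields might be consumed, the sketch below groups the entries of `tags` by element name while preserving document order. The key names (`tag`, `innerText`, `position`) are taken from the example response above; the helper `texts_by_tag` is hypothetical.

```python
from collections import defaultdict


def texts_by_tag(response):
    """Map each tag name to its texts, ordered by position (hypothetical helper)."""
    grouped = defaultdict(list)
    # Sort by position so texts appear in the order reported by the API.
    for item in sorted(response.get("tags", []), key=lambda t: t["position"]):
        grouped[item["tag"]].append(item["innerText"])
    return dict(grouped)


if __name__ == "__main__":
    # Trimmed copy of the example response.
    sample = {
        "tags": [
            {"tag": "title", "innerText": "Example Domain", "position": 0},
            {"tag": "h1", "innerText": "Example Domain", "position": 1},
            {"tag": "p", "innerText": "This domain is for use in illustrative examples in documents.", "position": 2},
        ]
    }
    for tag, texts in texts_by_tag(sample).items():
        print(tag, texts)
```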

LLM Tip: Use concatenatedText when feeding content to AI models for summarization. It provides clean text without HTML markup.
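
When `concatenatedText` is longer than a model's input limit, one option is to split it into chunks before summarizing. The sketch below is an assumption-laden illustration: it measures chunks in characters (model limits are usually expressed in tokens), and the helper `chunk_text` and its 2000-character default are hypothetical.

```python
import textwrap


def chunk_text(text, max_chars=2000):
    """Split text on whitespace into chunks of at most max_chars characters.

    Hypothetical helper; a word longer than max_chars is kept whole
    rather than broken in the middle.
    """
    return textwrap.wrap(
        text,
        width=max_chars,
        break_long_words=False,
        break_on_hyphens=False,
    )


if __name__ == "__main__":
    concatenated = (
        "Example Domain Example Domain This domain is for use in "
        "illustrative examples in documents. You may use this domain "
        "in literature without prior coordination."
    )
    for i, chunk in enumerate(chunk_text(concatenated, max_chars=60)):
        print(i, chunk)
```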

Use Cases

  • AI/LLM data pipelines – feed clean text to language models
  • Content analysis and summarization
  • SEO content auditing – check heading structure
  • Research and data collection
  • Automated reporting
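
As one illustration of the SEO auditing use case, the sketch below checks the heading structure in a `tags` array. The two rules it applies (exactly one h1, no skipped heading levels) are common conventions chosen for the example, not checks defined by this API, and `audit_headings` is a hypothetical helper.

```python
import re


def audit_headings(tags):
    """Return a list of heading-structure issues (hypothetical helper)."""
    ordered = sorted(tags, key=lambda t: t["position"])
    # Keep only h1..h6 and convert each to its numeric level.
    levels = [int(t["tag"][1]) for t in ordered if re.fullmatch(r"h[1-6]", t["tag"])]

    issues = []
    h1_count = levels.count(1)
    if h1_count != 1:
        issues.append(f"expected exactly one h1, found {h1_count}")
    # Flag jumps that skip a level, such as h1 followed directly by h3.
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            issues.append(f"heading level jumps from h{prev} to h{cur}")
    return issues


if __name__ == "__main__":
    tags = [
        {"tag": "h1", "innerText": "Guide", "position": 0},
        {"tag": "h3", "innerText": "Details", "position": 1},
    ]
    print(audit_headings(tags))
```

To audit a page this way, request the heading elements explicitly, for example with `html_elements=h1,h2,h3,h4,h5,h6`, since h6 is not in the default list.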

MCP Tool

This endpoint is available as the Extract Content tool in the OpenGraph MCP Server. Your AI assistant can extract elements directly without writing any code.
