Content API
The Content API allows you to extract structured data from any webpage, including metadata, links, and raw HTML.
Endpoint
https://api.capturekit.dev/content
Example Request
GET https://api.capturekit.dev/content?access_key=<your-access-key>&url=https://capturekit.dev
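The request above can be sketched in Python with only the standard library. The helper names `build_content_url` and `fetch_content` are hypothetical, and sending booleans as lowercase `true`/`false` is an assumption about the expected encoding. Unlike the raw example, `urlencode` percent-encodes the target URL, which keeps target URLs that carry their own query strings intact.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ENDPOINT = "https://api.capturekit.dev/content"


def build_content_url(access_key: str, url: str, **options) -> str:
    """Return the full GET URL for the Content API (hypothetical helper)."""
    params = {"access_key": access_key, "url": url}
    for name, value in options.items():
        # Assumption: booleans are sent as lowercase "true"/"false".
        params[name] = str(value).lower() if isinstance(value, bool) else value
    return f"{ENDPOINT}?{urlencode(params)}"


def fetch_content(access_key: str, url: str, **options) -> dict:
    """Send the request and decode the JSON body (hypothetical helper)."""
    with urlopen(build_content_url(access_key, url, **options), timeout=30) as resp:
        return json.load(resp)
```

Calling `fetch_content("<your-access-key>", "https://capturekit.dev", include_html=True)` would perform the request with the raw HTML included.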
Response
{
"success": true,
"data": {
"metadata": {
"title": "CaptureKit - Turn any website into a screenshot with our powerful Screenshot API",
"description": "CaptureKit is a powerful API for capturing screenshots, extracting HTML, gathering links, and summarizing content—all with a simple request.",
"favicon": "https://capturekit.dev/favicon.ico",
"ogImage": "https://capturekit-assets.s3.amazonaws.com/capturekit-og+(1).png"
},
"links": {
"internal": [
"https://capturekit.dev/",
"https://capturekit.dev/dashboard",
"https://capturekit.dev/pricing",
"https://capturekit.dev/blog"
],
"external": [
"https://docs.capturekit.dev",
"https://zapier.com/apps/capturekit-website-screenshots-p/integrations",
"https://www.nextupkit.com"
],
"social": [
"https://github.com/CaptureKit-Web-Scraping-API",
"https://x.com/capturekit"
]
},
"html": "<html><body><h1>Hello, world!</h1></body></html>",
"markdown": "CaptureKit - Turn any website into a screenshot with our powerful Screenshot API...",
"sitemap": {
"source": "https://capturekit.dev/sitemap.xml",
"totalLinks": 3,
"links": [
"https://www.capturekit.dev/",
"https://www.capturekit.dev/page-content",
"https://www.capturekit.dev/ai"
]
}
}
}
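A response shaped like the one above can be reduced to a quick summary. This is a minimal sketch: `summarize_content` is a hypothetical helper, the error response format is not documented here, so treating a falsy `success` as a failure is an assumption, and optional fields are read defensively because they may be absent.

```python
def summarize_content(response: dict) -> dict:
    """Summarize a decoded Content API response (hypothetical helper)."""
    # Assumption: a falsy "success" flag signals a failed request.
    if not response.get("success"):
        raise ValueError("Content API reported a failure")
    data = response.get("data", {})
    links = data.get("links", {})
    return {
        "title": data.get("metadata", {}).get("title"),
        # Count links per category shown in the sample response.
        "link_counts": {
            kind: len(links.get(kind, []))
            for kind in ("internal", "external", "social")
        },
        "has_html": "html" in data,
        "has_markdown": "markdown" in data,
        "sitemap_links": len(data.get("sitemap", {}).get("links", [])),
    }
```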
Parameters
url
string Required
The URL of the webpage to capture.
access_key
string Required
Your API access key. Can be provided via the access_key query parameter, the x-access-key header, or the request body.
include_html
boolean Optional Defaults to false
Include the raw HTML of the webpage in the response.
include_markdown
boolean Optional Defaults to false
Include the Markdown of the webpage in the response.
use_defuddle
boolean Optional Defaults to false
Use Defuddle to clean the page and extract its main content. The library removes non-essential elements such as comments, sidebars, headers, and footers, leaving only the primary content. When enabled, the HTML in the response is processed through Defuddle before being returned.
delay
number Optional Defaults to 0
Delay in seconds before capturing the page content (max 10s).
wait_until
string Optional
Define when to capture (networkidle2, load, domcontentloaded, networkidle0).
wait_for_selector
string Optional
Wait for a specific element to appear before capturing the page content.
selector
string Optional
Capture a specific element on the page instead of the full viewport.
remove_selectors
string Optional
A comma-separated list of elements to hide before capturing (e.g., ads, popups).
block_urls
string Optional
Comma-separated list of URL patterns to block (e.g., “analytics,tracking,advertisement”). You can specify URLs, domains, or simple patterns like “.example.com/”.
block_resources
string Optional
Comma-separated list of resource types to block (e.g., “image,stylesheet,font”). Available resource types: document, stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest, other. Useful for optimizing page loading speed before capturing web content.
proxy
string Optional
Specify a proxy server to route your request through. Supports HTTP, HTTPS, and SOCKS5 proxies. Format: http://username:password@proxy.com:PORT. Useful for bypassing geo-restrictions and rotating IPs.
cache
boolean Optional Defaults to false
Cache the response.
cache_ttl
number Optional Defaults to 2592000
Cache the response for a custom TTL (in seconds). Maximum 2592000 seconds (1 month), minimum 3600 seconds (1 hour).
remove_cookie_banners
boolean Optional Defaults to false
Automatically remove cookie banners before capturing.
viewport_width
number Optional Defaults to 1280
The width of the browser viewport in pixels.
viewport_height
number Optional Defaults to 1024
The height of the browser viewport in pixels.
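Several of the parameters above have documented limits (delay, wait_until, block_resources, cache_ttl), so a client may want to check them before sending a request. The following is a minimal sketch, not part of the API: `build_request` is a hypothetical helper, client-side validation is a design choice, and how the server reacts to out-of-range values is not stated here. The key is sent in the x-access-key header rather than the query string.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://api.capturekit.dev/content"
WAIT_UNTIL = {"networkidle2", "load", "domcontentloaded", "networkidle0"}
RESOURCE_TYPES = {
    "document", "stylesheet", "image", "media", "font", "script", "texttrack",
    "xhr", "fetch", "eventsource", "websocket", "manifest", "other",
}
MAX_DELAY = 10                    # seconds
MIN_TTL, MAX_TTL = 3600, 2592000  # 1 hour to 1 month, in seconds


def build_request(access_key, url, *, delay=0, wait_until=None,
                  block_resources=(), cache=False, cache_ttl=None) -> Request:
    """Validate options locally and return a GET request (hypothetical helper)."""
    params = {"url": url}
    if not 0 <= delay <= MAX_DELAY:
        raise ValueError(f"delay must be between 0 and {MAX_DELAY} seconds")
    if delay:
        params["delay"] = delay
    if wait_until is not None:
        if wait_until not in WAIT_UNTIL:
            raise ValueError(f"unknown wait_until value: {wait_until}")
        params["wait_until"] = wait_until
    unknown = set(block_resources) - RESOURCE_TYPES
    if unknown:
        raise ValueError(f"unknown resource types: {sorted(unknown)}")
    if block_resources:
        # The API expects a comma-separated string.
        params["block_resources"] = ",".join(block_resources)
    if cache:
        params["cache"] = "true"
    if cache_ttl is not None:
        if not MIN_TTL <= cache_ttl <= MAX_TTL:
            raise ValueError(f"cache_ttl must be between {MIN_TTL} and {MAX_TTL}")
        params["cache_ttl"] = cache_ttl
    # Keep the access key out of the URL by using the x-access-key header.
    return Request(f"{ENDPOINT}?{urlencode(params)}",
                   headers={"x-access-key": access_key})
```

The returned object can be passed to `urllib.request.urlopen`. Keeping the key in a header avoids leaking it into logs that record full URLs.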