Content API

The Content API extracts structured data from any webpage: metadata, links, and optionally the raw HTML, a Markdown version of the page, and sitemap data.

Endpoint

https://api.capturekit.dev/content

Example Request

GET https://api.capturekit.dev/content?access_key=<your-access-key>&url=https://capturekit.dev
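The request URL above can be assembled programmatically. The following is a minimal sketch, not an official client: the helper name build_content_url is hypothetical, and sending booleans as lowercase "true"/"false" is an assumption that this page does not confirm.

```python
from urllib.parse import urlencode

CONTENT_ENDPOINT = "https://api.capturekit.dev/content"


def build_content_url(access_key, url, **options):
    """Return a GET URL for the Content API (hypothetical helper)."""
    params = {"access_key": access_key, "url": url}
    for name, value in options.items():
        # Assumption: boolean flags are sent as lowercase "true"/"false".
        if isinstance(value, bool):
            value = "true" if value else "false"
        params[name] = value
    # urlencode percent-encodes the target URL so it survives as a query value.
    return CONTENT_ENDPOINT + "?" + urlencode(params)
```

The resulting string can be passed to any HTTP client, for example urllib.request.urlopen from the standard library.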

Response

{
	"success": true,
	"data": {
		"metadata": {
			"title": "CaptureKit - Turn any website into a screenshot with our powerful Screenshot API",
			"description": "CaptureKit is a powerful API for capturing screenshots, extracting HTML, gathering links, and summarizing content—all with a simple request.",
			"favicon": "https://capturekit.dev/favicon.ico",
			"ogImage": "https://capturekit-assets.s3.amazonaws.com/capturekit-og+(1).png"
		},
		"links": {
			"internal": [
				"https://capturekit.dev/",
				"https://capturekit.dev/dashboard",
				"https://capturekit.dev/pricing",
				"https://capturekit.dev/blog"
			],
			"external": [
				"https://docs.capturekit.dev",
				"https://zapier.com/apps/capturekit-website-screenshots-p/integrations",
				"https://www.nextupkit.com"
			],
			"social": [
				"https://github.com/CaptureKit-Web-Scraping-API",
				"https://x.com/capturekit"
			]
		},
		"html": "<html><body><h1>Hello, world!</h1></body></html>",
		"markdown": "CaptureKit - Turn any website into a screenshot with our powerful Screenshot API...",
		"sitemap": {
			"source": "https://capturekit.dev/sitemap.xml",
			"totalLinks": 3,
			"links": [
				"https://www.capturekit.dev/",
				"https://www.capturekit.dev/page-content",
				"https://www.capturekit.dev/ai"
			]
		}
	}
}
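As an illustration of consuming this response, the sketch below flattens the three link groups into one de-duplicated list. The function name collect_links is hypothetical, and treating a false "success" value as an error is an assumption, since this page does not show the shape of a failed response.

```python
def collect_links(payload):
    """Return all internal, external, and social links, in order, without duplicates."""
    # Assumption: a failed request is signalled by "success" being false.
    if not payload.get("success"):
        raise ValueError("Content API request was not successful")
    links = payload.get("data", {}).get("links", {})
    collected = []
    for group in ("internal", "external", "social"):
        for link in links.get(group, []):
            if link not in collected:
                collected.append(link)
    return collected
```

Using .get with defaults keeps the helper tolerant of responses in which a group is absent.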

Parameters

url string Required
The URL of the webpage to capture.


access_key string Required
Your API access key. Can be provided via the access_key query parameter, x-access-key header, or request body.
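Passing the key in the x-access-key header keeps it out of the URL, and therefore out of most access logs. A minimal sketch using only the Python standard library follows; the helper name build_header_auth_request is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_header_auth_request(access_key, url):
    """Build a GET request that carries the key in the x-access-key header."""
    query = urlencode({"url": url})
    return Request(
        "https://api.capturekit.dev/content?" + query,
        headers={"x-access-key": access_key},
        method="GET",
    )
```

The returned object can be sent with urllib.request.urlopen. HTTP header names are case-insensitive, so the capitalization that urllib applies internally does not change the request's meaning.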


include_html boolean Optional Defaults to false
Include the raw HTML of the webpage in the response.


include_markdown boolean Optional Defaults to false
Include the Markdown of the webpage in the response.


include_sitemap boolean Optional Defaults to false
Include sitemap data in the response. If url points directly to a sitemap, that sitemap is used; otherwise the API tries to locate the sitemap for the website at url.


delay number Optional Defaults to 0
Delay in seconds before capturing the page (maximum 10).


wait_until string Optional
The page event to wait for before capturing. One of: networkidle2, load, domcontentloaded, networkidle0.
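The documented limits for delay and wait_until can be checked on the client before a request is sent. This is a sketch under assumptions: timing_params is a hypothetical helper, and the API may perform its own validation with different error behavior.

```python
ALLOWED_WAIT_UNTIL = {"networkidle2", "load", "domcontentloaded", "networkidle0"}
MAX_DELAY_SECONDS = 10


def timing_params(delay=0, wait_until=None):
    """Validate delay and wait_until and return them as query parameters."""
    if not 0 <= delay <= MAX_DELAY_SECONDS:
        raise ValueError("delay must be between 0 and 10 seconds")
    params = {"delay": delay}
    if wait_until is not None:
        if wait_until not in ALLOWED_WAIT_UNTIL:
            raise ValueError("unsupported wait_until value: " + wait_until)
        params["wait_until"] = wait_until
    return params
```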


wait_for_selector string Optional
Wait for a specific element to appear before capturing the page.


selector string Optional
Capture a specific element on the page instead of the full page.


remove_selectors string Optional
A comma-separated list of selectors for elements to hide before capturing (e.g., ads, popups).
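Because remove_selectors takes a single comma-separated string, a list of selectors has to be joined before it is sent. A small sketch follows; join_selectors is a hypothetical helper, and the example selectors are illustrative only.

```python
def join_selectors(selectors):
    """Join selectors into the comma-separated form used by remove_selectors."""
    # Strip surrounding whitespace and skip empty entries.
    cleaned = [s.strip() for s in selectors if s.strip()]
    return ",".join(cleaned)
```

For example, join_selectors([".ad-banner", "#popup"]) yields ".ad-banner,#popup". A selector that itself contains a comma would be ambiguous in this format, so such selectors are best split into separate entries.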


proxy string Optional
Specify a proxy server to route your request through. Supports HTTP, HTTPS, and SOCKS5 proxies. Format: http://username:password@proxy.com:PORT. Useful for bypassing geo-restrictions and rotating IPs.
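Proxy credentials that contain characters such as "@" or ":" would break the format above unless they are percent-encoded. The sketch below builds the proxy string that way; build_proxy_url is a hypothetical helper, and whether the API decodes percent-encoded credentials is an assumption to verify against a real proxy.

```python
from urllib.parse import quote

SUPPORTED_SCHEMES = {"http", "https", "socks5"}


def build_proxy_url(scheme, username, password, host, port):
    """Return scheme://username:password@host:port with encoded credentials."""
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError("unsupported proxy scheme: " + scheme)
    # safe="" forces reserved characters such as "@" and ":" to be encoded.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return "{}://{}:{}@{}:{}".format(scheme, user, secret, host, port)
```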


cache boolean Optional Defaults to false
Cache the response.


cache_ttl number Optional Defaults to 2592000
Cache the response for a custom TTL (in seconds). Maximum 2592000 seconds (30 days), minimum 3600 seconds (1 hour).
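The TTL bounds can be enforced before the request is made. This is a minimal sketch: cache_params is a hypothetical helper, and both sending cache as the string "true" and the need to set cache together with cache_ttl are assumptions not confirmed by this page.

```python
MIN_CACHE_TTL = 3600      # 1 hour
MAX_CACHE_TTL = 2592000   # 30 days, the documented default and maximum


def cache_params(ttl_seconds=MAX_CACHE_TTL):
    """Return query parameters that enable caching with a validated TTL."""
    if not MIN_CACHE_TTL <= ttl_seconds <= MAX_CACHE_TTL:
        raise ValueError("cache_ttl must be between 3600 and 2592000 seconds")
    # Assumption: cache must be enabled for cache_ttl to take effect.
    return {"cache": "true", "cache_ttl": ttl_seconds}
```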


remove_cookie_banners boolean Optional Defaults to false
Automatically remove cookie banners before capturing.


viewport_width number Optional Defaults to 1280
The width of the browser viewport in pixels.


viewport_height number Optional Defaults to 1024
The height of the browser viewport in pixels.