Overview
ScrapeStack provides a simple REST API that fetches and renders web pages, returning clean HTML or JSON data. It handles JavaScript rendering, proxy rotation, and anti-bot measures automatically.
Base URL: https://api.scrapestack.dev
Core Capabilities
1. HTML Fetching
Retrieve raw HTML from any publicly accessible webpage.
2. JavaScript Rendering
Pages built with React, Vue, Angular, or other JavaScript frameworks are fully rendered before their content is returned.
3. Anti-Bot Handling
Uses header rotation, request throttling, and retry logic to get past basic blocking mechanisms.
Request Flow
Lifecycle:
1. Your app sends URL → 2. API receives request → 3. Loads page with JS rendering → 4. Extracts content → 5. Returns response
API Endpoint
GET https://api.scrapestack.dev/scrape?url={target_url}&apikey={your_key}&render={true/false}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | The target webpage URL to scrape |
| render | boolean | No (default: true) | Enable JavaScript rendering |
| apikey | string | Yes | Your API authentication key |
| timeout | integer | No (default: 30) | Request timeout in seconds |
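Target URLs often carry their own query strings, so they need to be percent-encoded before being passed as the `url` parameter. The following is a minimal sketch of building a request URL from the parameters above. The helper name `build_scrape_url` is hypothetical, and sending booleans as lowercase `true`/`false` is an assumption based on the endpoint template.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.scrapestack.dev/scrape"


def build_scrape_url(target_url, api_key, render=True, timeout=30):
    """Return a fully encoded request URL for the /scrape endpoint."""
    params = {
        "url": target_url,
        "apikey": api_key,
        # Assumption: booleans are sent as lowercase "true"/"false".
        "render": "true" if render else "false",
        "timeout": timeout,
    }
    # urlencode percent-encodes each value, so "&" and "?" inside the
    # target URL cannot be mistaken for API parameters.
    return f"{BASE_URL}?{urlencode(params)}"


print(build_scrape_url("https://example.com/search?q=a&b=1", "YOUR_KEY", render=False))
```

Without encoding, the `&b=1` in the target URL above would be read as a separate API parameter rather than as part of the page address.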
Response Structure
{
"status": "success",
"url": "https://example.com",
"html": "...",
"headers": {
"content-type": "text/html",
"content-length": "1256"
},
"status_code": 200,
"timestamp": "2024-01-01T00:00:00Z"
}
| Field | Description |
|---|---|
| status | "success" or "error" |
| url | The requested URL |
| html | The rendered HTML content |
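| headers | HTTP response headers (e.g. content-type, content-length) |
| status_code | HTTP status code of the request |
| timestamp | ISO 8601 UTC timestamp |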
| error | Error message (if status is "error") |
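Because failures are reported through the `status` and `error` fields, client code should check `status` before reading `html`. Here is one possible sketch. The names `ScrapeError` and `extract_html` are hypothetical, and the code assumes that `html` is always present when `status` is "success".

```python
import json


class ScrapeError(Exception):
    """Raised when the API reports a status other than "success"."""


def extract_html(payload):
    """Return the rendered HTML from a decoded response body, or raise."""
    if payload.get("status") != "success":
        # "error" carries the message when status is "error".
        raise ScrapeError(payload.get("error", "unknown error"))
    return payload["html"]


# A trimmed-down sample body using the fields from the table above.
sample = json.loads(
    '{"status": "success", "url": "https://example.com",'
    ' "html": "<h1>Hi</h1>", "status_code": 200}'
)
print(extract_html(sample))
```

Raising an exception keeps error handling in one place, so callers never process an empty or missing `html` field by accident.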
Code Examples
cURL
curl -G "https://api.scrapestack.dev/scrape" --data-urlencode "url=https://example.com" --data-urlencode "apikey=YOUR_KEY"
JavaScript (Node.js)
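// Uses the built-in fetch in Node.js 18+; on older versions, install node-fetch v2.
async function main() {
  // URLSearchParams percent-encodes the target URL for us.
  const params = new URLSearchParams({ url: 'https://example.com', apikey: 'YOUR_KEY' });
  const response = await fetch(`https://api.scrapestack.dev/scrape?${params}`);
  const data = await response.json();
  console.log(data.html);
}
// await is not allowed at the top level of a CommonJS script, hence the wrapper.
main().catch(console.error);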
Python
import requests
response = requests.get(
'https://api.scrapestack.dev/scrape',
params={'url': 'https://example.com', 'apikey': 'YOUR_KEY'}
)
data = response.json()
print(data['html'])
Limitations (Transparent & Honest)
Please Note:
• No guarantee against all anti-bot systems (Cloudflare, etc.)
• Rate limits apply (100 requests/hour on free tier)
• Very large pages (>5MB) may time out
• Some sites require specific headers we don't support yet
• Not for bypassing authentication or accessing private data
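To stay within the free-tier rate limit listed above, a client can throttle itself before sending requests. The sketch below is one way to do that with a sliding window. The class name `HourlyLimiter` is hypothetical, and whether the server counts requests over a sliding or a fixed hourly window is not stated here, so treat the window model as an assumption.

```python
import time
from collections import deque


class HourlyLimiter:
    """Sliding-window limiter: at most `limit` calls per `window` seconds."""

    def __init__(self, limit=100, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.calls = deque()  # send times of recent requests, oldest first

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            return 0.0
        # The window is full: wait until the oldest call expires.
        return self.window - (now - self.calls[0])

    def record(self):
        """Register that a request was just sent."""
        self.calls.append(self.clock())


limiter = HourlyLimiter()
print(limiter.wait_time())
```

Before each request, call `time.sleep(limiter.wait_time())`, send the request, then call `limiter.record()`. The injectable `clock` argument exists only to make the limiter easy to test without real delays.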
Roadmap
- Proxy rotation pools (Q1 2025)
- CAPTCHA solving integration (Q2 2025)
- Structured data extraction (JSON-LD, microdata)
- Scheduled scraping jobs
- Webhook delivery