How to Scrape Any Website with the Scrapling Agent
Arise · 2026-03-13 · 6 min read
Web Scraping in 2026 Is Broken — Until Now
Websites fight scrapers harder than ever. Cloudflare Turnstile, JavaScript-rendered SPAs, dynamic CSS selectors that change every deploy, anti-bot fingerprinting — a scraper that works today is broken next week.
Scrapling is an AI-powered web scraping agent that handles all of this automatically. You describe what you want extracted. It handles the rest — stealth mode, browser automation, adaptive selectors, session management.
What Scrapling Can Do
- Bypass Cloudflare and other bot detection systems automatically
- Scrape JavaScript-rendered SPAs (React, Vue, Angular) with full browser automation
- Adaptive selectors that survive website redesigns without breaking
- Session management for authenticated scraping behind login walls
- Multiple output formats — JSON, CSV, Markdown, plain text
- Scheduled scraping — run on a cron schedule, get results delivered
- Batch scraping — scrape hundreds of URLs in one command
Installation
Install the AgentPlace CLI, then install Scrapling:
```shell
curl -fsSL https://api.agentplace.sh/cli/install | bash
agentplace install scrapling
```
Basic Usage
Scrape a page and get the full content as Markdown:
```shell
agentplace run scrapling --url "https://example.com" --output markdown
```
Extract structured data with a CSS selector:
```shell
agentplace run scrapling \
  --url "https://news.ycombinator.com" \
  --selector ".titleline > a" \
  --output json
```
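To make the selector concrete, here is a stdlib-only Python sketch of roughly what `.titleline > a` matches: anchors inside an element carrying the class `titleline`. This is an illustration of the selector, not Scrapling's extraction engine, and the sample HTML is invented.

```python
from html.parser import HTMLParser

# Teaching sketch: collect href and text for <a> tags found inside an
# element whose class list contains "titleline". A real CSS engine would
# also enforce the direct-child relationship and track closing tags.
class TitleLinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.inside_titleline = 0  # >0 while inside a .titleline element
        self.capture = False       # True while inside a matching <a>
        self.results = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = attrs.get("class", "").split()
        if "titleline" in classes:
            self.inside_titleline += 1
        elif tag == "a" and self.inside_titleline:
            self.capture = True
            self.results.append({"href": attrs.get("href", ""), "text": ""})

    def handle_endtag(self, tag):
        if tag == "a" and self.capture:
            self.capture = False

    def handle_data(self, data):
        if self.capture:
            self.results[-1]["text"] += data

html = '<span class="titleline"><a href="https://example.com">Example story</a></span>'
parser = TitleLinkParser()
parser.feed(html)
print(parser.results)  # [{'href': 'https://example.com', 'text': 'Example story'}]
```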
Handling Cloudflare-Protected Sites
For sites protected by Cloudflare, Turnstile, or similar systems, enable stealth mode:
```shell
agentplace run scrapling \
  --url "https://protected-site.com" \
  --stealth \
  --dynamic \
  --output json
```
The `--stealth` flag activates advanced fingerprint spoofing and browser automation. The `--dynamic` flag waits for JavaScript to finish rendering before extracting.
Authenticated Scraping
For sites that require login, Scrapling handles session management:
```json
{
  "url": "https://app.example.com/dashboard",
  "session": {
    "login_url": "https://app.example.com/login",
    "credentials": {
      "username": "[email protected]",
      "password": "your-password"
    }
  },
  "selector": ".data-table",
  "output": "json"
}
```

Save this as config.json, then run:

```shell
agentplace run scrapling --config config.json
```
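Before handing the file to the CLI, a small pre-flight check can catch a malformed config early. The field names below mirror the example above; `validate_config` is our own hypothetical helper, not part of Scrapling.

```python
import json

# Hypothetical pre-flight validator for the config shape shown above.
# Scrapling may accept more fields; this only checks the ones in the example.
def validate_config(cfg: dict) -> list[str]:
    errors = []
    if not cfg.get("url", "").startswith("http"):
        errors.append("url must be an absolute http(s) URL")
    session = cfg.get("session")
    if session is not None:
        if "login_url" not in session:
            errors.append("session.login_url is required for authenticated scrapes")
        creds = session.get("credentials", {})
        if not creds.get("username") or not creds.get("password"):
            errors.append("session.credentials needs username and password")
    if cfg.get("output", "json") not in {"json", "csv", "markdown", "text"}:
        errors.append("output must be one of json, csv, markdown, text")
    return errors

cfg = {
    "url": "https://app.example.com/dashboard",
    "session": {
        "login_url": "https://app.example.com/login",
        "credentials": {"username": "[email protected]", "password": "your-password"},
    },
    "selector": ".data-table",
    "output": "json",
}
assert validate_config(cfg) == []
config_text = json.dumps(cfg, indent=2)  # write this string out as config.json
```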
Adaptive Selectors
Adaptive selectors are Scrapling's most powerful feature. On the first run, save the element's fingerprint:
```shell
agentplace run scrapling \
  --url "https://shop.example.com/products" \
  --selector ".product-price" \
  --adaptive-save \
  --output json
```
Next time the website redesigns and the selector breaks, Scrapling automatically relocates the element using its saved fingerprint. No more broken scrapers after site updates.
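To show the idea behind fingerprint relocation (this is a conceptual sketch, not Scrapling's actual algorithm), a fingerprint can be a snapshot of the element's tag, attributes, and text, and relocation can be a similarity search over the redesigned page's candidate elements:

```python
# Conceptual sketch: score each candidate element against a saved snapshot
# and pick the most similar one when the original selector no longer matches.
def similarity(fingerprint: dict, candidate: dict) -> float:
    score = 0.0
    if fingerprint["tag"] == candidate["tag"]:
        score += 1.0
    # shared attribute values (class names, ids, data-* attributes)
    score += len(set(fingerprint["attrs"]) & set(candidate["attrs"]))
    # rough text overlap: fraction of the old element's words still present
    old_words = set(fingerprint["text"].lower().split())
    new_words = set(candidate["text"].lower().split())
    if old_words:
        score += len(old_words & new_words) / len(old_words)
    return score

# Snapshot saved on the first run (what --adaptive-save might record)
fingerprint = {"tag": "span", "attrs": {"product-price"}, "text": "$19.99 USD"}

# After a redesign, .product-price is gone; score the new page's elements
candidates = [
    {"tag": "div", "attrs": {"nav-item"}, "text": "Home"},
    {"tag": "span", "attrs": {"price", "amount"}, "text": "$19.99 USD"},
    {"tag": "p", "attrs": {"description"}, "text": "A fine product"},
]
best = max(candidates, key=lambda c: similarity(fingerprint, c))
print(best["text"])  # $19.99 USD
```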
Scheduled Scraping
Set up automatic scraping on a schedule:
```shell
agentplace run scrapling \
  --url "https://competitor.com/pricing" \
  --selector ".price-table" \
  --output json \
  --schedule "0 9 * * 1" \
  --notify email
```
This runs every Monday at 9am and emails you the extracted data — perfect for competitor price monitoring.
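The schedule string is a standard five-field cron expression (minute, hour, day of month, month, day of week). A minimal matcher, written here only to illustrate what "0 9 * * 1" means, confirms it fires on Mondays at 9:00; real cron also supports ranges, lists, and steps.

```python
from datetime import datetime

# Minimal cron-field matcher for plain numeric fields and "*".
def matches(cron_expr: str, when: datetime) -> bool:
    fields = cron_expr.split()
    # cron counts day-of-week Sunday-first (0-6); Python's weekday() is 0=Monday
    actual = [when.minute, when.hour, when.day, when.month, (when.weekday() + 1) % 7]
    return all(f == "*" or int(f) == value for f, value in zip(fields, actual))

monday_9am = datetime(2026, 3, 16, 9, 0)   # 2026-03-16 is a Monday
print(matches("0 9 * * 1", monday_9am))                    # True
print(matches("0 9 * * 1", datetime(2026, 3, 17, 9, 0)))   # Tuesday: False
```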
Output Formats
| Format | Use Case | Flag |
|---|---|---|
| json | Structured data, APIs, databases | `--output json` |
| markdown | LLM input, documentation | `--output markdown` |
| csv | Spreadsheets, data analysis | `--output csv` |
| text | Simple extraction, logs | `--output text` |
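JSON composes well with downstream tooling. Assuming (not verified against Scrapling's docs) that `--output json` emits an array of flat objects, converting it to CSV takes a few lines of stdlib Python:

```python
import csv
import io
import json

# Sample records standing in for Scrapling's JSON output
records = json.loads('[{"title": "Post A", "price": "$5"}, {"title": "Post B", "price": "$7"}]')

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=records[0].keys())
writer.writeheader()
writer.writerows(records)
print(buf.getvalue())
```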
Scrapling vs Manual Scraping
| Approach | Setup time | Cloudflare bypass | Selector survival |
|---|---|---|---|
| Requests + BeautifulSoup | 30 min | No | No |
| Playwright/Puppeteer | 2-4 hours | Partial | No |
| ScrapingBee API | 10 min | Yes | No |
| Scrapling Agent | 2 min | Yes | Yes |
Tips for Best Results
- Start with
--dynamicfor any modern website — most use JavaScript rendering - Use
--adaptive-saveon first run of any recurring scrape to protect against redesigns - Test selectors in Chrome DevTools before passing to Scrapling
- Use JSON output when piping data into other tools or databases
- Rate limit with
--delay 2to avoid triggering aggressive bot detection
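As an alternative to a single batch command, the delay tip can also be applied from a driver script. This sketch invokes the CLI exactly as shown earlier, so it only works where `agentplace` is installed; `scrape_all` is our own wrapper, not part of the tool.

```python
import subprocess
import time

# Hypothetical driver loop: scrape each URL, pausing between requests to
# stay under bot-detection thresholds. `run` is injectable for testing.
def scrape_all(urls, delay_seconds=2, run=subprocess.run):
    for i, url in enumerate(urls):
        run(["agentplace", "run", "scrapling", "--url", url, "--output", "json"])
        if i < len(urls) - 1:
            time.sleep(delay_seconds)  # polite delay between requests
```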
Conclusion
Web scraping should not require hours of debugging anti-bot systems. Scrapling handles the infrastructure so you can focus on what to extract and what to do with the data.