Scrapling is an open-source Python web scraping framework that combines HTTP fetching, anti-bot bypass, browser automation, and adaptive element tracking into one library. It handles Cloudflare Turnstile out of the box using Camoufox (a patched Firefox), relocates elements automatically when a site redesigns its HTML, and runs concurrent crawls with a Scrapy-like Spider API. One pip install replaces your stack of requests + BeautifulSoup + Selenium + undetected-chromedriver.
Repo: github.com/D4Vinci/Scrapling
The problem nobody talks about
Web scraping in Python has a fragmentation problem. You need requests for HTTP, BeautifulSoup or lxml for parsing, Selenium or Playwright for JS-rendered pages, and a separate tool like cloudscraper or undetected-chromedriver for anti-bot protection. Every project assembles its own Frankenstein stack. When a site changes its HTML structure, your selectors break silently and you find out in production.
The Cloudflare problem is worse. Turnstile detection has gotten aggressive enough that cloudscraper fails on most protected sites as of 2025. The workarounds (residential proxies, headless browser farms) cost real money for what should be a simple GET request.
Why this changes everything
Adaptive element tracking. Scrapling fingerprints DOM elements using multiple signals (tag, attributes, text, position, siblings). When a site changes its layout, Scrapling relocates the element you were targeting without you updating a single selector. This isn't theoretical. It works across real redesigns where class names, IDs, and nesting all change simultaneously.
Cloudflare bypass built in. StealthyFetcher uses Camoufox under the hood, a custom-patched Firefox that passes fingerprint checks. No residential proxies needed for most Cloudflare-protected sites. You call StealthyFetcher.fetch(url) and it handles the challenge page, waits for the token, and returns the real HTML.
Four fetchers, one API. Fetcher for simple HTTP. StealthyFetcher for anti-bot. DynamicFetcher for full Playwright browser automation. Spider for concurrent multi-page crawls. All four return the same Adaptor object, so your parsing code works identically regardless of how you fetched the page.
Performance. 5.7x faster than Scrapy's parsel on element selection benchmarks (author's claim). Built on lxml with a custom selector engine rather than wrapping BeautifulSoup. Take the number with appropriate skepticism on real-world workloads, but the library is genuinely fast.
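To make the adaptive tracking idea more concrete, here is a toy sketch of multi-signal matching. It is not Scrapling's actual algorithm: the Fingerprint class, the equal weighting of signals, and the use of difflib are all assumptions made for illustration. It only shows why scoring several signals together survives a redesign that breaks any single selector.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Fingerprint:
    """Hypothetical snapshot of the signals used to recognise an element later."""
    tag: str
    attributes: dict = field(default_factory=dict)
    text: str = ""
    path: tuple = ()  # ancestor tag names, root first


def _ratio(a, b) -> float:
    """Similarity in [0, 1] between two strings or two tuples."""
    return SequenceMatcher(None, a, b).ratio()


def similarity(saved: Fingerprint, candidate: Fingerprint) -> float:
    """Average four independent signals so no single change breaks the match."""
    tag_score = 1.0 if saved.tag == candidate.tag else 0.0

    # Compare attribute values key by key; a missing key counts as "".
    keys = set(saved.attributes) | set(candidate.attributes)
    if keys:
        attr_score = sum(
            _ratio(saved.attributes.get(k, ""), candidate.attributes.get(k, ""))
            for k in keys
        ) / len(keys)
    else:
        attr_score = 1.0

    text_score = _ratio(saved.text, candidate.text)
    path_score = _ratio(saved.path, candidate.path)
    return (tag_score + attr_score + text_score + path_score) / 4


def relocate(saved: Fingerprint, candidates: list) -> Fingerprint:
    """Pick the candidate that looks most like the saved element."""
    return max(candidates, key=lambda c: similarity(saved, c))
```

In this sketch a price element whose class changes from "price" to "product-price" and whose parent chain gains a wrapper still scores far higher than an unrelated navigation link, because the tag and text signals are unchanged.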
Step 1: install
Basic install (parsing + simple HTTP only):

    pip install scrapling

Full install with all fetchers (StealthyFetcher, DynamicFetcher, Spider):

    pip install "scrapling[fetchers]"
    scrapling install

The scrapling install command downloads Camoufox and Playwright browsers. Budget ~500MB of disk space and a minute or two on first run. Python 3.9+ required. Works on Mac, Linux, Windows.
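If you want to confirm the setup before writing any scraping code, a small preflight check along these lines may help. The check_environment helper is hypothetical (it is not part of Scrapling); it uses only the standard library, takes its minimum Python version from the requirement stated above, and only checks that the package is importable, not that the browsers were downloaded.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the install notes


def check_environment(min_python=MIN_PYTHON, package="scrapling"):
    """Report whether the interpreter is new enough and the package is importable."""
    return {
        "python_ok": sys.version_info >= min_python,
        # find_spec returns None when a top-level package cannot be found
        "package_installed": importlib.util.find_spec(package) is not None,
    }
```

Calling print(check_environment()) after installation should show both values as True.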
Step 2: basic scraping (Fetcher)
The simplest use case. Fetch a page, extract data with CSS selectors:
    from scrapling.fetchers import Fetcher

    page = Fetcher.get("https://quotes.toscrape.com/")
    quotes = page.css(".quote")
    for quote in quotes:
        text = quote.css_first(".text").text
        author = quote.css_first(".author").text
        print(f"{text} - {author}")

Fetcher impersonates a real browser by default (Chrome TLS fingerprint, realistic headers). No configuration needed. The response is an Adaptor object with .css(), .xpath(), .find(), .find_by_text(), and .find_similar() methods.
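Because every fetcher hands back the same kind of page object, the parsing step can live in a function that does not care how the page was fetched. The sketch below is one way to write that, using only the .css(), .css_first(), and .text calls shown above. The PageLike protocol and the extract_quotes name are hypothetical, and the None check assumes css_first returns None when nothing matches.

```python
from typing import Iterable, Protocol


class PageLike(Protocol):
    """Hypothetical structural type: anything exposing .css() as used above."""

    def css(self, selector: str) -> Iterable: ...


def extract_quotes(page: PageLike) -> list:
    """Return one {'text', 'author'} dict per .quote element on the page."""
    results = []
    for quote in page.css(".quote"):
        text_el = quote.css_first(".text")
        author_el = quote.css_first(".author")
        # Skip incomplete entries instead of crashing on a missing node
        if text_el is None or author_el is None:
            continue
        results.append({"text": str(text_el.text), "author": str(author_el.text)})
    return results
```

With this split, switching a target from plain HTTP to a browser-based fetcher means changing only the line that fetches the page; extract_quotes(page) stays the same.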
For multiple requests to the same domain, use a session:
    from scrapling.fetchers import FetcherSession

    with FetcherSession(impersonate="chrome") as session:
        page1 = session.get("https://example.com/page/1")
        page2 = session.get("https://example.com/page/2")

Sessions persist cookies and reuse pooled connections across requests.
Step 3: bypassing Cloudflare (StealthyFetcher)
This is the headline feature. For Cloudflare-protected sites:
    from scrapling.fetchers import StealthyFetcher

    # Browser-based fetchers expose .fetch() rather than .get()
    page = StealthyFetcher.fetch("https://cloudflare-protected-site.com/")
    print(page.status)  # 200
    # .css() returns a list of matches, so take the first element before reading .text
    print(page.css_first("h1").text)

StealthyFetcher launches Camoufox (patched Firefox), solves the Turnstile challenge automatically, waits for the page to fully render, and returns the HTML. The first request is slow (~5-10 seconds for the challenge solve). Subsequent requests to the same domain are faster because the session token persists.
You can also pass proxies:
    page = StealthyFetcher.fetch(
        "https://protected-site.com/",
        proxy="http://user:pass@proxy:8080",
    )
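If you have several proxies, a thin wrapper can rotate through them when one fails. The following is a minimal sketch, not a Scrapling feature: fetch_with_rotation is a hypothetical helper, and it assumes only that the callable you pass accepts a proxy keyword and returns an object with a .status attribute, as in the examples above.

```python
from itertools import cycle


def fetch_with_rotation(url, proxies, fetch, max_attempts=3):
    """Call fetch(url, proxy=...) with successive proxies until one returns HTTP 200."""
    if not proxies:
        raise ValueError("at least one proxy is required")

    pool = cycle(proxies)  # wrap around if max_attempts exceeds the proxy count
    last_error = None
    for _ in range(max_attempts):
        proxy = next(pool)
        try:
            page = fetch(url, proxy=proxy)
        except Exception as exc:  # connection errors, timeouts, and so on
            last_error = exc
            continue
        if page.status == 200:
            return page
    raise RuntimeError(f"no proxy succeeded for {url}") from last_error
```

To use it, wrap the StealthyFetcher call shown above in a function or lambda that takes (url, proxy=...) and pass that as fetch. Keeping the fetcher as a parameter also makes the rotation logic easy to test without touching the network.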