Agents vs. Cloudflare: A Python library for not arousing any suspicion
`nothingtoseehere` allows web agents to move the mouse and type just as imperfectly as a human
I’ve spent all weekend battling Cloudflare. Why? Well, I’m building an agentic business partner that gives local business owners superpowers by managing all their digital tools through a simple, conversational interface.
This means our agents need to be able to download today’s POS orders or view this week’s staff schedule. We do this with the business owner’s explicit consent, and they invite our agent to their POS as an employee, so this is all above board. If you’re reading this because you’re trying to scrape data without people’s consent, or build bots to troll people on the internet, politely f**k off.
This isn’t just a problem for us. 2025 was the year of agents; unfortunately, while the tech advanced a long way, practical applications beyond Manus ordering my groceries for me have failed to catch on. My 2026 prediction is that this is the year we put this game-changing technology to use, and, without being dramatic, the year we build a new way of interfacing with technology.
So, how does Cloudflare (and other bot detection) work?
I don’t know. Well, at least I don’t know exactly. They’d be silly to publish exactly how their algorithm works, but they & others have published many architecture & engineering blogs, there are several research papers on the subject, and there are whole forums of web scrapers sharing their hypotheses. They all agree that one of the most important indicators of whether an end user is a human or a bot/agent is mouse movement and keyboard input.
I was surprised to find that, despite mouse movements and keyboard input being so critical to the whole agentic action piece, there were no up-to-date, maintained libraries that solved this issue. Everyone and their nan is using Bézier curves, but it’s clear it’d take Cloudflare exactly 3 business seconds to figure out that a perfect Bézier curve traversed at a consistent speed is probably not a human.
So I kicked off a few deep-research runs on the topic, and they made for fascinating Sunday morning reading.
Here’s a short summary of what I learnt. Or if you want to jump straight to the code, here’s the library.
What makes web interaction “Human”?
Turns out, a lot. And it’s all grounded in neurophysiology research from the past 70 years. So, while none of this research is needed for your agents to use the web, I do recommend a quick read. It’s truly fascinating just how difficult it is to mimic how imperfect we are as humans.
Fitts’ Law describes how humans move - we have a throughput ceiling of about 8-12 bits/second. That means when you ask someone to click a tiny button far away, they slow down. Bots don’t. They zip across the screen at constant speed like they’re on rails. This is detection signal #1.
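To make that concrete, here’s a back-of-the-envelope sketch of Fitts’ Law. The a and b coefficients are illustrative, not fitted to real data, and this isn’t the library’s actual model:

```python
import math

def fitts_movement_time(distance_px, target_width_px, a=0.1, b=0.15):
    """Predicted movement time in seconds under Fitts' Law:

        MT = a + b * log2(D/W + 1)

    The log term is the "index of difficulty" in bits, and 1/b is
    throughput in bits/second. Coefficients here are illustrative.
    """
    return a + b * math.log2(distance_px / target_width_px + 1)

# A tiny, distant target takes longer than a big, close one.
hard = fitts_movement_time(distance_px=800, target_width_px=10)
easy = fitts_movement_time(distance_px=200, target_width_px=100)
```

This is also why a believable agent has to know the size of what it’s clicking: the same distance at a smaller target should take longer.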
Velocity profiles are asymmetric. When humans move their mouse, we pick up speed fast (ballistic phase), cover about 95% of the distance, then decelerate more slowly while we visually guide the cursor to the exact target. Peak velocity hits around 38-45% through the movement, not 50% like a perfect sine wave. Bots often have symmetric velocity curves. Detection signal #2.
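You can generate a plausibly asymmetric velocity profile with something as simple as a Beta-shaped curve. This sketch is my own illustration, not the library’s internals; a Beta(3, 4) density peaks at (3−1)/(3+4−2) = 0.4, i.e. 40% of the way through the movement, where a symmetric bot profile would peak at exactly 0.5:

```python
def beta_velocity_profile(n_samples=100, alpha=3.0, beta=4.0):
    """Velocity samples over normalised time t in (0, 1), shaped
    like a Beta(alpha, beta) density. With alpha=3, beta=4 the peak
    sits at ~40% of the movement, matching the human asymmetry."""
    profile = []
    for i in range(1, n_samples):
        t = i / n_samples
        profile.append(t ** (alpha - 1) * (1 - t) ** (beta - 1))
    return profile

velocities = beta_velocity_profile()
peak_position = velocities.index(max(velocities)) / len(velocities)
```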
Nobody moves in straight lines. Even when you think you’re moving straight, you’re not. Human paths have a “straightness index” between 0.80-0.95 (where 1.0 is perfect). We drift, we correct, we overshoot. Modern bot detectors calculate the fractal dimension of your path - humans are around 1.2-1.4, while bots are closer to 1.0 (literally linear). Detection signal #3.
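The straightness index is cheap to compute, which is exactly why detectors lean on it. A toy sketch (the drift magnitude is illustrative):

```python
import math
import random

random.seed(0)

def straightness_index(path):
    """Straight-line distance from start to end, divided by the
    distance actually travelled. 1.0 is perfectly straight; human
    mouse paths typically land around 0.80-0.95."""
    straight = math.dist(path[0], path[-1])
    travelled = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    return straight / travelled

# A horizontal drag from (0, 0) to (500, 0) with lateral drift.
wobbly = [(x, random.gauss(0, 3)) for x in range(0, 501, 10)]
wobbly[0], wobbly[-1] = (0, 0), (500, 0)
si = straightness_index(wobbly)
```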
Your hand shakes. Even when you think you’re perfectly still, physiological tremor at 8-12 Hz is always present. FFT analysis on “stationary” cursor segments shows this frequency signature in humans. Bots? Dead silent at 0 Hz, or just white noise. Detection signal #4.
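Here’s a toy demonstration of how a detector might find that tremor signature, using a naive DFT over one second of cursor jitter (illustrative only; a real system would use a proper FFT over real coordinates):

```python
import cmath
import math

def dominant_frequency(samples, sample_rate_hz):
    """Naive DFT: return the nonzero frequency bin with the most power."""
    n = len(samples)
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2):
        coeff = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        if abs(coeff) > best_power:
            best_bin, best_power = k, abs(coeff)
    return best_bin * sample_rate_hz / n

# One second of "stationary" cursor jitter sampled at 100 Hz,
# with a 10 Hz physiological tremor component baked in.
rate = 100
signal = [0.5 * math.sin(2 * math.pi * 10 * t / rate) for t in range(rate)]
tremor_hz = dominant_frequency(signal, rate)
```

A bot that holds the cursor perfectly still would have nothing in the 8–12 Hz band for this kind of analysis to find.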
Clicking takes time. The duration between mousedown and mouseup follows a log-normal distribution around 85-130ms. And before clicking, humans pause for 200-500ms to find the cursor. A bot that clicks instantly after stopping? Obvious. Detection signal #5.
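Sampling a believable click duration is a one-liner once you remember that `random.lognormvariate`’s median is exp(mu). The 105 ms median and sigma below are my illustrative picks, not the library’s tuned values:

```python
import math
import random
import statistics

random.seed(42)

def click_duration_ms(median_ms=105.0, sigma=0.2):
    """Sample a mousedown -> mouseup duration in milliseconds.

    random.lognormvariate(mu, sigma) has median exp(mu), so we set
    mu = ln(median). These parameters are illustrative, chosen so
    most samples land roughly in the 85-130 ms band.
    """
    return random.lognormvariate(math.log(median_ms), sigma)

samples = [click_duration_ms() for _ in range(2000)]
```

The right-skew matters: a log-normal’s mean sits above its median, which matches human click data far better than a uniform or Gaussian jitter would.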
You miss. A lot. Humans need corrective submovements about 85% of the time. You make one primary ballistic movement that covers ~95% of the distance, realise you’re slightly off, then make a small correction. A bot that lands pixel-perfect on the first try, every single time? Statistically impossible. We overshoot on large displays, undershoot on small targets, and the initial endpoint error follows a bivariate normal distribution. Detection signal #6.
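Here’s a toy version of that primary-hop-plus-corrections loop. The fractions and error sizes are illustrative, not the library’s model:

```python
import random

random.seed(1)

def move_with_corrections(start, target, undershoot=0.95):
    """One primary ballistic hop covering ~95% of the distance (with
    2-D normal endpoint error), then short corrective submovements
    until we're within a couple of pixels of the target."""
    points = [start]
    x, y = start
    tx, ty = target
    for _ in range(20):  # safety cap on submovements
        if ((tx - x) ** 2 + (ty - y) ** 2) ** 0.5 <= 2.0:
            break
        primary = len(points) == 1
        frac = undershoot if primary else 0.8
        sd = 6.0 if primary else 1.0  # error shrinks on corrections
        x += (tx - x) * frac + random.gauss(0, sd)
        y += (ty - y) * frac + random.gauss(0, sd)
        points.append((x, y))
    points.append(target)
    return points

path = move_with_corrections((0, 0), (600, 400))
```

The key property is that the path never lands pixel-perfect on hop one; there’s always at least one small correction, which is what the detectors expect to see.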
Faster means shakier. The noise in human movement isn’t constant - it scales with velocity. This is called signal-dependent noise. When you’re moving at 1000 px/s, your path deviates about 20 pixels. When you’re moving at 100 px/s, maybe 2 pixels. Bots often have either zero noise (perfectly smooth) or constant random noise that doesn’t correlate with speed. Detection signal #7.
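Modelling signal-dependent noise is as simple as scaling the noise sigma with speed. The 0.02 coefficient below is illustrative, giving roughly 20 px of deviation at 1000 px/s:

```python
import random

random.seed(7)

def lateral_jitter_px(speed_px_s, coefficient=0.02):
    """Signal-dependent noise: the standard deviation of lateral
    path deviation grows linearly with speed. The coefficient is
    illustrative, not the library's tuned value."""
    return random.gauss(0, coefficient * speed_px_s)

fast = [abs(lateral_jitter_px(1000)) for _ in range(500)]
slow = [abs(lateral_jitter_px(100)) for _ in range(500)]
```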
Typos aren't random. When humans make typing mistakes, they're keyboard-layout-aware. If you meant to type 'a', you might hit 's' or 'w' or 'q' - keys physically adjacent to 'a' on a QWERTY keyboard. You won't randomly hit 'p' or 'm'. Bot typos are often just random character substitutions with no spatial awareness. And when humans catch typos, we don't delete the entire word and retype it - we backspace a few characters and fix just what's wrong. Detection signal #8.
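A keyboard-layout-aware typo needs nothing more than an adjacency table. The partial QWERTY map below is my own illustration; the library’s table, if it has one, may differ:

```python
import random

random.seed(3)

# Partial QWERTY adjacency map (illustrative, lowercase only).
ADJACENT = {
    "a": "qwsz", "s": "awedxz", "d": "serfcx", "e": "wsdr",
    "o": "iklp", "l": "kop", "h": "gyujbn", "t": "rfgy",
}

def typo_for(char):
    """Swap a character for a physically adjacent key, like a human
    finger slip; fall back to the same char if we have no neighbours."""
    neighbours = ADJACENT.get(char)
    return random.choice(neighbours) if neighbours else char

slip = typo_for("a")  # always one of q, w, s, z -- never a far-away key
```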
The research goes deeper: reaction times following an ex-Gaussian distribution, double-click spatial drift, and the way path tortuosity increases with cognitive load. If you want all the citations and math, check out the RESEARCH.md in the repo.
Okay cool, so how do I make my agents go brrrr?
Your first imperfect click
Slow down. First, let’s make your clicks worse; then we can build an agent for you. Let’s get started with the classic. You’ve got two choices:
pip install nothingtoseehere[browser]

which also installs the browser dependencies, or just nothingtoseehere if you’re happy managing the coordinates of clicks yourself.
Then it’s simple: just tell us where to click, and include the click target size (important, as humans struggle to click smaller things, so this affects speed).
import asyncio
from nothingtoseehere import NeuromotorInput

async def main():
    human = NeuromotorInput()

    # Move and click with human-like movement
    await human.mouse.move_to(500, 300, target_width=100, click=True)

    # Type with realistic timing
    await human.keyboard.type_text("Hello, world!", with_typos=True)

asyncio.run(main())
Now if you want to use a browser (as you probably do), you’ll need some code like this:

import nodriver as uc

browser = await uc.start()
page = await browser.get("https://wikipedia.org")
search = await page.select('input[name="search"]')

await human.click_nodriver_element(search, page)
await human.fill_nodriver_input(search, page, "neurophysiology", with_typos=True)

Note: this package is not all the tools you need to feed into your agent, nor is it a solution to every possible action on the web. With your help it could be, but at the moment it’s just the minimum needed for our agents. If you have feedback on the interface, send me a message.
Show me the money demo
While Sequoia has announced that the kind of long-running agents we’re building at Super44 are “basically AGI”, I’m not so sure. What I do know is that when I watch my little agents click around and type, something parental definitely kicks in. I think the cutest part is when one makes a keyboard-layout-aware typo (mistyping “a” as “s”) and I get to watch it correct itself. In practice our agents are doing everything from downloading POS data to posting on social media, but for our demo let me use something a little lighter & public: Wikipedia.
You can download the repo and give it a try yourself; just run:

git clone https://github.com/super-44/nothingtoseehere
cd nothingtoseehere
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
cd examples
python wikipedia_demo.py

If you combine this with a simple agent loop (something like the Claude Vision API) and wrap some of these methods in tools, you’re already well on your way towards a fairly capable web agent that walks (moves) and talks (types) like a human.
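To give a feel for the tool-wrapping step, here’s a hypothetical dispatch sketch. The tool-call shape and the recording stub standing in for the library’s input object are my assumptions, kept so the example is self-contained; in real use you’d pass the library’s mouse and keyboard objects instead:

```python
import asyncio

class _Recorder:
    """Stand-in for the library's input methods; records what was called."""
    def __init__(self):
        self.calls = []
    async def move_to(self, x, y, target_width=40, click=False):
        self.calls.append(("move_to", x, y, target_width, click))
    async def type_text(self, text, with_typos=False):
        self.calls.append(("type_text", text, with_typos))

async def dispatch(mouse, keyboard, tool_call):
    """Route one model-issued tool call to a human-like input method.
    The {"name": ..., "args": ...} shape is an assumption, not an API."""
    name, args = tool_call["name"], tool_call["args"]
    if name == "click":
        await mouse.move_to(args["x"], args["y"],
                            target_width=args.get("width", 40), click=True)
    elif name == "type":
        await keyboard.type_text(args["text"], with_typos=True)

rec = _Recorder()
asyncio.run(dispatch(rec, rec, {"name": "click", "args": {"x": 500, "y": 300}}))
asyncio.run(dispatch(rec, rec, {"name": "type", "args": {"text": "hi"}}))
```

The agent loop itself is then just: screenshot, ask the model what to do, dispatch, repeat.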
So does it work?
Kinda. It’s working for us. Unfortunately, mouse movement and keyboard input are just a small part of the detection process, and I’m sure you’ll read more from me soon covering the even longer journey of IP rotation, fingerprinting, etc. There are also some fun academic topics, like agentic guardrails, and exciting ones, like how agents can learn from their mistakes and get better at using software over time.
Can I help?
The library is open source, and it’s just a start. I’d love to add user profiles, so each agent has a consistent pattern of inconsistency across sessions, and there’s a lot more we could add from the research already. There are probably bugs and edge cases I haven’t handled. If you find one, fix it!
P.S. If you’re interested in building multi-agent systems, figuring out how agents can use the web, or just working with a small, exceptional, and passionate team to build a future where anyone can open a local business, send us an email at hello (at) super44.ai.
P.P.S. If you work at Cloudflare and you’re reading this, reach out to me. I’d love to chat about the future of ethical agents on the web.