Haiku-SearchBot
Haiku-SearchBot is a live-fetch agent operated by Clerk Sas. It does not crawl the web on a schedule. It hits your site only when an end-user asks the underlying AI a question that requires fresh information from a specific page.
Traffic is bursty and unpredictable. A single trending topic can send hundreds of Haiku-SearchBot requests in an hour, then nothing for days. Each request typically reads one or two pages, not your whole site.
Allowing Haiku-SearchBot is how your content becomes part of Clerk Sas's answers. Blocking it means users asking that AI about your topic will be answered using someone else's content instead.
See Haiku-SearchBot on your own site
Match the User-Agent header on incoming requests against the pattern below.
regex
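A User-Agent match only tells you what a request claims to be, because UA strings can be spoofed. IP ownership is harder to fake, but this operator currently supports User-Agent identification only. If Clerk Sas publishes IP ranges, verify the source IP against them as well.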
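A minimal sketch of the User-Agent check in Python. The pattern here is an assumption: it simply looks for the bot's name as a case-insensitive token, so swap in the exact published pattern if it differs. The function name is illustrative.

```python
import re
from typing import Optional

# Assumption: the bot identifies itself with the literal token
# "Haiku-SearchBot" somewhere in the User-Agent header.
HAIKU_SEARCHBOT_UA = re.compile(r"Haiku-SearchBot", re.IGNORECASE)


def claims_to_be_haiku_searchbot(user_agent: Optional[str]) -> bool:
    """Return True if the header claims to be Haiku-SearchBot.

    This proves only the claim, not the origin: the header can be spoofed.
    """
    if not user_agent:
        return False
    return HAIKU_SEARCHBOT_UA.search(user_agent) is not None
```

A missing or empty header is treated as "not the bot" rather than raising, which keeps the check safe to call on every request.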
Renders JavaScript: Sometimes
IP verification: User-Agent only
Crawl frequency: Burst, user-driven
Honors robots.txt: Yes
Honors Crawl-delay: Varies
Should I let Haiku-SearchBot through?
In most cases, yes. Live-fetch agents drive citations inside AI answers, so allowing Haiku-SearchBot keeps your content in the conversation. If volume gets noisy, rate-limit it before you block it outright.
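One way to rate-limit rather than block is a token bucket, which tolerates short bursts but caps the sustained rate. The sketch below is illustrative only: the class name, the rate, and the burst size are assumptions, and in practice this logic usually lives in your CDN or edge rules rather than in application code.

```python
import time


class TokenBucket:
    """Allow up to `burst` requests at once, refilling at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, burst: int, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.clock = clock          # injectable so the limiter can be tested
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when over the limit."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A caller would keep one bucket for requests whose User-Agent claims to be Haiku-SearchBot and answer with HTTP 429 whenever `allow()` returns False.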
Does blocking Haiku-SearchBot affect my Google rankings?
No. Haiku-SearchBot fetches a page only when a user is actively asking Clerk Sas a question. It has nothing to do with how Google or Bing rank you. The cost of blocking is that Clerk Sas can't quote your content in its answer.
How do I confirm a request is really from Haiku-SearchBot?
Look at the User-Agent header in your access logs and match it against the pattern above. Keep in mind that the User-Agent is easy to fake, so this check tells you "the traffic claims to be Haiku-SearchBot", not "the traffic is genuinely Haiku-SearchBot". If you need stronger guarantees, run a reverse-DNS check on the source IP or wait for Clerk Sas to publish IP ranges.
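A reverse-DNS check is usually done as a forward-confirmed lookup: resolve the IP to a hostname, check the hostname's domain, then resolve the hostname back and confirm it returns the same IP. The sketch below assumes you know which hostname suffixes the operator uses. Nothing here confirms that Clerk Sas documents any, so `allowed_suffixes` is a hypothetical input and the function name is illustrative.

```python
import socket


def forward_confirmed_rdns(ip, allowed_suffixes,
                           reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Return True if `ip` reverse-resolves into an allowed domain and that
    hostname forward-resolves back to the same `ip`.

    The lookup functions are injectable for testing. Note that
    socket.gethostbyname_ex only returns IPv4 addresses.
    """
    try:
        hostname = reverse(ip)[0].rstrip(".").lower()
    except OSError:
        return False  # no PTR record or lookup failure

    # Hypothetical suffixes: substitute whatever the operator documents.
    if not any(hostname == s or hostname.endswith("." + s) for s in allowed_suffixes):
        return False

    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

The forward step matters because anyone who controls the reverse zone for their own IP can set an arbitrary PTR hostname. Only the owner of the domain can make that hostname resolve back to the IP.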
Does a Haiku-SearchBot visit count as a real user visit?
Sort of. There is a human asking Clerk Sas a question on the other end, but they never load your page in their own browser. They see whatever Clerk Sas quotes back, usually a snippet plus a citation link. Count it as upstream attention rather than as a session.
What's the cleanest way to control Haiku-SearchBot?
Use two layers: robots.txt for the polite crawlers that read it, and rules at your CDN or edge for the ones that don't. Rankly's Agent Experience handles both from a single config, so you can allow, block, rate-limit, or serve a stripped-down version per bot. Agent Analytics handles the observation half so you know which bots are actually worth a rule.
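For the robots.txt layer, it helps to test a rule set before deploying it. The sketch below uses Python's standard `urllib.robotparser` to check what a given policy would permit. The policy text, the paths, and the use of `Haiku-SearchBot` as the robots.txt user-agent token are all assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: let Haiku-SearchBot read everything except /private/,
# and keep all other crawlers out of /admin/.
ROBOTS_TXT = """\
User-agent: Haiku-SearchBot
Disallow: /private/
Allow: /

User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `robots_txt` permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Python's parser applies the first matching rule within a group, so the more specific `Disallow` line is listed before the broad `Allow`. Other parsers may use longest-match precedence instead, which gives the same result for this policy.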