Databricks-Web-Browse
Databricks-Web-Browse is a live-fetch agent operated by Databricks. It does not crawl the web on a schedule. It hits your site only when an end-user asks the underlying AI a question that requires fresh information from a specific page.
Traffic is bursty and unpredictable. A single trending topic can send hundreds of Databricks-Web-Browse requests in an hour, then nothing for days. Each request typically reads one or two pages, not your whole site.
Allowing Databricks-Web-Browse is how your content becomes part of Databricks's answers. Blocking it means users asking that AI about your topic will be answered using someone else's content instead.
See Databricks-Web-Browse on your own site
Match the User-Agent header on incoming requests against a regex for the bot's user-agent token, Databricks-Web-Browse.
For higher confidence, also verify the source IP against the operator's published ranges once they are available; for now, identification is by User-Agent only. UA strings can be spoofed; IP ownership is harder to fake.
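The User-Agent match described above can be sketched in a few lines of Python. The page's own pattern is not reproduced here, so this sketch assumes that a case-insensitive match on the token Databricks-Web-Browse is enough; the helper name claims_to_be_bot is hypothetical.

```python
import re
from typing import Optional

# Assumption: the bot identifies itself with the token "Databricks-Web-Browse"
# somewhere in the User-Agent header. Adjust the pattern if your logs differ.
BOT_PATTERN = re.compile(r"Databricks-Web-Browse", re.IGNORECASE)


def claims_to_be_bot(user_agent: Optional[str]) -> bool:
    """Return True when the header claims to be Databricks-Web-Browse.

    This only checks the claim. A spoofed User-Agent passes this test too.
    """
    if not user_agent:
        return False
    return BOT_PATTERN.search(user_agent) is not None
```

A True result means "this request says it is the bot", which is enough for analytics and soft rules, but not for anything security-sensitive.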
Renders JavaScript: Sometimes
IP verification: User-Agent only
Crawl frequency: Burst, user-driven
Honors robots.txt: Yes
Honors Crawl-delay: Varies
Databricks runs 2 bots in total. Each one is a separate user-agent so you can allow or block them independently.
Training Crawler: 1
Live-Fetch AI: 1
- Databricks-Web-Browse (this page)
Should I let Databricks-Web-Browse through?
In most cases, yes. Live-fetch agents drive citations inside AI answers. Allowing keeps your content in the conversation. If volume gets noisy, rate-limit it before you block it outright.
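One way to rate-limit before blocking is a token bucket, which tolerates short bursts while capping the sustained rate. The sketch below is a generic illustration, not something Databricks or any CDN prescribes; the class name and the rate and capacity values are assumptions to tune for your own traffic.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` per second."""

    def __init__(self, rate, capacity, now=None):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # largest burst allowed
        self.tokens = float(capacity)    # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True and spend a token if the request may proceed."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Keep one bucket per bot name and answer with HTTP 429 when allow() returns False, so the agent is slowed down rather than shut out.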
Does blocking Databricks-Web-Browse affect my Google rankings?
No. Databricks-Web-Browse fetches a page only when a user is actively asking Databricks a question. It has nothing to do with how Google or Bing rank you. The cost of blocking is that Databricks can't quote your content in its answer.
How do I confirm a request is really from Databricks-Web-Browse?
Look at the User-Agent header in your access logs and match it against the pattern listed above. Keep in mind that the User-Agent is easy to fake, so this check tells you "the traffic claims to be Databricks-Web-Browse", not "the traffic is genuinely Databricks-Web-Browse". If you need stronger guarantees, run a reverse-DNS check on the source IP or wait for Databricks to publish IP ranges.
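The log check can be sketched as a small script that counts, per hour, the requests claiming to be the bot. It assumes your server writes the common "combined" access log format and that the bot's User-Agent contains the token Databricks-Web-Browse; the function name is hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: logs use the combined format:
# ip - user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)
BOT_TOKEN = "databricks-web-browse"  # compared in lowercase


def claimed_hits_per_hour(lines):
    """Count log lines whose User-Agent claims to be the bot, grouped by hour."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None or BOT_TOKEN not in match.group("ua").lower():
            continue
        stamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        hits[stamp.strftime("%Y-%m-%d %H:00")] += 1
    return hits
```

Passing an open log file to claimed_hits_per_hour gives an hourly picture of the bursty pattern. The counts are still claims, not verified traffic.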
Does a Databricks-Web-Browse visit count as a real user visit?
Sort of. There is a human asking Databricks a question on the other end, but they never load your page in their own browser. They see whatever Databricks quotes back, usually a snippet plus a citation link. Count it as upstream attention rather than as a session.
How is Databricks-Web-Browse different from Databricks's other bots?
Databricks gives each bot its own user-agent so site owners can decide on each one independently. Operators commonly name training crawlers, live-fetch agents, search indexers, and agentic browsers separately; Databricks runs two, a training crawler and this live-fetch agent. Scan the Databricks family list above to see which ones actually matter for your site.
What's the cleanest way to control Databricks-Web-Browse?
Two layers. Robots.txt for the polite crawlers that read it, and rules at your CDN or edge for the ones that don't. Rankly's Agent Experience handles both from a single config, so you can allow, block, rate-limit, or serve a stripped-down version per bot. Agent Analytics handles the observation half so you know which bots are actually worth a rule.
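For the robots.txt layer, it helps to test a rule before deploying it. The sketch below uses Python's standard urllib.robotparser to check what a draft file permits. The /private/ path and the example.com URLs are placeholders, and the helper name is_allowed is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Draft rules: keep Databricks-Web-Browse out of /private/ (a placeholder
# path) and leave everything else open. Other bots are unrestricted.
ROBOTS_TXT = """\
User-agent: Databricks-Web-Browse
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Report whether the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

This only confirms what the file says. Whether a given bot obeys it is a separate question, which is why the edge layer exists.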