Databricks-Web-Browse

01 /Overview

Databricks-Web-Browse is a live-fetch agent operated by Databricks. It does not crawl the web on a schedule. It hits your site only when an end-user asks the underlying AI a question that requires fresh information from a specific page.

Traffic is bursty and unpredictable. A single trending topic can send hundreds of Databricks-Web-Browse requests in an hour, then nothing for days. Each request typically reads one or two pages, not your whole site.

Allowing Databricks-Web-Browse is how your content becomes part of Databricks's answers. Blocking it means users asking that AI about your topic will be answered using someone else's content instead.

02 /Identification

Match the User-Agent header on incoming requests against the pattern below.

regex

Databricks-Web-Browse

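For higher confidence, also verify the source IP against the operator's published ranges once Databricks publishes them; for now, User-Agent matching is the only verification method listed for this bot. UA strings can be spoofed; IP ownership is harder to fake.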
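The two checks above can be combined in a few lines. This is a minimal sketch, not an official verification method: the function name classify_request is made up for illustration, and the IP ranges are reserved documentation placeholders (192.0.2.0/24 and 2001:db8::/32), not real Databricks ranges. Swap in the operator's ranges if and when they are published.

```python
import ipaddress
import re

# Pattern taken from the Identification section.
UA_PATTERN = re.compile(r"Databricks-Web-Browse")

# ASSUMPTION: placeholder documentation ranges, NOT real Databricks ranges.
PLACEHOLDER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def classify_request(user_agent, source_ip, ranges=PLACEHOLDER_RANGES):
    """Return 'not-claimed', 'claimed-only', or 'claimed-and-ip-match'."""
    # Step 1: does the request claim to be the bot at all?
    if not UA_PATTERN.search(user_agent or ""):
        return "not-claimed"
    # Step 2: does the source address fall inside a trusted range?
    addr = ipaddress.ip_address(source_ip)
    if any(addr in net for net in ranges):
        return "claimed-and-ip-match"
    # The UA matched but the IP did not: treat the claim as unverified.
    return "claimed-only"
```

A "claimed-only" result is what you get today for every matching request, since there are no published ranges to compare against.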

03 /Control

The polite way is a robots.txt rule. Compliant agents respect it; the others ignore it.

robots.txt

User-agent: Databricks-Web-Browse
Disallow: /
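To check what a rule like this does before you deploy it, you can run it through Python's standard-library robots.txt parser. The sketch below assumes the two-line rule shown above; the helper name is_allowed and the example.com URL are illustrative only.

```python
from urllib.robotparser import RobotFileParser

# The rule from the Control section, one directive per line.
ROBOTS_TXT = """\
User-agent: Databricks-Web-Browse
Disallow: /
"""


def is_allowed(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URL, used only to exercise the rule.
    print(is_allowed(ROBOTS_TXT, "Databricks-Web-Browse", "https://example.com/pricing"))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/pricing"))
```

This only tells you how a compliant parser reads the file. It says nothing about whether a given agent actually obeys it.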

04 /Technical fingerprint

Renders JavaScript: Sometimes
IP verification: User-Agent only
Crawl frequency: Burst, user-driven
Honors robots.txt: Yes
Honors Crawl-delay: Varies

05 /Expected behavior

Expect bursty, unpredictable hits. Databricks-Web-Browse does not crawl your site. It visits specific pages when an end-user asks the underlying AI a question that needs fresh data, so volume tracks user queries, not a schedule.

The Databricks bot family

Databricks runs 2 bots in total. Each one is a separate user-agent so you can allow or block them independently.

Training Crawler: Databricks-Mosaic-Bot
Live-Fetch AI: Databricks-Web-Browse (you are here)

Common questions

Should I let Databricks-Web-Browse through?

In most cases, yes. Live-fetch agents drive citations inside AI answers. Allowing keeps your content in the conversation. If volume gets noisy, rate-limit it before you block it outright.
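Rate-limiting before blocking can be as simple as a sliding-window counter applied to requests whose User-Agent matches. The sketch below is one way to do it, not a recommended configuration: the class name, the status_for helper, and the limit of 100 requests per 60 seconds are all illustrative assumptions. In production this logic usually lives at the CDN or edge rather than in application code.

```python
import re
import time
from collections import deque
from typing import Optional

UA_PATTERN = re.compile(r"Databricks-Web-Browse")


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.hits = deque()  # timestamps of accepted requests

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Forget accepted requests that have aged out of the window.
        while self.hits and now - self.hits[0] >= self.window:
            self.hits.popleft()
        if len(self.hits) < self.limit:
            self.hits.append(now)
            return True
        return False


# ASSUMPTION: illustrative numbers only; tune to your own traffic.
limiter = SlidingWindowLimiter(limit=100, window_seconds=60)


def status_for(user_agent: str, now: Optional[float] = None) -> int:
    """Return 200 to serve the page or 429 to throttle the bot."""
    if UA_PATTERN.search(user_agent or "") and not limiter.allow(now):
        return 429
    return 200
```

Returning 429 during a burst keeps the bot allowed in principle while capping what a trending topic can cost you.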

Does blocking Databricks-Web-Browse affect my Google rankings?

No. Databricks-Web-Browse fetches a page only when a user is actively asking Databricks a question. It has nothing to do with how Google or Bing rank you. The cost of blocking is that Databricks can't quote your content in its answer.

How do I confirm a request is really from Databricks-Web-Browse?

Look at the User-Agent header in your access logs and match it against the pattern listed above. The User-Agent is easy to fake, so this check tells you "the traffic claims to be Databricks-Web-Browse", not "the traffic is genuinely Databricks-Web-Browse". If you need stronger guarantees, try a reverse-DNS check on the source IP, or wait for Databricks to publish IP ranges.
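A reverse-DNS check is usually done as a forward-confirmed lookup: resolve the IP to a hostname, check the hostname's domain, then resolve the hostname back and confirm it returns the same IP. The sketch below shows the shape of that check under a loud assumption: the suffix ".crawler.example.com" is a hypothetical placeholder, since no hostname pattern for Databricks-Web-Browse is given here. The function names are also illustrative, and the resolvers are passed in as parameters so the logic can be tried without network access.

```python
import socket

# ASSUMPTION: hypothetical suffix, NOT a documented Databricks hostname pattern.
HYPOTHETICAL_SUFFIX = ".crawler.example.com"


def reverse_lookup(ip):
    """PTR lookup: IP address to primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host):
    """A/AAAA lookup: hostname to the set of addresses it resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def forward_confirmed(ip, suffix=HYPOTHETICAL_SUFFIX,
                      reverse=reverse_lookup, forward=forward_lookup):
    """Return True only if ip -> host -> ip round-trips under `suffix`."""
    try:
        host = reverse(ip)
    except OSError:  # no PTR record or resolver failure
        return False
    if not host.lower().rstrip(".").endswith(suffix):
        return False
    try:
        # The hostname must resolve back to the address we started from.
        return ip in forward(host)
    except OSError:
        return False
```

The forward step matters because anyone who controls the reverse zone for their own IP can set a PTR record to any name they like. Note that the comparison is a plain string match, so IPv6 addresses written in different forms would need normalizing first.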

Does a Databricks-Web-Browse visit count as a real user visit?

Sort of. There is a human asking Databricks a question on the other end, but they never load your page in their own browser. They see whatever Databricks quotes back, usually a snippet plus a citation link. Count it as upstream attention rather than as a session.

How is Databricks-Web-Browse different from Databricks's other bots?

Databricks splits work across separate user-agents so site owners can decide on each one independently. Operators in general give training crawlers, live-fetch agents, search indexers, and agentic browsers their own names; Databricks currently has two listed, a training crawler and this live-fetch agent. Scan the Databricks family above to see which ones actually matter for your site.

What's the cleanest way to control Databricks-Web-Browse?

Two layers: robots.txt for the polite crawlers that read it, and rules at your CDN or edge for the ones that don't. Rankly's Agent Experience handles both from a single config, so you can allow, block, rate-limit, or serve a stripped-down version per bot. Agent Analytics handles the observation half so you know which bots are actually worth a rule.
