Jina-Reader
Jina-Reader is a live-fetch agent operated by Jina. It does not crawl the web on a schedule. It hits your site only when an end-user asks the underlying AI a question that requires fresh information from a specific page.
Traffic is bursty and unpredictable. A single trending topic can send hundreds of Jina-Reader requests in an hour, then nothing for days. Each request typically reads one or two pages, not your whole site.
Allowing Jina-Reader is how your content becomes part of Jina's answers. Blocking it means users asking that AI about your topic will be answered using someone else's content instead.
See Jina-Reader on your own site
Match the User-Agent header on incoming requests against the pattern below.
regex
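For higher confidence, also verify the source IP wherever the operator publishes ranges. UA strings can be spoofed; IP ownership is harder to fake. Jina has not published IP ranges for Jina-Reader, so User-Agent matching is currently the only check available.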
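As a rough illustration of the User-Agent check, the sketch below counts access-log lines that claim to be Jina-Reader. The regular expression is an assumption (a case-insensitive match on the token "Jina-Reader"), not the operator's official pattern, and the helper names `claims_jina_reader` and `count_claimed_hits` are hypothetical. It also assumes a combined-format log in which the User-Agent is the last quoted field.

```python
import re

# ASSUMPTION: the token "Jina-Reader" appears in the User-Agent string.
# This is a placeholder, not the official pattern; substitute the real one.
JINA_READER_UA = re.compile(r"jina-?reader", re.IGNORECASE)

# Pulls every double-quoted field out of a log line.
QUOTED_FIELD = re.compile(r'"([^"]*)"')


def claims_jina_reader(user_agent):
    """Return True if the User-Agent string claims to be Jina-Reader."""
    return bool(JINA_READER_UA.search(user_agent or ""))


def count_claimed_hits(log_lines):
    """Count log lines whose last quoted field (assumed to be the UA) matches."""
    hits = 0
    for line in log_lines:
        quoted = QUOTED_FIELD.findall(line)
        if quoted and claims_jina_reader(quoted[-1]):
            hits += 1
    return hits
```

A match only shows that the request claims to be Jina-Reader; it says nothing about where the request actually came from.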
Renders JavaScript: Sometimes
IP verification: User-Agent only
Crawl frequency: Burst, user-driven
Honors robots.txt: Yes
Honors Crawl-delay: Varies
Should I let Jina-Reader through?
In most cases, yes. Live-fetch agents drive citations inside AI answers. Allowing keeps your content in the conversation. If volume gets noisy, rate-limit it before you block it outright.
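One common way to rate-limit bursty traffic before blocking it is a token bucket. The sketch below is a minimal, single-process version; the class name `TokenBucket` and the example numbers are illustrative assumptions, not a recommendation from Jina or a description of any particular CDN's implementation. In practice this logic usually lives at the CDN or edge rather than in application code.

```python
import time


class TokenBucket:
    """Allow short bursts while capping the sustained request rate."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens added per second
        self.capacity = burst         # maximum tokens the bucket can hold
        self.tokens = float(burst)    # start full so an initial burst passes
        self.clock = clock            # injectable clock, useful for testing
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be throttled."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical policy: bursts of up to 20 requests, 2 per second sustained.
jina_reader_bucket = TokenBucket(rate_per_sec=2.0, burst=20)
```

A throttled request would typically get an HTTP 429 response rather than a hard block, so the agent can retry later.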
Does blocking Jina-Reader affect my Google rankings?
No. Jina-Reader fetches a page only when a user is actively asking Jina a question. It has nothing to do with how Google or Bing rank you. The cost of blocking is that Jina can't quote your content in its answer.
How do I confirm a request is really from Jina-Reader?
Look at the User-Agent header in your access logs and match it against the pattern listed above. Keep in mind that the User-Agent is easy to fake, so this check tells you "the traffic claims to be Jina-Reader", not "the traffic is genuinely Jina-Reader". If you need stronger guarantees, check whether a reverse-DNS lookup on the source IP resolves to a hostname the operator controls, or wait for Jina to publish IP ranges.
Does a Jina-Reader visit count as a real user visit?
Sort of. There is a human asking Jina a question on the other end, but they never load your page in their own browser. They see whatever Jina quotes back, usually a snippet plus a citation link. Count it as upstream attention rather than as a session.
What's the cleanest way to control Jina-Reader?
Use two layers: robots.txt for the polite crawlers that read it, and rules at your CDN or edge for the ones that don't. Rankly's Agent Experience handles both from a single config, so you can allow, block, rate-limit, or serve a stripped-down version per bot. Agent Analytics handles the observation half, so you know which bots are actually worth a rule.
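For the robots.txt layer, it can help to test a rule set before deploying it. The sketch below uses Python's standard `urllib.robotparser` to check what a draft file would permit. The user-agent token `Jina-Reader`, the `/private/` path, the `example.com` URLs, and the 5-second delay are all assumptions chosen for illustration; confirm the token the agent actually matches on before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft robots.txt: keep Jina-Reader out of /private/,
# ask for a 5-second delay, and leave every other agent unrestricted.
ROBOTS_TXT = """\
User-agent: Jina-Reader
Disallow: /private/
Crawl-delay: 5

User-agent: *
Disallow:
"""


def build_parser(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
print(parser.can_fetch("Jina-Reader", "https://example.com/blog/post"))   # True
print(parser.can_fetch("Jina-Reader", "https://example.com/private/x"))   # False
print(parser.crawl_delay("Jina-Reader"))                                  # 5
```

This only shows how a compliant parser would read the file. Since Crawl-delay support is listed as varying, the delay line is a request, not a guarantee, which is why the edge layer still matters.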