Jina AI Reader

Overview

This connector allows access to the Jina Reader API using two modes:

  • "Reader" Mode (https://r.jina.ai) - Given one or more URLs as input, returns the content of those pages as Markdown text. The Reader endpoint extracts the core content from a URL and converts it into clean, LLM-friendly text, providing high-quality input for your agent and RAG systems.
  • "Search" Mode (https://s.jina.ai) - Similar to the Reader endpoint, but accepts a search prompt and returns the text from the top 5 search results.

Both API endpoints generate human-readable Markdown that downstream LLM and GenAI applications can also process efficiently. Both modes can be used in the same sync by following the configuration instructions below.

Available Streams

The content of the read output depends on the input, but the JSON format of the response is the same in every case.

To get results, append the page URL or the search prompt to the base URL https://r.jina.ai/ or https://s.jina.ai/ respectively.
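The URL construction described above can be sketched in Python. This is a minimal illustration rather than the connector's actual implementation: the helper names `reader_url` and `search_url` are hypothetical, and percent-encoding the search prompt is an assumption made so that spaces and punctuation survive as a URL path.

```python
from urllib.parse import quote

READER_BASE = "https://r.jina.ai/"
SEARCH_BASE = "https://s.jina.ai/"


def reader_url(page_url: str) -> str:
    """Append the target page URL to the Reader base URL."""
    return READER_BASE + page_url


def search_url(prompt: str) -> str:
    """Append the search prompt to the Search base URL."""
    # Assumption: percent-encode the prompt so it forms a valid path segment.
    return SEARCH_BASE + quote(prompt, safe="")


print(reader_url("https://example.com/post"))
print(search_url("airbyte connectors"))
```

The two helpers only build strings, so they can be checked without network access before any request is sent.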

If there are more endpoints you'd like to see supported, please create an issue.

Features

| Feature | Supported? |
| --- | --- |
| Full Refresh Sync | Yes |
| Incremental Sync | No |
| SSL connection | Yes |
| Namespaces | No |

Getting started

Requirements

  • Jina AI Bearer Token (For higher rate limits)
  • Reader URL
  • Search prompt

Setup guide

Go to https://jina.ai/reader/#apiform for the complete guide to pricing and the tokens available at each tier. The website also provides a free bearer token for testing through its interface.
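As a rough sketch of how a bearer token might be attached to a request, the snippet below prepares (but does not send) an HTTP request with Python's standard library. The function name `build_request` and the placeholder token are hypothetical; sending the token as an `Authorization: Bearer` header and asking for JSON with an `Accept` header are assumptions based on common bearer-token usage, so check the Jina guide linked above for the authoritative details.

```python
from typing import Optional
from urllib.request import Request


def build_request(url: str, api_key: Optional[str] = None) -> Request:
    """Prepare a GET request, adding the bearer token only when one is given."""
    # Assumption: request a JSON response body.
    headers = {"Accept": "application/json"}
    if api_key:
        # Assumption: the token is sent as a standard bearer credential.
        headers["Authorization"] = f"Bearer {api_key}"
    return Request(url, headers=headers)


req = build_request("https://r.jina.ai/https://example.com", api_key="YOUR_TOKEN")
print(req.full_url)
print(req.get_header("Authorization"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the call; leaving out `api_key` yields an unauthenticated request, which is subject to the lower rate limits mentioned in the requirements.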

Reference

Config fields reference

| Field | Type | Property name |
| --- | --- | --- |
| Jina AI Bearer Token | string | api_key |
| Reader URL | string | read_prompt |
| Search prompt | string | search_prompt |
| Set to true to create a "Buttons & Links" section at the end. This helps downstream LLMs or web agents navigate the page or take further actions. | boolean | gather_links |
| Set to true to create an "Images" section at the end. This gives downstream LLMs an overview of all visuals on the page, which may improve reasoning. | boolean | gather_images |
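To make the field reference concrete, here is an illustrative configuration written as a Python dictionary, together with a small type check. Only the property names and types come from the reference above; the values are placeholders, and `EXPECTED_TYPES` and `check_config` are hypothetical helpers, not part of the connector.

```python
import json

# Placeholder values; only the property names and types come from the reference.
config = {
    "api_key": "YOUR_JINA_TOKEN",
    "read_prompt": "https://example.com",
    "search_prompt": "airbyte connectors",
    "gather_links": True,
    "gather_images": False,
}

EXPECTED_TYPES = {
    "api_key": str,
    "read_prompt": str,
    "search_prompt": str,
    "gather_links": bool,
    "gather_images": bool,
}


def check_config(cfg: dict) -> list:
    """Return one message per known field whose value has the wrong type."""
    errors = []
    for name, expected in EXPECTED_TYPES.items():
        if name in cfg and not isinstance(cfg[name], expected):
            found = type(cfg[name]).__name__
            errors.append(f"{name}: expected {expected.__name__}, got {found}")
    return errors


print(check_config(config))
print(json.dumps(config, indent=2))
```

An empty list from `check_config` means every supplied field has the expected type; the JSON dump shows the shape such a configuration would take when serialized.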

Changelog

| Version | Date | Pull Request | Subject |
| --- | --- | --- | --- |
| 0.1.24 | 2024-12-21 | 50115 | Update dependencies |
| 0.1.23 | 2024-12-14 | 49274 | Starting with this version, the Docker image is now rootless. Please note that this and future versions will not be compatible with Airbyte versions earlier than 0.64 |
| 0.1.22 | 2024-12-12 | 48929 | Update dependencies |
| 0.1.21 | 2024-11-04 | 48170 | Update dependencies |
| 0.1.20 | 2024-10-28 | 47085 | Update dependencies |
| 0.1.19 | 2024-10-12 | 46768 | Update dependencies |
| 0.1.18 | 2024-10-05 | 46446 | Update dependencies |
| 0.1.17 | 2024-09-28 | 46205 | Update dependencies |
| 0.1.16 | 2024-09-21 | 45827 | Update dependencies |
| 0.1.15 | 2024-09-14 | 45565 | Update dependencies |
| 0.1.14 | 2024-09-07 | 45286 | Update dependencies |
| 0.1.13 | 2024-08-31 | 45015 | Update dependencies |
| 0.1.12 | 2024-08-24 | 44641 | Update dependencies |
| 0.1.11 | 2024-08-17 | 44235 | Update dependencies |
| 0.1.10 | 2024-08-12 | 43916 | Update dependencies |
| 0.1.9 | 2024-08-10 | 43469 | Update dependencies |
| 0.1.8 | 2024-08-03 | 43126 | Update dependencies |
| 0.1.7 | 2024-07-27 | 42675 | Update dependencies |
| 0.1.6 | 2024-07-20 | 42361 | Update dependencies |
| 0.1.5 | 2024-07-13 | 41692 | Update dependencies |
| 0.1.4 | 2024-07-10 | 41594 | Update dependencies |
| 0.1.3 | 2024-07-09 | 41245 | Update dependencies |
| 0.1.2 | 2024-07-06 | 40880 | Update dependencies |
| 0.1.1 | 2024-06-25 | 40359 | Update dependencies |
| 0.1.0 | 2024-06-25 | 39515 | Add Jina AI source |