Jina AI Reader
Overview
This connector provides access to the Jina Reader API in two modes:
- "Reader" mode (https://r.jina.ai/): Given one or more URLs as input, returns the content of those pages as Markdown text. The Reader endpoint extracts the core content from a URL and converts it into clean, LLM-friendly text, ensuring high-quality input for your agent and RAG systems.
- "Search" mode (https://s.jina.ai/): Similar to the Reader endpoint, but accepts a search prompt and returns the text from the top 5 search results.

Both API endpoints generate human-readable Markdown that downstream LLM and GenAI applications can also process efficiently. Both modes can be used in the same sync by following the configuration instructions below.
Available Streams
The output depends on the input content, but the JSON format of the response is the same in every case. To get results, append the URL or search prompt to the base URL https://r.jina.ai/ or https://s.jina.ai/.
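The URL construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the connector's actual code: the function names `reader_url` and `search_url` are hypothetical, and percent-encoding the search prompt is an assumption made so that spaces survive in the request path.

```python
from urllib.parse import quote

READER_BASE = "https://r.jina.ai/"
SEARCH_BASE = "https://s.jina.ai/"


def reader_url(target_url: str) -> str:
    """Append the page URL to the Reader base URL."""
    return READER_BASE + target_url


def search_url(prompt: str) -> str:
    """Append the search prompt to the Search base URL.

    Assumption: the prompt is percent-encoded so it is a valid URL path.
    """
    return SEARCH_BASE + quote(prompt, safe="")


print(reader_url("https://example.com"))
print(search_url("what is airbyte"))
```

Requesting either resulting URL should return the page or search content as described above.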
If there are more endpoints you'd like supported, please create an issue.
Features
Feature | Supported? |
---|---|
Full Refresh Sync | Yes |
Incremental Sync | No |
SSL connection | Yes |
Namespaces | No |
Getting started
Requirements
- Jina AI Bearer Token (For higher rate limits)
- Reader URL
- Search prompt
Setup guide
Go to https://jina.ai/reader/#apiform for the complete guide to pricing and tokens.
The website also provides a free bearer token for testing through its interface.
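As a rough sketch of how the bearer token might be attached to a request, the helper below builds HTTP headers with an optional `Authorization` entry. The function name `build_headers` is hypothetical, and the `Accept: application/json` header is an assumption about how a JSON response is requested; check the Jina documentation linked above before relying on either.

```python
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers; the token is optional (it raises rate limits)."""
    # Assumption: asking for JSON via the standard Accept header.
    headers = {"Accept": "application/json"}
    if api_key:
        # Standard bearer-token scheme for the Authorization header.
        headers["Authorization"] = f"Bearer {api_key}"
    return headers


print(build_headers())
print(build_headers("YOUR_JINA_BEARER_TOKEN"))
```

Without a token the request is simply sent unauthenticated, which matches the requirement above that the token is only needed for higher rate limits.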
Reference
Config fields reference
| Field | Type | Property name |
|---|---|---|
| | string | api_key |
| | string | read_prompt |
| | string | search_prompt |
| Set this to true to create a "Buttons & Links" section at the end. This helps downstream LLMs or web agents navigate the page or take further actions. | boolean | gather_links |
| Set this to true to create an "Images" section at the end. This gives downstream LLMs an overview of all visuals on the page, which may improve reasoning. | boolean | gather_images |
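To make the property names and types above concrete, here is an illustrative configuration expressed as a Python dictionary, with a small type check. All values are placeholders, `check_config` is a hypothetical helper rather than part of the connector, and treating `read_prompt` as the URL to read is an assumption based on the requirements list.

```python
# Placeholder values; only the property names and types come from the reference.
config = {
    "api_key": "YOUR_JINA_BEARER_TOKEN",
    "read_prompt": "https://example.com",    # assumed: the URL to read
    "search_prompt": "airbyte connectors",
    "gather_links": True,
    "gather_images": False,
}

EXPECTED_TYPES = {
    "api_key": str,
    "read_prompt": str,
    "search_prompt": str,
    "gather_links": bool,
    "gather_images": bool,
}


def check_config(cfg: dict) -> list:
    """Return the property names whose values have an unexpected type."""
    return [
        name
        for name, expected in EXPECTED_TYPES.items()
        if name in cfg and not isinstance(cfg[name], expected)
    ]


print(check_config(config))
```

An empty list from `check_config` means every supplied property has the type listed in the reference.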
Changelog
Version | Date | Pull Request | Subject |
---|---|---|---|
0.1.24 | 2024-12-21 | 50115 | Update dependencies |
0.1.23 | 2024-12-14 | 49274 | Starting with this version, the Docker image is now rootless. Please note that this and future versions will not be compatible with Airbyte versions earlier than 0.64 |
0.1.22 | 2024-12-12 | 48929 | Update dependencies |
0.1.21 | 2024-11-04 | 48170 | Update dependencies |
0.1.20 | 2024-10-28 | 47085 | Update dependencies |
0.1.19 | 2024-10-12 | 46768 | Update dependencies |
0.1.18 | 2024-10-05 | 46446 | Update dependencies |
0.1.17 | 2024-09-28 | 46205 | Update dependencies |
0.1.16 | 2024-09-21 | 45827 | Update dependencies |
0.1.15 | 2024-09-14 | 45565 | Update dependencies |
0.1.14 | 2024-09-07 | 45286 | Update dependencies |
0.1.13 | 2024-08-31 | 45015 | Update dependencies |
0.1.12 | 2024-08-24 | 44641 | Update dependencies |
0.1.11 | 2024-08-17 | 44235 | Update dependencies |
0.1.10 | 2024-08-12 | 43916 | Update dependencies |
0.1.9 | 2024-08-10 | 43469 | Update dependencies |
0.1.8 | 2024-08-03 | 43126 | Update dependencies |
0.1.7 | 2024-07-27 | 42675 | Update dependencies |
0.1.6 | 2024-07-20 | 42361 | Update dependencies |
0.1.5 | 2024-07-13 | 41692 | Update dependencies |
0.1.4 | 2024-07-10 | 41594 | Update dependencies |
0.1.3 | 2024-07-09 | 41245 | Update dependencies |
0.1.2 | 2024-07-06 | 40880 | Update dependencies |
0.1.1 | 2024-06-25 | 40359 | Update dependencies |
0.1.0 | 2024-06-25 | 39515 | Add Jina AI source |