Tools
Tools are external capabilities an AI agent can invoke.
Why you need tools
Tools allow agents to perceive, decide, and act beyond their training data. Think of Airbyte's agent connectors as collections of tools that fulfill this purpose.
By default, large language models lack real-time knowledge, are stateless, and can't act on or verify facts, which limits what they can do. Tools expand those capabilities. Tools are callable functions, services, and interfaces that an AI agent can use to:
- Retrieve information it doesn't have
- Perform computations or transformations
- Interact with external systems
- Trigger side-effects, like sending emails, updating databases, and triggering workflows
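As a rough illustration of "callable functions an agent can use", the sketch below registers a plain Python function under a name and dispatches a model-issued call to it. The `TOOLS` registry, the `tool` decorator, and `call_tool` are hypothetical names made up for this example. They are not part of Airbyte or any agent framework, which handle this wiring for you.

```python
from typing import Callable

# Hypothetical registry: maps a tool name to a plain Python callable.
TOOLS: dict[str, Callable[..., str]] = {}


def tool(func):
    """Register a function so an agent loop can look it up by name."""
    TOOLS[func.__name__] = func
    return func


@tool
def add_numbers(a: float, b: float) -> str:
    """Perform a computation instead of letting the model guess the result."""
    return str(a + b)


def call_tool(name: str, arguments: dict) -> str:
    """Dispatch a model-issued tool call to the registered function."""
    if name not in TOOLS:
        return f"Unknown tool: {name}"
    return TOOLS[name](**arguments)


print(call_tool("add_numbers", {"a": 2, "b": 3}))  # prints 5
```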
How to call tools
When you expose a connector as a tool for an AI agent, the agent needs to understand what entities and actions are available, what parameters each action requires, and how to paginate through results. Without this information in the tool description, agents make incorrect API calls or require extra discovery calls to understand the API.
Airbyte agent connectors provide two approaches for defining tools:
- Manual docstrings for fine-grained control
- Auto-generated descriptions for comprehensive coverage with minimal code
Use auto-generated descriptions unless you have a specific reason not to. For example, you might want to avoid exposing a particular operation as a tool, or the auto-generated docstrings might be insufficient for your needs.
Manual docstrings
You can manually define individual tools with hand-written docstrings. This approach works well when you want to expose only specific operations or need custom parameter handling. However, it requires writing and maintaining docstrings for each tool, and the agent only knows about the operations you define.
from pydantic_ai import Agent
from airbyte_agent_github import GithubConnector

agent = Agent("openai:gpt-4o")
connector = GithubConnector(auth_config=...)

@agent.tool_plain
async def list_issues(owner: str, repo: str, limit: int = 10) -> str:
    """List open issues in a GitHub repository."""
    result = await connector.issues.list(owner=owner, repo=repo, states=["OPEN"], per_page=limit)
    return str(result.data)
The docstring is the tool's description, which helps the LLM understand when to use it. The function parameters become the tool's input schema, so the LLM knows what arguments to provide.
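To make the docstring-to-description and parameters-to-schema mapping concrete, here is a simplified sketch that uses only the standard library's `inspect` module. The `describe_tool` helper is hypothetical. Real frameworks such as Pydantic AI build a richer JSON schema, so treat this as an approximation of the idea rather than their actual implementation.

```python
import inspect


def describe_tool(func) -> dict:
    """Build a simplified tool description from a docstring and signature."""
    signature = inspect.signature(func)
    parameters = {}
    for name, param in signature.parameters.items():
        parameters[name] = {
            # Use the annotation's name when it is a plain type such as str.
            "type": getattr(param.annotation, "__name__", str(param.annotation)),
            # A parameter with no default must be supplied by the LLM.
            "required": param.default is inspect.Parameter.empty,
        }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": parameters,
    }


async def list_issues(owner: str, repo: str, limit: int = 10) -> str:
    """List open issues in a GitHub repository."""
    return ""  # Body omitted; only the signature and docstring matter here.


tool_spec = describe_tool(list_issues)
print(tool_spec["description"])
print(tool_spec["parameters"]["limit"])
```

In this sketch, `owner` and `repo` come out as required while `limit` is optional because it has a default value.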
Auto-generated descriptions
For comprehensive tool coverage, use the @Connector.describe decorator. This decorator reads the connector's metadata and automatically generates a detailed docstring that includes all available entities, actions, parameters, and response structures.
from pydantic_ai import Agent
from airbyte_agent_github import GithubConnector

agent = Agent("openai:gpt-4o")
connector = GithubConnector(auth_config=...)

@agent.tool_plain
@GithubConnector.describe
async def github_execute(entity: str, action: str, params: dict | None = None):
    return await connector.execute(entity, action, params or {})
The decorator automatically expands the docstring to include all available entities and actions, their required and optional parameters, response structure details, and pagination guidance. This gives the LLM everything it needs to correctly call the connector without additional discovery calls.
Decorator order matters
When using the describe decorator with agent frameworks like Pydantic AI or FastMCP, decorator order matters. The @Connector.describe decorator must be the inner decorator (closest to the function definition) because frameworks capture docstrings at decoration time.
Correct ordering:
@agent.tool_plain          # Outer: framework decorator captures __doc__
@GithubConnector.describe  # Inner: sets __doc__ before framework sees it
async def github_execute(entity: str, action: str, params: dict | None = None):
    ...
If you reverse the order, the framework captures the original docstring before describe has a chance to expand it, and the agent won't see the auto-generated documentation.
Custom docstrings
In most cases, use a connector's default docstrings, which connectors generate automatically. However, if your agent misunderstands how to use the tool because the auto-generated docstrings aren't a good fit, you can supply your own instructions.
Only provide custom docstrings when you need to, and use them sparingly. Custom instructions may affect connector operations in ways Airbyte didn't intend, and could cause a future connector version to work incorrectly.
@agent.tool_plain
@GithubConnector.describe
async def github_execute(entity: str, action: str, params: dict | None = None):
    """Execute GitHub operations.

    IMPORTANT: entity must be a simple name like 'issues', 'repositories', 'pull_requests'.
    Action must be 'list', 'get', or 'api_search'.
    Owner/repo info goes in params dict, e.g., params={"owner": "airbytehq", "repo": "airbyte"}
    """
    return await connector.execute(entity, action, params or {})
Introspection
Beyond the describe decorator, connectors provide programmatic introspection methods for runtime discovery.
list_entities()
Returns structured data about all available entities, their actions, and parameters.
entities = connector.list_entities()
for entity in entities:
    print(f"{entity['entity_name']}: {entity['available_actions']}")
# Output: customers: ['list', 'get', 'search']
Each entity description includes the entity name, a description, the available actions, and detailed parameter information for each action: parameter names, types, whether they're required, and their location (path, query, or body).
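One possible use of this data is to check an entity and action pair before executing it. The sketch below relies only on the `entity_name` and `available_actions` keys shown in the example above. The helper names and the sample data are hypothetical, and a real connector would supply the list from `list_entities()`.

```python
def build_action_index(entities: list[dict]) -> dict[str, set[str]]:
    """Map each entity name to the set of actions it supports."""
    return {e["entity_name"]: set(e["available_actions"]) for e in entities}


def is_supported(index: dict[str, set[str]], entity: str, action: str) -> bool:
    """Return True only if the entity exists and supports the action."""
    return action in index.get(entity, set())


# Sample data shaped like the output shown above (an assumption for this demo).
sample_entities = [
    {"entity_name": "customers", "available_actions": ["list", "get", "search"]},
]

index = build_action_index(sample_entities)
print(is_supported(index, "customers", "get"))     # True
print(is_supported(index, "customers", "delete"))  # False
print(is_supported(index, "invoices", "list"))     # False
```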
entity_schema()
Returns the JSON schema for a specific entity. This is useful for understanding the structure of returned data.
schema = connector.entity_schema("customers")
if schema:
    print(f"Customer properties: {list(schema.get('properties', {}).keys())}")