Knowledge Bases - Aster Agents

Overview

Knowledge Bases are organized collections of documents that your agents can search and reference during conversations. Upload your company policies, research papers, technical documentation, and any reference materials to make them instantly searchable by your AI agents.

Key Features

Multi-Format Document Support

Upload documents in any of the following formats. Content is extracted automatically so your agents can search it:

File Type	Formats	Notes
PDF Documents	.pdf	Text extraction with OCR fallback for scanned documents
Word Documents	.docx	Full formatting preservation
PowerPoint	.pptx	Slide content extraction
Spreadsheets	.xlsx, .xls, .xlsm	Table structure conversion
CSV Files	.csv	Structured data with automatic dialect detection
Images	.png, .jpg, .jpeg, .gif, .bmp, .webp	AI vision for text and content extraction
Text Files	.txt, .md, .json, .xml	Plain text processing
HTML Files	.htm, .html	Web content analysis
YAML Files	.yaml, .yml	Configuration and data files
Outlook Messages	.msg	Email content extraction

File size limit: 250MB per file
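
A pre-upload check against the table above can catch unsupported files before they reach a knowledge base. The sketch below is illustrative only: the helper name `check_upload` is hypothetical, and it assumes "250MB" means 250 × 1024 × 1024 bytes, which this page does not specify.

```python
from pathlib import PurePath

# Extensions taken from the supported-formats table above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".pptx", ".xlsx", ".xls", ".xlsm", ".csv",
    ".png", ".jpg", ".jpeg", ".gif", ".bmp", ".webp",
    ".txt", ".md", ".json", ".xml", ".htm", ".html",
    ".yaml", ".yml", ".msg",
}

# Assumption: "250MB" is interpreted as 250 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 250 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = PurePath(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds the 250MB limit")
    return problems
```

Running a check like this locally before a bulk upload gives faster feedback than waiting for a processing error.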

Intelligent Processing Pipeline

Each uploaded document passes through the following processing steps, in order:

1. Secure Storage: Files are uploaded to secure cloud storage
2. Content Extraction: Text is extracted from the supported file formats
3. Structured Data Extraction: AI extracts structured information using your custom JSON schemas (optional)
4. Smart Chunking: Documents are split into chunks while preserving structure and context
5. Vector Embeddings: AI generates searchable embeddings using your organization’s configured embedding model
6. Database Storage: Chunks and embeddings are indexed for fast retrieval
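
To make the chunking step more concrete, here is a minimal sketch of one structure-preserving strategy: grouping whole paragraphs so that no paragraph is cut in half. This is an assumption for illustration, not a description of how Aster Agents actually chunks documents, and `chunk_by_paragraph` is a hypothetical name.

```python
def chunk_by_paragraph(text: str, max_chars: int = 500) -> list[str]:
    """Group whole paragraphs into chunks of at most max_chars where possible.

    A single paragraph longer than max_chars is kept whole rather than split.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # Adding this paragraph would overflow: close the chunk, start a new one.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Keeping paragraphs intact means each chunk carries enough surrounding context to be meaningful when it is returned on its own in a search result.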

Structured Data Extraction

Configure your knowledge bases to extract structured information from documents, creating an index of numeric and categorical fields for cross-document analysis:

Custom JSON Schemas: Define exactly what data to extract from each file
AI Model Selection: Choose which AI model processes your documents
Automatic Processing: Extraction happens during file upload
Cross-Document Queries: Filter and aggregate across all files without scanning each one

For detailed guidance on writing extraction schemas, see Extraction Schemas.
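
As a rough idea of what an extraction schema might look like, the sketch below builds a small JSON Schema for invoices in Python. The field names are invented for illustration, and the exact schema features the platform accepts are an assumption here; the Extraction Schemas page is the authoritative reference.

```python
import json

# Hypothetical schema for invoices; field names are illustrative only.
invoice_schema = {
    "type": "object",
    "properties": {
        "vendor_name": {"type": "string"},
        "invoice_date": {"type": "string", "format": "date"},
        "total_amount": {"type": "number"},  # numeric field, useful for aggregation
        "currency": {"type": "string", "enum": ["USD", "EUR", "GBP"]},  # categorical field
    },
    "required": ["vendor_name", "total_amount"],
}

# Serialize to JSON text, e.g. to paste into the knowledge base settings.
schema_text = json.dumps(invoice_schema, indent=2)
```

Numeric fields such as `total_amount` and categorical fields such as `currency` are the kind of values that lend themselves to filtering and aggregation across files.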

Advanced Search Capabilities

Your agents can search knowledge bases using the search_knowledge_base tool with:

Semantic Search: AI understands context and meaning, not just keywords
Hybrid Search: Combines vector similarity with traditional text search
File Filtering: Search within specific documents or exclude certain files
Ranked Results: Results ordered by relevance with similarity scores
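
The idea behind hybrid search can be sketched as a weighted blend of a vector-similarity score and a keyword score. The weighting scheme, the `alpha` parameter, and all function names below are assumptions chosen for illustration; this page does not document the platform's actual scoring.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of distinct query terms that appear in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def hybrid_rank(query, query_vec, chunks, alpha=0.7):
    """chunks: list of (text, embedding). Returns (score, text) pairs, best first."""
    scored = [
        (alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(query, text), text)
        for text, vec in chunks
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Blending the two signals lets exact term matches (product codes, names) surface even when their embeddings are not the closest, while semantic similarity still covers paraphrased queries.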

Getting Started

Creating a Knowledge Base

1. Navigate to Control Hub → Knowledge Bases
2. Click “Create Knowledge Base”
3. Configure your settings:
   - Name & Description: Help your team understand the content
   - Embedding Model: Choose from your organization’s configured models
   - Extraction Schema: Optional JSON schema for structured data extraction

Uploading Documents

Drag & Drop: Simply drag files into the knowledge base interface
Bulk Upload: Select multiple files at once
Real-time Processing: Watch files process with live status updates
Processing Status: See extraction progress and any errors

Email Documents to Your Knowledge Base

Every knowledge base has a unique email address that you can use to add documents without opening the app.

How to use:

1. Find your KB’s email address in the upload area
2. Send an email with attachments to that address
3. The attached documents are added and processed automatically

Security:

Only organization members can add documents via email
Sender email is verified against your team’s Clerk user accounts
Unauthorized senders are silently ignored
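
The sender check described above can be pictured as a simple allow-list lookup. The sketch below is an assumption about the general shape of such a check, not the platform's implementation: `accept_inbound` is a hypothetical helper, and the member list is assumed to have been loaded from your user directory elsewhere. A real mail pipeline would also need to guard against spoofed headers.

```python
from email.utils import parseaddr


def accept_inbound(from_header: str, member_emails: set[str]) -> bool:
    """Return True only when the sender address belongs to a known member.

    Callers are expected to drop the message without replying when this
    returns False, mirroring the "silently ignored" behavior.
    """
    _, address = parseaddr(from_header)
    return address.lower() in {m.lower() for m in member_emails}
```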

Use cases:

Forward emails with attachments directly to your KB
Add documents from your phone without logging in
Set up automated workflows that email documents to knowledge bases
Quickly share files from any device with email access

Trigger Agents on File Upload

Knowledge bases can automatically start an AI agent conversation whenever a file finishes processing. This enables document automation workflows.

How to configure:

1. Go to your knowledge base settings (click edit)
2. Enable “Trigger Agent on File Upload”
3. Select which agent should process the documents
4. Write instructions telling the agent what to do with the document

What the agent receives:

The full extracted text content of the document
Structured extraction data (if you’ve configured an extraction schema)
The filename and knowledge base context

Example instructions:

Analyze this document and:
Summarize the key points in 3-5 bullet points
Extract any action items or deadlines mentioned
Identify the document type (invoice, contract, report, etc.)
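
To picture how your instructions and the document context might come together, here is a minimal sketch that assembles them into a single opening message. The `TriggerContext` structure, its field names, and the message layout are all hypothetical; the platform's internal format is not documented on this page.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class TriggerContext:
    """Hypothetical container for what the agent receives."""
    filename: str
    knowledge_base: str
    extracted_text: str
    structured_data: Optional[dict] = None  # present only if a schema is configured


def build_first_message(instructions: str, ctx: TriggerContext) -> str:
    """Combine the configured instructions with the document context."""
    parts = [
        instructions.strip(),
        f"File: {ctx.filename} (knowledge base: {ctx.knowledge_base})",
        "Document text:\n" + ctx.extracted_text,
    ]
    if ctx.structured_data is not None:
        parts.append("Extracted data:\n" + json.dumps(ctx.structured_data, indent=2))
    return "\n\n".join(parts)
```

Because the agent sees the full document text, instructions work best when they say what to produce rather than asking the agent to go and find the file.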

Viewing triggered conversations:

Files that triggered a conversation show a “View Conversation” option in their menu
The conversation is linked to the file for easy reference

Use cases:

Invoice Processing: Automatically extract line items and totals from uploaded invoices
Contract Analysis: Summarize key terms and flag important clauses
Report Summarization: Generate executive summaries of lengthy documents
Content Routing: Have an agent read documents and route them to appropriate teams

Notes:

Triggers fire for all upload methods: drag-and-drop, bulk upload, and email-to-KB
Only files uploaded by organization members trigger conversations (the uploader becomes the conversation owner)
If a file fails processing, no trigger fires until the file is successfully processed
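
The conditions in these notes can be summarized as a single predicate. This is a sketch for reasoning about the rules, with a hypothetical function name and a made-up status value (`"processed"`); the platform's real status names may differ.

```python
def should_trigger(trigger_enabled: bool, processing_status: str, uploader_is_member: bool) -> bool:
    """True when a finished upload should start an agent conversation.

    Assumption: "processed" stands in for whatever status marks success.
    """
    return trigger_enabled and processing_status == "processed" and uploader_is_member
```

A file that failed and is later retried successfully would satisfy the predicate at that later point, which matches the last note above.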

Configuring Agents

Once your knowledge base is ready:

1. Go to your agent configuration
2. Enable the “Search Knowledge Base” tool
3. Your agent can now access and search your documents during conversations

Agent Integration

Natural Language Queries

Your agents can search using natural language:

“Find information about our refund policy”
“What does the Q3 financial report say about revenue growth?”
“Show me technical specifications for our new product”

Advanced Search Options

Agents can also use advanced filtering:

Search only within specific files
Exclude outdated documents
Control the number of results returned
Get detailed metadata about search results
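
The filtering options above behave like an include list, an exclude list, and a result cap applied to ranked chunks. The sketch below shows that logic on plain dictionaries. The record shape (`file`, `score`) and the parameter names are assumptions for illustration and are not the actual arguments of the search_knowledge_base tool.

```python
def filter_chunks(chunks, include_files=None, exclude_files=None, limit=5):
    """Keep chunks from allowed files, best score first, capped at limit.

    chunks: list of dicts with 'file' and 'score' keys (hypothetical shape).
    """
    include = set(include_files) if include_files else None
    exclude = set(exclude_files or [])
    kept = [
        c for c in chunks
        if (include is None or c["file"] in include) and c["file"] not in exclude
    ]
    kept.sort(key=lambda c: c["score"], reverse=True)
    return kept[:limit]
```

Excluding a known-outdated file is often simpler than listing every current one, especially in a large knowledge base.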

Search Results Include

Relevant Content: The actual text chunks that match the query
Source Information: File names, page numbers, and document metadata
Similarity Scores: How relevant each result is to the query
Search Method: Whether found via semantic or text search
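
One way to think about a search result is as a small record with the four parts listed above. The `SearchResult` class and its field names below are hypothetical, chosen to mirror this list rather than the real response format, and `cite` shows how such a record could be turned into a short source citation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResult:
    """Hypothetical record mirroring the four result parts listed above."""
    content: str          # the matching text chunk
    file_name: str        # source file
    page: Optional[int]   # page number, when the format has pages
    similarity: float     # relevance score
    method: str           # "semantic" or "text"


def cite(result: SearchResult) -> str:
    """Format a compact citation for a result."""
    if result.page is not None:
        location = f"{result.file_name}, p. {result.page}"
    else:
        location = result.file_name
    return f"[{location}] ({result.method}, score {result.similarity:.2f})"
```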

Use Cases

Customer Support

Upload FAQs, product manuals, and policy documents
Agents can instantly find answers to customer questions
Ensure consistent, accurate responses across your team

Research & Analysis

Store research papers, market reports, and analysis documents
Agents can synthesize information across multiple sources
Extract insights and trends from large document collections

Technical Documentation

Upload API docs, system specifications, and troubleshooting guides
Agents can help with code reviews and technical questions
Keep documentation searchable and accessible

Compliance & Legal

Store contracts, regulations, and compliance documents
Agents can quickly reference relevant policies and procedures
Ensure adherence to legal requirements and standards

Integration-Managed Knowledge Bases

Some knowledge bases are automatically populated by external integrations (e.g., Salesforce, SharePoint, or custom sync services). These are called integration-managed knowledge bases.

How to Identify

A badge appears next to the knowledge base name showing the integration source
An info banner displays at the top of the knowledge base detail page
Example: “Managed by Salesforce”

What’s Different

Feature	User-Managed KB	Integration-Managed KB
Upload files	✅ Yes	❌ No (synced automatically)
Delete files	✅ Yes	❌ No (synced automatically)
Retry failed processing	✅ Yes	✅ Yes
Search & query	✅ Yes	✅ Yes
View extracted data	✅ Yes	✅ Yes
Delete knowledge base	✅ Yes	❌ No (via integration settings)

Why This Matters

Integration-managed KBs are kept in sync with external systems. If you were to manually upload or delete files, those changes would be lost on the next sync. The UI protections ensure your data stays consistent with the source system.
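
The comparison table can also be read as a lookup from action to permission. The sketch below encodes it that way; the action keys and the `is_allowed` helper are hypothetical names, and the values simply mirror the table rows.

```python
# Rows mirror the "What's Different" table; keys are illustrative names.
PERMISSIONS = {
    "upload_files": {"user": True, "integration": False},
    "delete_files": {"user": True, "integration": False},
    "retry_failed_processing": {"user": True, "integration": True},
    "search_and_query": {"user": True, "integration": True},
    "view_extracted_data": {"user": True, "integration": True},
    "delete_knowledge_base": {"user": True, "integration": False},
}


def is_allowed(action: str, integration_managed: bool) -> bool:
    """Look up whether an action is available for the given kind of knowledge base."""
    kind = "integration" if integration_managed else "user"
    return PERMISSIONS[action][kind]
```

Read-only and recovery actions stay available in both cases; only the actions that would diverge from the source system are blocked.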

Organization & Management

Organization-Scoped

Each knowledge base belongs to your organization
Admin controls for secure document management
Team members see only knowledge bases they have access to

File Management

View all uploaded files with processing status
Remove outdated or incorrect documents
Monitor storage usage and document counts

Performance Monitoring

Track search usage and performance
Monitor embedding generation status
View extracted structured data

Best Practices

Document Organization

Create Topic-Specific Bases: Separate knowledge bases for different subject areas
Use Descriptive Names: Clear names help agents and users understand content
Regular Updates: Remove outdated documents to maintain search quality
Consistent Formatting: Well-formatted documents produce better search results

Search Optimization

Structured Content: Use headings, bullet points, and clear sections
Complete Information: Include context and background in documents
Avoid Duplicates: Multiple versions of the same content can confuse search
Test Searches: Verify your agents can find key information
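
A quick way to catch exact duplicates before uploading is to compare content hashes. The sketch below is a local helper under the assumption that you have the file bytes at hand; `find_duplicates` is a hypothetical name, and it only detects byte-identical files, not edited versions of the same document.

```python
import hashlib


def find_duplicates(files: dict[str, bytes]) -> dict[str, list[str]]:
    """Map each content hash that occurs more than once to its file names."""
    by_hash: dict[str, list[str]] = {}
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        by_hash.setdefault(digest, []).append(name)
    return {h: names for h, names in by_hash.items() if len(names) > 1}
```

Near-duplicates such as successive drafts still need a manual review, since even a one-character change produces a different hash.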

Security Considerations

Sensitive Data: Only upload documents appropriate for AI processing
Access Controls: Use organization-level access management
Regular Audits: Review uploaded content periodically
Compliance: Ensure uploaded documents comply with your data policies

API Integration

Knowledge bases are available through the Aster Agents API. You can run searches programmatically or build custom workflows on top of your document collections. For detailed API documentation, see the API Reference section.

Transform your documents into searchable knowledge that your agents can access instantly. Knowledge bases make your information work harder for your team, providing AI-powered insights from your existing documentation and files.
