Overview

Knowledge Bases are organized collections of documents that your agents can search and reference during conversations. Upload your company policies, research papers, technical documentation, and any reference materials to make them instantly searchable by your AI agents.

Key Features

Multi-Format Document Support

Upload documents in various formats and your agents will automatically understand the content:
| File Type | Formats | Notes |
| --- | --- | --- |
| PDF Documents | .pdf | Text extraction with OCR fallback for scanned documents |
| Word Documents | .docx | Full formatting preservation |
| PowerPoint | .pptx | Slide content extraction |
| Spreadsheets | .xlsx, .xls, .xlsm | Table structure conversion |
| CSV Files | .csv | Structured data with automatic dialect detection |
| Images | .png, .jpg, .jpeg, .gif, .bmp, .webp | AI vision for text and content extraction |
| Text Files | .txt, .md, .json, .xml | Plain text processing |
| HTML Files | .htm, .html | Web content analysis |
| YAML Files | .yaml, .yml | Configuration and data files |
| Outlook Messages | .msg | Email content extraction |
File size limit: 250MB per file
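
If you upload from a script, a local pre-check can catch unsupported files before they are sent. The following is a minimal sketch, not part of the product: `check_upload` is a hypothetical helper, the extension list is copied from the table above, and reading "250MB" as 250 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

# Extensions copied from the supported-formats table above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".pptx", ".xlsx", ".xls", ".xlsm", ".csv",
    ".png", ".jpg", ".jpeg", ".gif", ".bmp", ".webp",
    ".txt", ".md", ".json", ".xml", ".htm", ".html",
    ".yaml", ".yml", ".msg",
}

# Assumption: the docs do not say whether the limit uses decimal or binary
# megabytes; this sketch uses binary (1 MB = 1024 * 1024 bytes).
MAX_FILE_BYTES = 250 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is larger than {MAX_FILE_BYTES} bytes")
    return problems
```

For example, `check_upload("Report.PDF", 1024)` returns an empty list because the extension check is case-insensitive.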

Intelligent Processing Pipeline

When you upload documents, they go through an advanced processing pipeline:
  1. Secure Storage: Files uploaded to secure cloud storage
  2. Content Extraction: Text extracted from various file formats
  3. Structured Data Extraction: AI extracts structured information using your custom JSON schemas (optional)
  4. Smart Chunking: Documents split while preserving structure and context
  5. Vector Embeddings: AI generates searchable embeddings using your organization’s model
  6. Database Storage: Chunks, embeddings, and extracted data are indexed for fast retrieval
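
To give a feel for the chunking step, here is a minimal sketch of structure-aware splitting. It is an illustration only and not the platform's actual algorithm: `chunk_paragraphs` and `max_chars` are hypothetical names, and the only "structure" it respects is the blank line between paragraphs.

```python
def chunk_paragraphs(text: str, max_chars: int = 1000) -> list[str]:
    """Group whole paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept intact as its own
    chunk rather than being cut mid-sentence.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            # Either the paragraph fits, or it starts a new (possibly oversized) chunk.
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Keeping paragraphs whole means each chunk stays readable on its own, which is the property the pipeline description refers to as preserving structure and context.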

Structured Data Extraction

Configure your knowledge bases to extract structured information from documents, creating an index of numeric and categorical fields for cross-document analysis:
  • Custom JSON Schemas: Define exactly what data to extract from each file
  • AI Model Selection: Choose which AI model processes your documents
  • Automatic Processing: Extraction happens during file upload
  • Cross-Document Queries: Filter and aggregate across all files without scanning each one
For detailed guidance on writing extraction schemas, see Extraction Schemas.
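
As an illustration of what a schema and a cross-document query might look like, here is a small sketch. Everything in it is an assumption for the example: the invoice fields, the `conforms` check (which covers only required fields and two simple types), and the `total_by_vendor` aggregation. The product's own schema format is described in Extraction Schemas.

```python
from collections import defaultdict

# Hypothetical extraction schema for invoices, written as JSON Schema.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "total": {"type": "number"},
        "currency": {"type": "string"},
    },
    "required": ["vendor", "total"],
}

# Only the two JSON Schema types used above are mapped.
_PY_TYPES = {"string": str, "number": (int, float)}


def conforms(record: dict, schema: dict) -> bool:
    """Check a flat record against required fields and simple types only."""
    if any(field not in record for field in schema["required"]):
        return False
    for field, value in record.items():
        spec = schema["properties"].get(field)
        if spec and not isinstance(value, _PY_TYPES[spec["type"]]):
            return False
    return True


def total_by_vendor(records: list[dict]) -> dict[str, float]:
    """Aggregate one numeric field across the extracted records of many files."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        if conforms(record, INVOICE_SCHEMA):
            totals[record["vendor"]] += record["total"]
    return dict(totals)
```

Because each file contributes one small structured record, the aggregation never has to re-read the documents themselves.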

Advanced Search Capabilities

Your agents can search knowledge bases using the search_knowledge_base tool with:
  • Semantic Search: AI understands context and meaning, not just keywords
  • Hybrid Search: Combines vector similarity with traditional text search
  • File Filtering: Search within specific documents or exclude certain files
  • Ranked Results: Results ordered by relevance with similarity scores
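
The idea behind hybrid search can be sketched as a weighted blend of a vector score and a keyword score. This is a simplified illustration, not the ranking the product uses: the function names, the word-overlap keyword score, and the `alpha` weight of 0.7 are all assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of distinct query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_score(query: str, query_vec: list[float],
                 text: str, text_vec: list[float], alpha: float = 0.7) -> float:
    """Blend the two signals: alpha for vector similarity, 1 - alpha for keywords."""
    semantic = cosine_similarity(query_vec, text_vec)
    lexical = keyword_score(query, text)
    return alpha * semantic + (1 - alpha) * lexical
```

A blend like this lets exact terms such as product codes still match when the embedding alone would rank them low.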

Getting Started

Creating a Knowledge Base

  1. Navigate to Control Hub > Knowledge Bases
  2. Click “Create Knowledge Base”
  3. Configure your settings:
    • Name & Description: Help your team understand the content
    • Embedding Model: Choose from your organization’s configured models
    • Extraction Schema: Optional JSON schema for structured data extraction

Uploading Documents

  • Drag & Drop: Simply drag files into the knowledge base interface
  • Bulk Upload: Select multiple files at once
  • Real-time Processing: Watch files process with live status updates
  • Processing Status: See extraction progress and any errors

Email Documents to Your Knowledge Base

Every knowledge base has a unique email address that you can use to add documents without opening the app.
How to use:
  1. Find your KB’s email address in the upload area (e.g., [email protected])
  2. Send an email with attachments to that address
  3. Documents are automatically added and processed
Security:
  • Only organization members can add documents via email
  • Sender email is verified against your team’s Clerk user accounts
  • Unauthorized senders are silently ignored
Use cases:
  • Forward emails with attachments directly to your KB
  • Add documents from your phone without logging in
  • Set up automated workflows that email documents to knowledge bases
  • Quickly share files from any device with email access
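
The sender check described under Security can be pictured with the sketch below. It is a simplification under stated assumptions: `is_authorized_sender` and `accept_attachments` are hypothetical names, the member list is passed in as a plain set, and only the From header is inspected. A From header on its own is easy to spoof, so a real mail pipeline would need more than this comparison.

```python
from email.utils import parseaddr


def is_authorized_sender(from_header: str, member_emails: set[str]) -> bool:
    """True if the address in the From header belongs to an organization member."""
    _, address = parseaddr(from_header)
    return bool(address) and address.lower() in {m.lower() for m in member_emails}


def accept_attachments(from_header: str, attachments: list[str],
                       member_emails: set[str]) -> list[str]:
    """Return the attachments to ingest, or an empty list for unknown senders."""
    if not is_authorized_sender(from_header, member_emails):
        return []  # silently ignored: no bounce message, no error
    return list(attachments)
```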

Trigger Agents on File Upload

Knowledge bases can automatically start an AI agent conversation whenever a file finishes processing. This enables powerful document automation workflows.
How to configure:
  1. Go to your knowledge base settings (click edit)
  2. Enable “Trigger Agent on File Upload”
  3. Select which agent should process the documents
  4. Write instructions telling the agent what to do with the document
What the agent receives:
  • The full extracted text content of the document
  • Structured extraction data (if you’ve configured an extraction schema)
  • The filename and knowledge base context
Example instructions:
Analyze this document and:
1. Summarize the key points in 3-5 bullet points
2. Extract any action items or deadlines mentioned
3. Identify the document type (invoice, contract, report, etc.)
Viewing triggered conversations:
  • Files that triggered a conversation show a “View Conversation” option in their menu
  • The conversation is linked to the file for easy reference
Use cases:
  • Invoice Processing: Automatically extract line items and totals from uploaded invoices
  • Contract Analysis: Summarize key terms and flag important clauses
  • Report Summarization: Generate executive summaries of lengthy documents
  • Content Routing: Have an agent read documents and route them to appropriate teams
Notes:
  • Triggers fire for all upload methods: drag-and-drop, bulk upload, and email-to-KB
  • Only files uploaded by organization members trigger conversations (the uploader becomes the conversation owner)
  • If a file fails processing, no trigger fires until the file is successfully processed
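
To make "what the agent receives" concrete, the sketch below assembles a first message from the pieces listed above. The layout is an assumption for illustration; `build_trigger_message` is a hypothetical helper and the actual message format used by the platform is not documented here.

```python
import json
from typing import Optional


def build_trigger_message(instructions: str, filename: str, kb_name: str,
                          text: str, extraction: Optional[dict] = None) -> str:
    """Combine your instructions with the file context into one message."""
    parts = [
        instructions.strip(),
        f"Knowledge base: {kb_name}",
        f"File: {filename}",
    ]
    # Structured data is only present when an extraction schema is configured.
    if extraction is not None:
        parts.append("Extracted data:\n" + json.dumps(extraction, indent=2))
    parts.append("Document text:\n" + text)
    return "\n\n".join(parts)
```

Writing instructions that work without the extracted data section keeps the same trigger usable for knowledge bases with and without a schema.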

Configuring Agents

Once your knowledge base is ready:
  1. Go to your agent configuration
  2. Enable the “Search Knowledge Base” tool
  3. Your agent can now access and search your documents during conversations

Agent Integration

Natural Language Queries

Your agents can search using natural language:
  • “Find information about our refund policy”
  • “What does the Q3 financial report say about revenue growth?”
  • “Show me technical specifications for our new product”

Advanced Search Options

Agents can also use advanced filtering:
  • Search only within specific files
  • Exclude outdated documents
  • Control the number of results returned
  • Get detailed metadata about search results
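
These options behave like a filter applied on top of the ranked results. The sketch below shows that logic on plain dictionaries; the parameter names `include_files`, `exclude_files`, and `limit`, and the `file` and `score` keys, are assumptions for the example and not the documented arguments of the search_knowledge_base tool.

```python
from typing import Optional


def filter_results(results: list[dict],
                   include_files: Optional[set[str]] = None,
                   exclude_files: Optional[set[str]] = None,
                   limit: int = 5) -> list[dict]:
    """Keep results from allowed files, best score first, capped at limit."""
    kept = [
        r for r in results
        if (include_files is None or r["file"] in include_files)
        and (exclude_files is None or r["file"] not in exclude_files)
    ]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:limit]
```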

Search Results Include

  • Relevant Content: The actual text chunks that match the query
  • Source Information: File names, page numbers, and document metadata
  • Similarity Scores: How relevant each result is to the query
  • Search Method: Whether found via semantic or text search
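
One way to picture a single result is as a small record with the fields listed above. The field names below are assumptions chosen for the sketch, not the tool's actual response format, and `cite` is a hypothetical helper that turns a result into a short source line.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResult:
    content: str          # the matching text chunk
    file_name: str        # source document
    page: Optional[int]   # page number, when the format has pages
    similarity: float     # relevance score for the query
    method: str           # "semantic" or "text"


def cite(result: SearchResult) -> str:
    """Format a short source line for a result."""
    if result.page is None:
        location = result.file_name
    else:
        location = f"{result.file_name}, p. {result.page}"
    return f"[{location}] ({result.method}, score {result.similarity:.2f})"
```

For example, a semantic hit on page 4 of policy.pdf with similarity 0.873 is rendered as `[policy.pdf, p. 4] (semantic, score 0.87)`.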

Use Cases

Customer Support

  • Upload FAQs, product manuals, and policy documents
  • Agents can instantly find answers to customer questions
  • Ensure consistent, accurate responses across your team

Research & Analysis

  • Store research papers, market reports, and analysis documents
  • Agents can synthesize information across multiple sources
  • Extract insights and trends from large document collections

Technical Documentation

  • Upload API docs, system specifications, and troubleshooting guides
  • Agents can help with code reviews and technical questions
  • Keep documentation searchable and accessible

Legal & Compliance

  • Store contracts, regulations, and compliance documents
  • Agents can quickly reference relevant policies and procedures
  • Ensure adherence to legal requirements and standards

Integration-Managed Knowledge Bases

Some knowledge bases are automatically populated by external integrations (e.g., Salesforce, SharePoint, or custom sync services). These are called integration-managed knowledge bases.

How to Identify

  • A badge appears next to the knowledge base name showing the integration source
  • An info banner displays at the top of the knowledge base detail page
  • Example: “Managed by Salesforce”

What’s Different

| Feature | User-Managed KB | Integration-Managed KB |
| --- | --- | --- |
| Upload files | ✅ Yes | ❌ No (synced automatically) |
| Delete files | ✅ Yes | ❌ No (synced automatically) |
| Retry failed processing | ✅ Yes | ✅ Yes |
| Search & query | ✅ Yes | ✅ Yes |
| View extracted data | ✅ Yes | ✅ Yes |
| Delete knowledge base | ✅ Yes | ❌ No (via integration settings) |

Why This Matters

Integration-managed KBs are kept in sync with external systems. If you were to manually upload or delete files, those changes would be lost on the next sync. The UI protections ensure your data stays consistent with the source system.
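
If you build tooling around knowledge bases, the comparison above can be mirrored as a simple lookup. This is a sketch: the action keys and the `is_allowed` helper are hypothetical names, and the True/False values are copied from the comparison table.

```python
# Values copied from the comparison table above; True means the action
# is available directly on the knowledge base.
ALLOWED_ACTIONS = {
    "upload_files":            {"user_managed": True, "integration_managed": False},
    "delete_files":            {"user_managed": True, "integration_managed": False},
    "retry_failed_processing": {"user_managed": True, "integration_managed": True},
    "search_and_query":        {"user_managed": True, "integration_managed": True},
    "view_extracted_data":     {"user_managed": True, "integration_managed": True},
    "delete_knowledge_base":   {"user_managed": True, "integration_managed": False},
}


def is_allowed(action: str, integration_managed: bool) -> bool:
    """Look up whether an action is available for the given kind of knowledge base."""
    kind = "integration_managed" if integration_managed else "user_managed"
    return ALLOWED_ACTIONS[action][kind]
```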

Organization & Management

Organization-Scoped

  • Each knowledge base belongs to your organization
  • Admin controls for secure document management
  • Team members see only knowledge bases they have access to

File Management

  • View all uploaded files with processing status
  • Remove outdated or incorrect documents
  • Monitor storage usage and document counts

Performance Monitoring

  • Track search usage and performance
  • Monitor embedding generation status
  • View extracted structured data

Best Practices

Document Organization

  1. Create Topic-Specific Bases: Separate knowledge bases for different subject areas
  2. Use Descriptive Names: Clear names help agents and users understand content
  3. Regular Updates: Remove outdated documents to maintain search quality
  4. Consistent Formatting: Well-formatted documents produce better search results

Search Optimization

  1. Structured Content: Use headings, bullet points, and clear sections
  2. Complete Information: Include context and background in documents
  3. Avoid Duplicates: Multiple versions of the same content can confuse search
  4. Test Searches: Verify your agents can find key information
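
Exact duplicates can be spotted before upload by fingerprinting each document's text. The sketch below is one possible approach, assuming you already have the text of each file: `content_fingerprint` and `find_duplicates` are hypothetical helpers, and the check only catches copies that match after lowercasing and whitespace normalization, not near-duplicates or revised versions.

```python
import hashlib


def content_fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(docs: dict[str, str]) -> list[tuple[str, str]]:
    """Return (duplicate, original) filename pairs, in insertion order."""
    seen: dict[str, str] = {}
    duplicates: list[tuple[str, str]] = []
    for name, text in docs.items():
        fingerprint = content_fingerprint(text)
        if fingerprint in seen:
            duplicates.append((name, seen[fingerprint]))
        else:
            seen[fingerprint] = name
    return duplicates
```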

Security Considerations

  1. Sensitive Data: Only upload documents appropriate for AI processing
  2. Access Controls: Use organization-level access management
  3. Regular Audits: Review uploaded content periodically
  4. Compliance: Ensure uploaded documents comply with your data policies

API Integration

Knowledge bases integrate with the Aster Agents API. You can run searches programmatically or build custom workflows on top of your document collections. For detailed API documentation, see the API Reference section.
Transform your documents into searchable knowledge that your agents can access instantly. Knowledge bases make your information work harder for your team, providing AI-powered insights from your existing documentation and files.