Overview

Knowledge Bases are organized collections of documents that your agents can search and reference during conversations. Upload company policies, research papers, technical documentation, and other reference materials to make them searchable by your AI agents.

Key Features

Multi-Format Document Support

Upload documents in any of the following formats; their content is extracted automatically so your agents can search it:
  • PDF Documents: With OCR fallback for scanned documents
  • Word Documents: .docx and .doc files with formatting preservation
  • Spreadsheets: Excel files (.xlsx, .xls, .xlsm) with table structure conversion
  • CSV Files: Structured data tables with dialect detection
  • Text Files: Markdown, JSON, XML, and plain text
  • HTML Files: .htm and .html files for web content analysis
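
As an illustration of what "dialect detection" for CSV files can mean, here is a minimal sketch using Python's standard csv.Sniffer. It is not the product's actual implementation; the function name and the sample data are assumptions made for this example.

```python
import csv
import io


def read_csv_with_detected_dialect(text: str) -> list[list[str]]:
    """Parse CSV text after guessing its delimiter and quoting style.

    Hypothetical helper for illustration only; the platform's own
    detection logic is not documented here.
    """
    sample = text[:4096]  # sniff a prefix rather than the whole file
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters=",;\t|")
    except csv.Error:
        dialect = csv.excel  # fall back to comma-separated defaults
    return list(csv.reader(io.StringIO(text), dialect))


# Semicolon-separated sample data (invented for this example).
rows = read_csv_with_detected_dialect(
    "name;region;revenue\nAcme;EMEA;1200\nGlobex;APAC;950\n"
)
for row in rows:
    print(row)
```

The fallback matters because sniffing can fail on very short or irregular samples; in that case the sketch simply assumes a comma-separated file.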

Intelligent Processing Pipeline

When you upload documents, they pass through the following processing pipeline:
  1. Secure Storage: Files are uploaded to secure cloud storage
  2. Content Extraction: Text is extracted from each supported file format
  3. Structured Data Extraction: AI extracts structured information using your custom JSON schemas (optional)
  4. Smart Chunking: Documents are split into chunks while preserving structure and context
  5. Vector Embeddings: AI generates searchable embeddings using your organization’s configured embedding model
  6. Database Storage: All content is indexed for fast retrieval
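
To make the chunking step more concrete, the sketch below shows one simple way to split a document while keeping each chunk tied to its section heading. It assumes Markdown-style "#" headings and a character budget per chunk; the Chunk type, the chunk_by_heading function, and the max_chars parameter are all hypothetical and may differ from what the platform does internally.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    heading: str  # nearest heading above the text, kept as context
    text: str


def chunk_by_heading(document: str, max_chars: int = 500) -> list[Chunk]:
    """Group paragraphs under their heading, starting a new chunk
    whenever the character budget would be exceeded (illustrative only)."""
    chunks: list[Chunk] = []
    heading = ""
    buffer: list[str] = []

    def flush() -> None:
        if buffer:
            chunks.append(Chunk(heading, "\n\n".join(buffer)))
            buffer.clear()

    # Paragraphs are assumed to be separated by blank lines.
    for block in re.split(r"\n\s*\n", document.strip()):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):
            flush()  # never mix text from two sections in one chunk
            first_line, _, rest = block.partition("\n")
            heading = first_line.lstrip("#").strip()
            block = rest.strip()
            if not block:
                continue
        if buffer and sum(len(b) for b in buffer) + len(block) > max_chars:
            flush()
        buffer.append(block)
    flush()
    return chunks


doc = (
    "# Refunds\n\nRefunds are issued within 14 days.\n\n"
    "Contact support to start one.\n\n# Shipping\n\nOrders ship in 2 days."
)
chunks = chunk_by_heading(doc, max_chars=40)
for chunk in chunks:
    print(chunk.heading, "|", chunk.text)
```

Carrying the heading with every chunk is one way to preserve context: a chunk retrieved on its own still indicates which section it came from.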

Structured Data Extraction

Configure your knowledge bases to extract structured information from documents:
  • Custom JSON Schemas: Define exactly what data to extract from each file
  • AI Model Selection: Choose which AI model processes your documents
  • Automatic Processing: Extraction happens during file upload
  • Flexible Output: Results stored as structured JSON data alongside searchable text
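
For illustration, a custom extraction schema might look like the sketch below. The field names (title, effective_date, refund_window_days) are invented for this example, and the small checker only covers required fields and basic types; it is not a full JSON Schema validator and not the platform's own validation logic.

```python
import json

# Hypothetical schema for a policy document; every field name is an assumption.
EXTRACTION_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "effective_date": {"type": "string", "description": "ISO 8601 date"},
        "refund_window_days": {"type": "integer"},
    },
    "required": ["title", "effective_date"],
}

_JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def check_extraction(result: dict, schema: dict) -> list[str]:
    """Return a list of problems found in an extracted result (light check only)."""
    problems = []
    for key in schema.get("required", []):
        if key not in result:
            problems.append(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key not in result:
            continue
        declared = spec.get("type")
        expected = _JSON_TYPES.get(declared)
        value = result[key]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_stray_bool = isinstance(value, bool) and declared != "boolean"
        if expected and (is_stray_bool or not isinstance(value, expected)):
            problems.append(f"wrong type for {key}")
    return problems


print(json.dumps(EXTRACTION_SCHEMA, indent=2))
print(check_extraction({"title": "Refund Policy", "refund_window_days": "30"}, EXTRACTION_SCHEMA))
```

Keeping the required list short lets documents that lack optional fields still produce a usable result.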

Advanced Search Capabilities

Your agents can search knowledge bases using the search_knowledge_base tool with:
  • Semantic Search: AI understands context and meaning, not just keywords
  • Hybrid Search: Combines vector similarity with traditional text search
  • File Filtering: Search within specific documents or exclude certain files
  • Ranked Results: Results ordered by relevance with similarity scores
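
This page does not say how hybrid search merges its vector and text results. One widely used approach is reciprocal rank fusion (RRF), sketched below purely as an assumption: the function name, the constant k, and the chunk IDs are illustrative and do not describe the product's internals.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Merge several ranked lists of chunk IDs with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) for every ID it contains, so an
    ID ranked well by both methods rises to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented rankings: one from vector similarity, one from text search.
vector_ranking = ["a", "b", "c"]
text_ranking = ["b", "d", "a"]
fused = rrf_fuse([vector_ranking, text_ranking])
for chunk_id, score in fused:
    print(chunk_id, round(score, 4))
```

RRF uses only rank positions, so it avoids having to put cosine similarities and text-match scores on a common scale.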

Getting Started

Creating a Knowledge Base

  1. Navigate to Control Hub > Knowledge Bases
  2. Click “Create Knowledge Base”
  3. Configure your settings:
    • Name & Description: Help your team understand the content
    • Embedding Model: Choose from your organization’s configured models
    • Extraction Schema: Optional JSON schema for structured data extraction

Uploading Documents

  • Drag & Drop: Simply drag files into the knowledge base interface
  • Bulk Upload: Select multiple files at once
  • Real-time Processing: Watch files process with live status updates
  • Processing Status: See extraction progress and any errors

Configuring Agents

Once your knowledge base is ready:
  1. Go to your agent configuration
  2. Enable the “Search Knowledge Base” tool
  3. Your agent can now access and search your documents during conversations

Agent Integration

Natural Language Queries

Your agents can search using natural language:
  • “Find information about our refund policy”
  • “What does the Q3 financial report say about revenue growth?”
  • “Show me technical specifications for our new product”

Advanced Search Options

Agents can also use advanced filtering:
  • Search only within specific files
  • Exclude outdated documents
  • Control the number of results returned
  • Get detailed metadata about search results

Search Results Include

  • Relevant Content: The actual text chunks that match the query
  • Source Information: File names, page numbers, and document metadata
  • Similarity Scores: How relevant each result is to the query
  • Search Method: Whether found via semantic or text search
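
The filtering options and result fields described above can be pictured with a small data model. Only the tool name search_knowledge_base comes from this page; the parameter names (include_files, exclude_files, max_results) and result field names below are assumptions for illustration, so check the API Reference for the real ones.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SearchRequest:
    """Hypothetical arguments an agent might pass to search_knowledge_base."""
    query: str
    include_files: list[str] = field(default_factory=list)  # empty = all files
    exclude_files: list[str] = field(default_factory=list)
    max_results: int = 5


@dataclass
class SearchResult:
    """Hypothetical shape of one result, mirroring the fields listed above."""
    text: str
    file_name: str
    page: Optional[int]
    similarity: float
    method: str  # "semantic" or "text"


def apply_filters(results: list[SearchResult], request: SearchRequest) -> list[SearchResult]:
    """Apply file filters, order by similarity, and cap the result count."""
    kept = [
        r for r in results
        if (not request.include_files or r.file_name in request.include_files)
        and r.file_name not in request.exclude_files
    ]
    kept.sort(key=lambda r: r.similarity, reverse=True)
    return kept[: request.max_results]


# Invented sample results.
results = [
    SearchResult("Refunds within 14 days.", "refund_policy.pdf", 2, 0.91, "semantic"),
    SearchResult("Refunds within 30 days.", "old_policy.pdf", 1, 0.88, "text"),
    SearchResult("How do I get a refund?", "faq.md", None, 0.73, "semantic"),
    SearchResult("Returns overview.", "handbook.docx", 14, 0.64, "text"),
]
request = SearchRequest("refund policy", exclude_files=["old_policy.pdf"], max_results=2)
top = apply_filters(results, request)
for r in top:
    print(r.file_name, r.similarity, r.method)
```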

Use Cases

Customer Support

  • Upload FAQs, product manuals, and policy documents
  • Agents can instantly find answers to customer questions
  • Ensure consistent, accurate responses across your team

Research & Analysis

  • Store research papers, market reports, and analysis documents
  • Agents can synthesize information across multiple sources
  • Extract insights and trends from large document collections

Technical Documentation

  • Upload API docs, system specifications, and troubleshooting guides
  • Agents can help with code reviews and technical questions
  • Keep documentation searchable and accessible

Legal & Compliance

  • Store contracts, regulations, and compliance documents
  • Agents can quickly reference relevant policies and procedures
  • Ensure adherence to legal requirements and standards

Organization & Management

Organization-Scoped

  • Each knowledge base belongs to your organization
  • Admin controls for secure document management
  • Team members see only knowledge bases they have access to

File Management

  • View all uploaded files with processing status
  • Remove outdated or incorrect documents
  • Monitor storage usage and document counts

Performance Monitoring

  • Track search usage and performance
  • Monitor embedding generation status
  • View extracted structured data

Best Practices

Document Organization

  1. Create Topic-Specific Bases: Separate knowledge bases for different subject areas
  2. Use Descriptive Names: Clear names help agents and users understand content
  3. Regular Updates: Remove outdated documents to maintain search quality
  4. Consistent Formatting: Well-formatted documents produce better search results

Search Optimization

  1. Structured Content: Use headings, bullet points, and clear sections
  2. Complete Information: Include context and background in documents
  3. Avoid Duplicates: Multiple versions of the same content can confuse search
  4. Test Searches: Verify your agents can find key information
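
One lightweight way to carry out the "Test Searches" practice is to keep a list of key questions together with the file each one should surface, and to check them after every upload. The sketch below assumes a search callable that returns file names in ranked order; stub_search stands in for a real call, and all names and data are hypothetical.

```python
from typing import Callable


def find_unanswered(
    search: Callable[[str], list[str]],
    expectations: dict[str, str],
    top_k: int = 3,
) -> list[str]:
    """Return the queries whose expected file is missing from the top_k results."""
    missing = []
    for query, expected_file in expectations.items():
        top_files = search(query)[:top_k]
        if expected_file not in top_files:
            missing.append(query)
    return missing


def stub_search(query: str) -> list[str]:
    """Stand-in for a real knowledge base search (hard-coded sample data)."""
    index = {
        "refund policy": ["refund_policy.pdf", "faq.md"],
        "q3 revenue growth": ["q2_report.pdf"],
    }
    return index.get(query.lower(), [])


expectations = {
    "refund policy": "refund_policy.pdf",
    "Q3 revenue growth": "q3_report.pdf",
}
missing = find_unanswered(stub_search, expectations)
print("Queries needing attention:", missing)
```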

Security Considerations

  1. Sensitive Data: Only upload documents appropriate for AI processing
  2. Access Controls: Use organization-level access management
  3. Regular Audits: Review uploaded content periodically
  4. Compliance: Ensure uploaded documents comply with your data policies

API Integration

Knowledge bases integrate seamlessly with the Aster Agents API. Use the search functionality programmatically or build custom workflows that leverage your document collections. For detailed API documentation, see the API Reference section.

Transform your documents into searchable knowledge that your agents can access instantly. Knowledge bases make your information work harder for your team, providing AI-powered insights from your existing documentation and files.