Knowledge Bases

Overview

Knowledge Bases allow you to upload documents that your AI agents can search through and reference. They support advanced features like structured data extraction and intelligent file filtering.

Creating a Knowledge Base

Navigate to Control Hub → Knowledge Bases
Click Create Knowledge Base
Configure the following settings:

Basic Settings

Name: A descriptive name for your knowledge base
Description: Optional description of the content and purpose
Embedding Model: Choose the AI model for generating search embeddings

Structured Data Extraction (Optional)

Extraction Model: AI model to use for structured data extraction
Extraction Schema: JSON Schema defining what data to extract from each document

Supported File Types

Upload any of these file formats:

File Type	Formats	Notes
PDF Documents	.pdf	Text extraction with OCR fallback for scanned documents
Word Documents	.docx, .doc	Full formatting preservation
Spreadsheets	.xlsx, .xls, .xlsm	Table structure conversion with formula support
CSV Files	.csv	Structured data processing with automatic dialect detection
Text Files	.txt, .md, .json, .xml	Plain text processing
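
Before a bulk upload, it can help to screen file names locally against the table above. The following is a minimal sketch, not part of the product: the helper names `is_supported` and `split_by_support` are hypothetical, and the check looks only at the file extension, not at the file's actual contents.

```python
from pathlib import Path

# Extensions copied from the Supported File Types table above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".doc", ".xlsx", ".xls", ".xlsm",
    ".csv", ".txt", ".md", ".json", ".xml",
}


def is_supported(filename: str) -> bool:
    """Return True if the file's extension appears in the table above."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Partition file names into (supported, unsupported) lists."""
    supported = [f for f in filenames if is_supported(f)]
    unsupported = [f for f in filenames if not is_supported(f)]
    return supported, unsupported
```

The comparison is case-insensitive, so `Report.PDF` is treated the same as `report.pdf`.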

File Processing Pipeline

When you upload files, they go through this automated pipeline:

Upload: Secure cloud storage
Content Extraction: Text extracted from various formats
Structured Extraction: AI extracts data based on your schema (if configured)
Intelligent Chunking: Documents split while preserving structure
Vector Embeddings: AI generates searchable embeddings
Storage: Everything indexed for fast retrieval
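
The pipeline's chunking logic is not documented here. As an illustration of what "split while preserving structure" can mean, the sketch below packs whole paragraphs into chunks up to a size limit instead of cutting at a fixed character offset. The function name `chunk_paragraphs` and the `max_chars` parameter are assumptions made for this example, not product settings.

```python
def chunk_paragraphs(text: str, max_chars: int = 1000) -> list[str]:
    """Pack whole paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept intact as its own
    chunk rather than being cut in the middle.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first paragraph of a new chunk.
            current = candidate
        else:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Keeping paragraph boundaries intact means each chunk reads as a coherent unit, which generally helps the embedding step that follows.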

Structured Data Extraction

What is Structured Extraction?

Structured extraction uses AI to pull specific information from your documents into a consistent JSON format. Typical uses include:

Financial Reports: Extract revenue, costs, and key metrics
Contracts: Pull out dates, parties, and terms
Research Papers: Extract methodology, results, and conclusions
Product Specs: Standardize technical specifications

Setting Up Extraction

Choose an Extraction Model: Select from your available chat models
Define JSON Schema: Specify the structure you want extracted

Example JSON Schema

{
  "type": "object",
  "properties": {
    "title": {
      "type": "string",
      "description": "Document title or heading"
    },
    "key_metrics": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "metric": {"type": "string"},
          "value": {"type": "number"},
          "unit": {"type": "string"}
        }
      },
      "description": "Important numerical data points"
    },
    "summary": {
      "type": "string",
      "description": "Brief summary of main points"
    }
  },
  "required": ["title", "summary"]
}
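
To see how the `required` list and the property types in a schema like this constrain the output, the sketch below checks a record against a trimmed copy of the schema. It is an assumption-laden illustration: `check_record` is a hypothetical helper that inspects only top-level fields, it is not a full JSON Schema validator, and the sample record is invented for the example.

```python
# Maps JSON Schema type names to Python types (top-level check only).
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "array": list,
    "object": dict,
    "boolean": bool,
}


def check_record(record: dict, schema: dict) -> list[str]:
    """Return a list of problems found; an empty list means the record passed."""
    problems = []
    for name in schema.get("required", []):
        if name not in record:
            problems.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name not in record:
            continue
        type_name = spec.get("type")
        expected = JSON_TYPES.get(type_name)
        value = record[name]
        # bool is a subclass of int in Python, so reject it for "number".
        is_stray_bool = isinstance(value, bool) and type_name != "boolean"
        if expected and (is_stray_bool or not isinstance(value, expected)):
            problems.append(f"wrong type for field: {name}")
    return problems


# Trimmed version of the example schema above.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "key_metrics": {"type": "array"},
        "summary": {"type": "string"},
    },
    "required": ["title", "summary"],
}

# Hypothetical extraction output, for illustration only.
record = {"title": "Q3 Report", "key_metrics": "n/a"}
print(check_record(record, schema))
```

Here the record is flagged twice: `summary` is required but absent, and `key_metrics` is a string where the schema asks for an array.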

Viewing Extracted Data

Once processing completes, you can:

View extracted JSON data in the file details
Use the data in custom workflows
Export for analysis in other tools

Agent Integration

Enabling Knowledge Base Access

Edit your agent configuration
Enable the Search Knowledge Base tool
Select which knowledge bases the agent can access

Advanced Search Features

Your agents can use sophisticated search capabilities:

File Filtering

include_file_ids limits the search to the listed files, and exclude_file_ids skips the listed files:

{
  "query": "quarterly revenue",
  "knowledge_base_id": 123,
  "include_file_ids": [45, 67],
  "exclude_file_ids": [12, 34],
  "top_k": 10
}
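
If you assemble these search arguments in your own code, a small helper can catch contradictory filters early. This is a sketch under stated assumptions: `build_search_args` is a hypothetical function, the field names are taken from the example above, and how the platform treats an ID that appears in both lists is not documented here, so the sketch simply rejects that case. It also assumes that leaving a filter out means no filtering.

```python
import json


def build_search_args(query, knowledge_base_id, include_file_ids=None,
                      exclude_file_ids=None, top_k=10):
    """Assemble search arguments using the field names shown above."""
    include = list(include_file_ids or [])
    exclude = list(exclude_file_ids or [])
    overlap = set(include) & set(exclude)
    if overlap:
        raise ValueError(
            f"file IDs both included and excluded: {sorted(overlap)}"
        )
    args = {
        "query": query,
        "knowledge_base_id": knowledge_base_id,
        "top_k": top_k,
    }
    # Assumption: an omitted filter means "do not filter on this".
    if include:
        args["include_file_ids"] = include
    if exclude:
        args["exclude_file_ids"] = exclude
    return args


args = build_search_args("quarterly revenue", 123, include_file_ids=[45, 67])
print(json.dumps(args, indent=2))
```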

Search Types

Vector Search: Semantic similarity using AI embeddings
Text Search: Traditional keyword matching
Hybrid Search: Combines both approaches for best results
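
This page does not say how hybrid search merges the two result sets. One widely used technique is reciprocal rank fusion, shown below purely as an illustration of the idea and not as a description of the platform's implementation. The function name, the constant `k=60`, and the chunk IDs are all assumptions made for the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs into one list, best first.

    Each ID scores 1 / (k + rank) for every list it appears in
    (rank starts at 1), and the scores are summed across lists.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score, highest first; break ties by ID for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["chunk_a", "chunk_b", "chunk_c"]   # hypothetical semantic results
keyword_hits = ["chunk_b", "chunk_d", "chunk_a"]  # hypothetical keyword results
print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
```

In this example `chunk_b` comes out on top because it sits near the head of both lists, while chunks found by only one method fall to the bottom.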

Best Practices

Organization

Separate by Topic: Create different knowledge bases for different subjects
Regular Updates: Remove outdated documents to maintain search quality
Descriptive Names: Use clear, searchable file names

Structured Extraction

Start Simple: Begin with basic schemas and expand over time
Test with Sample Files: Validate your schema works before bulk uploads
Use Required Fields: Mark essential fields as required in your schema

Performance

Optimize File Sizes: Larger files take longer to process
Monitor Processing: Check the processing status of uploaded files
Use File Filtering: Help agents focus on relevant documents

Managing Knowledge Bases

File Management

Upload Status: Monitor processing progress for each file
Reprocess Files: Re-run extraction if you update your schema
Delete Files: Remove individual files without affecting others

Knowledge Base Settings

Update Schema: Modify extraction schema for future uploads
Change Models: Switch embedding or extraction models as needed
Access Control: Configure which agents can access the knowledge base

Troubleshooting

Processing Failures: Check file format compatibility
Poor Search Results: Consider adjusting your embedding model
Extraction Issues: Validate your JSON schema syntax
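
For extraction issues, a quick local syntax check can rule out a malformed schema before you reprocess files. The sketch below uses Python's standard `json` module; `check_schema_syntax` is a hypothetical helper, and it assumes an extraction schema is always a JSON object. It confirms only that the text parses, not that it is a meaningful schema.

```python
import json


def check_schema_syntax(schema_text: str):
    """Return None if schema_text parses as a JSON object, else an error message."""
    try:
        parsed = json.loads(schema_text)
    except json.JSONDecodeError as err:
        # lineno and colno point at the place where parsing failed.
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    if not isinstance(parsed, dict):
        # Assumption: an extraction schema is expected to be an object.
        return "schema must be a JSON object"
    return None
```

Common causes of a parse error include trailing commas, single quotes, and `//` comments, none of which are valid JSON.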

API Access

Knowledge bases are also accessible via API for programmatic integration:

# Search a knowledge base
curl -X POST "https://asteragents.com/api/kb/search" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "revenue trends",
    "knowledge_base_id": 123,
    "top_k": 5
  }'
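
The same request can be built from Python with only the standard library. This sketch mirrors the curl example: the endpoint, headers, and body fields come from that example, while the function name `build_search_request` is hypothetical. The response format is not documented on this page, so the sketch stops short of sending the request or parsing a reply.

```python
import json
import urllib.request

API_URL = "https://asteragents.com/api/kb/search"  # endpoint from the curl example


def build_search_request(api_key, query, knowledge_base_id, top_k=5):
    """Build (but do not send) the POST request from the curl example."""
    body = json.dumps({
        "query": query,
        "knowledge_base_id": knowledge_base_id,
        "top_k": top_k,
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_search_request("YOUR_API_KEY", "revenue trends", 123)
# To send it, pass the request to urllib.request.urlopen(request).
```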

Pricing Notes

File processing and storage are included in your plan
Embedding generation uses your model provider credits
Structured extraction uses your chat model credits
No additional fees for search operations

Knowledge bases provide a powerful way to give your AI agents access to your organization’s documents while maintaining full control over access and processing.
