Overview
Knowledge Bases allow you to upload documents that your AI agents can search through and reference. They support advanced features like structured data extraction and intelligent file filtering.
Creating a Knowledge Base
- Navigate to Control Hub → Knowledge Bases
- Click Create Knowledge Base
- Configure the following settings:
Basic Settings
- Name: A descriptive name for your knowledge base
- Description: Optional description of the content and purpose
- Embedding Model: Choose the AI model for generating search embeddings
Structured Data Extraction (Optional)
- Extraction Model: AI model to use for structured data extraction
- Extraction Schema: JSON Schema defining what data to extract from each document
Supported File Types
Upload any of these file formats:
File Type | Formats | Notes |
---|---|---|
PDF Documents | .pdf | Text extraction with OCR fallback for scanned documents |
Word Documents | .docx, .doc | Full formatting preservation |
Spreadsheets | .xlsx, .xls, .xlsm | Table structure conversion with formula support |
CSV Files | .csv | Structured data processing with automatic dialect detection |
Text Files | .txt, .md, .json, .xml | Plain text processing |
File Processing Pipeline
When you upload files, they go through this automated pipeline:
- Upload: Secure cloud storage
- Content Extraction: Text extracted from various formats
- Structured Extraction: AI extracts data based on your schema (if configured)
- Intelligent Chunking: Documents split while preserving structure
- Vector Embeddings: AI generates searchable embeddings
- Storage: Everything indexed for fast retrieval
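As a rough illustration of the chunking step, the sketch below groups whole paragraphs into chunks under a size limit so that no paragraph is cut in half. The function name `chunk_by_paragraph` and the `max_chars` limit are assumptions made for this example; the product's actual chunking logic may differ.

```python
from typing import List


def chunk_by_paragraph(text: str, max_chars: int = 500) -> List[str]:
    """Group whole paragraphs into chunks of at most max_chars where possible.

    A single paragraph longer than max_chars is kept whole as its own chunk
    rather than being split mid-paragraph.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            # The paragraph fits, or it is the first one in an empty chunk.
            current = candidate
        else:
            # Adding this paragraph would exceed the limit: close the chunk.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


sample = "A" * 10 + "\n\n" + "B" * 10 + "\n\n" + "C" * 10
chunks = chunk_by_paragraph(sample, max_chars=25)
print(len(chunks))
```

Splitting on paragraph boundaries is only one way to preserve structure; headings, table rows, or sentences could serve as boundaries in the same way.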
Structured Data Extraction
What is Structured Extraction?
Structured extraction uses AI to automatically pull specific information from your documents into a consistent JSON format. This is perfect for:
- Financial Reports: Extract revenue, costs, and key metrics
- Contracts: Pull out dates, parties, and terms
- Research Papers: Extract methodology, results, and conclusions
- Product Specs: Standardize technical specifications
Setting Up Extraction
- Choose an Extraction Model: Select from your available chat models
- Define JSON Schema: Specify the structure you want extracted
Example JSON Schema
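As an illustration, a schema for contract documents might look like the sketch below. It is built as a Python dict and serialized to JSON text of the kind the Extraction Schema setting expects. Every field name here (`parties`, `effective_date`, and so on) is a hypothetical example, and only standard JSON Schema keywords are used.

```python
import json

# Hypothetical schema for contracts; the field names are illustrative only.
contract_schema = {
    "type": "object",
    "properties": {
        "parties": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Names of the parties to the contract",
        },
        "effective_date": {
            "type": "string",
            "format": "date",
            "description": "Date the contract takes effect",
        },
        "termination_date": {
            "type": "string",
            "format": "date",
            "description": "Date the contract ends, if stated",
        },
        "key_terms": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Short summaries of the main terms",
        },
    },
    # Only the fields that every contract is expected to contain.
    "required": ["parties", "effective_date"],
}

# Serialize to JSON text.
schema_json = json.dumps(contract_schema, indent=2)
print(schema_json)
```

Keeping the `required` list short means documents that lack optional details can still produce a valid result.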
Viewing Extracted Data
Once processing completes, you can:
- View extracted JSON data in the file details
- Use the data in custom workflows
- Export for analysis in other tools
Agent Integration
Enabling Knowledge Base Access
- Edit your agent configuration
- Enable the Search Knowledge Base tool
- Select which knowledge bases the agent can access
Advanced Search Features
Your agents can use sophisticated search capabilities:
File Filtering
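Conceptually, file filtering narrows the set of files before a search runs. The sketch below shows the idea on an in-memory list; the record layout, the `filter_files` function, and its parameters are all hypothetical and do not represent the product's API.

```python
from fnmatch import fnmatch

# Hypothetical file records: a file name plus fields from structured extraction.
files = [
    {"name": "q1-2024-report.pdf", "extracted": {"doc_type": "financial_report", "year": 2024}},
    {"name": "q1-2023-report.pdf", "extracted": {"doc_type": "financial_report", "year": 2023}},
    {"name": "vendor-contract.docx", "extracted": {"doc_type": "contract", "year": 2024}},
]


def filter_files(records, name_pattern="*", **field_equals):
    """Keep records whose name matches the glob pattern and whose
    extracted fields equal every given keyword value."""
    return [
        r for r in records
        if fnmatch(r["name"], name_pattern)
        and all(r["extracted"].get(k) == v for k, v in field_equals.items())
    ]


reports_2024 = filter_files(
    files, name_pattern="*.pdf", doc_type="financial_report", year=2024
)
print([r["name"] for r in reports_2024])
```

Filtering first keeps irrelevant documents out of the candidate set, so the search step only ranks files that could plausibly answer the question.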
Search Types
- Vector Search: Semantic similarity using AI embeddings
- Text Search: Traditional keyword matching
- Hybrid Search: Combines both approaches for best results
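To make the three search types concrete, the sketch below scores toy documents with cosine similarity (vector search), keyword overlap (text search), and a weighted blend of the two (hybrid search). The hand-written two-dimensional vectors, the `alpha` weight, and all function names are assumptions for illustration; how the product actually combines scores is not described here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query words that appear in the text."""
    q_words = set(query.lower().split())
    t_words = set(text.lower().split())
    return len(q_words & t_words) / len(q_words) if q_words else 0.0


def hybrid_score(query, query_vec, text, text_vec, alpha=0.5):
    """Weighted blend: alpha for vector similarity, the rest for keywords."""
    return alpha * cosine(query_vec, text_vec) + (1 - alpha) * keyword_score(query, text)


# Toy documents with made-up embeddings (real embeddings have many more dimensions).
docs = [
    ("office holiday schedule", [0.1, 0.9]),
    ("how to get your money back", [0.8, 0.2]),
    ("refund policy for annual plans", [0.9, 0.1]),
]
query, query_vec = "refund policy", [1.0, 0.0]

ranked = sorted(
    docs,
    key=lambda d: hybrid_score(query, query_vec, d[0], d[1]),
    reverse=True,
)
for text, _ in ranked:
    print(text)
```

In this toy data the second document shares no words with the query, so keyword matching alone would miss it, while its vector is still close to the query's. Blending the scores lets it rank above the unrelated document.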
Best Practices
Organization
- Separate by Topic: Create different knowledge bases for different subjects
- Regular Updates: Remove outdated documents to maintain search quality
- Descriptive Names: Use clear, searchable file names
Structured Extraction
- Start Simple: Begin with basic schemas and expand over time
- Test with Sample Files: Validate your schema works before bulk uploads
- Use Required Fields: Mark essential fields as required in your schema
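When testing with sample files, one simple check is whether each extracted result contains every field the schema marks as required. The sketch below assumes you have copied the extracted JSON for a few sample files into Python dicts; the schema, the sample data, and the `missing_required` helper are hypothetical.

```python
from typing import List


def missing_required(schema: dict, extracted: dict) -> List[str]:
    """Return required field names that are absent or empty in the extracted data."""
    empty_values = (None, "", [])
    return [
        field
        for field in schema.get("required", [])
        if field not in extracted or extracted[field] in empty_values
    ]


# Hypothetical schema and sample extraction results.
schema = {
    "type": "object",
    "properties": {
        "revenue": {"type": "number"},
        "period": {"type": "string"},
        "notes": {"type": "string"},
    },
    "required": ["revenue", "period"],
}
samples = {
    "q1-report.pdf": {"revenue": 1250000, "period": "Q1 2024"},
    "q2-report.pdf": {"revenue": 1310000, "period": ""},
}

problems = {name: missing_required(schema, data) for name, data in samples.items()}
for name, fields in problems.items():
    print(name, "OK" if not fields else f"missing: {fields}")
```

A check like this on a handful of files can reveal whether a required field is too strict, or whether the schema's descriptions need to be clearer, before a bulk upload.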
Performance
- Optimize File Sizes: Larger files take longer to process
- Monitor Processing: Check the processing status of uploaded files
- Use File Filtering: Help agents focus on relevant documents
Managing Knowledge Bases
File Management
- Upload Status: Monitor processing progress for each file
- Reprocess Files: Re-run extraction if you update your schema
- Delete Files: Remove individual files without affecting others
Knowledge Base Settings
- Update Schema: Modify extraction schema for future uploads
- Change Models: Switch embedding or extraction models as needed
- Access Control: Configure which agents can access the knowledge base
Troubleshooting
- Processing Failures: Check file format compatibility
- Poor Search Results: Consider adjusting your embedding model
- Extraction Issues: Validate your JSON schema syntax
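For extraction issues, a quick local syntax check can rule out malformed JSON before you look elsewhere. The sketch below uses only Python's standard `json` module; the `check_schema_syntax` helper is hypothetical, and it assumes an extraction schema is a JSON object at the top level. It checks JSON syntax only, not whether the schema is meaningful.

```python
import json
from typing import Optional


def check_schema_syntax(schema_text: str) -> Optional[str]:
    """Return None if the text parses as a JSON object, else an error message."""
    try:
        schema = json.loads(schema_text)
    except json.JSONDecodeError as err:
        return f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
    if not isinstance(schema, dict):
        # Assumption: an extraction schema is expected to be a JSON object.
        return "Schema must be a JSON object at the top level"
    return None


good = '{"type": "object", "properties": {"title": {"type": "string"}}}'
bad = '{"type": "object",}'  # trailing comma is not valid JSON

print(check_schema_syntax(good))
print(check_schema_syntax(bad))
```

Trailing commas, single quotes, and unquoted keys are common causes of parse failures in hand-written schemas.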
API Access
Knowledge bases are also accessible via API for programmatic integration.
Pricing Notes
- File processing and storage are included in your plan
- Embedding generation uses your model provider credits
- Structured extraction uses your chat model credits
- No additional fees for search operations