What is a Knowledge Base?
A knowledge base is the collection of information your knowledge agent can search and reference when answering questions or making decisions. Think of it as your agent’s memory - the more relevant knowledge you add, the more helpful and accurate your agent becomes. Unlike traditional chatbots that only know what they were trained on, knowledge agents use Retrieval Augmented Generation (RAG) to dynamically search your knowledge and incorporate it into responses.
How Knowledge Retrieval Works (RAG Simplified)
When someone asks your knowledge agent a question, the agent searches your knowledge base for relevant content and incorporates what it finds into its response. This means:
- You can add lots of knowledge without “retraining”
- Answers come from your actual documents
- You can update knowledge anytime
- Agent cites sources (useful for verification)
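The retrieve-then-answer flow described above can be sketched in a few lines. This is a toy illustration, not the platform's implementation: word-count overlap stands in for real semantic embeddings, and the names `retrieve` and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[dict], k: int = 2) -> list[dict]:
    """Return up to k chunks most similar to the question (assumed scoring)."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(c["text"]))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


def build_prompt(question: str, hits: list[dict]) -> str:
    """Place the retrieved text, tagged with its source, ahead of the question."""
    context = "\n".join(f"[{h['source']}] {h['text']}" for h in hits)
    return (
        "Answer using only the sources below and cite them.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Because the answer is assembled from retrieved text at question time, adding or updating a document changes what the agent can say without any retraining, and the source tag on each chunk is what makes citation possible.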
Supported Knowledge Sources
You can train your knowledge agent with the following types of content:
| Source Type | What It’s For | Processing Time |
|---|---|---|
| Files | PDFs, Word docs, text files | 10-30 seconds |
| URLs | Web pages, articles, documentation | 15-45 seconds |
| YouTube | Video transcripts | 20-60 seconds |
| Google Docs | Workspace documents | 10-30 seconds |
| Google Sheets | Spreadsheet data | 10-30 seconds |
| Twitter/X | Tweets and threads | 15-45 seconds |
| LinkedIn | Profiles and posts | 20-60 seconds |
Once added, each source is:
- Chunked into searchable segments
- Embedded as vectors for semantic search
- Stored in your agent’s vector database
- Available for retrieval as soon as processing finishes
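To make the chunk, embed, and store steps concrete, here is a minimal sketch. The chunk size, the overlap, and the word-count "embedding" are all assumptions chosen for illustration; the platform's real chunking rules and embedding model are not documented here, and `index_source` is a hypothetical name.

```python
from collections import Counter


def chunk_text(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word windows of `size` words that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(chunk: str) -> Counter:
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(chunk.lower().split())


def index_source(name: str, text: str, store: list[dict], **chunking) -> int:
    """Chunk one source, embed each chunk, and append the records to `store`."""
    pieces = chunk_text(text, **chunking)
    for i, piece in enumerate(pieces):
        store.append(
            {"source": name, "chunk": i, "text": piece, "vector": embed(piece)}
        )
    return len(pieces)
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some words twice.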
Adding Knowledge: Step-by-Step
Files (PDF, DOCX, TXT)
Best for: Documentation, reports, guides, research papers
How to add:
- Navigate to your knowledge agent builder
- Click the “Training” tab
- Click the “Files” sub-tab
- Click “Upload” or drag and drop files
- Wait for processing (progress bar shows status)
- File appears in the list when ready
Supported formats:
- PDF (.pdf)
- Microsoft Word (.doc, .docx)
- Plain text (.txt)
- Markdown (.md)
Tips:
- PDFs work best when they contain actual text (not scanned images)
- Remove unnecessary pages to improve relevance
- File names help the agent understand context - use descriptive names
File size limit: 25MB per file. For larger documents, consider splitting them or using a URL if the content is available online.
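A quick local pre-check can catch the two most common upload problems before you drag files in. This sketch assumes "25MB" means 25 × 1024 × 1024 bytes and that the extension list above is complete; `check_upload` is a hypothetical helper, not part of the product.

```python
from pathlib import Path

MAX_BYTES = 25 * 1024 * 1024  # assumption: 25MB interpreted as 25 MiB
SUPPORTED = {".pdf", ".doc", ".docx", ".txt", ".md"}


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds the 25MB limit")
    return problems
```

For a file on disk, pass `Path(p).stat().st_size` as `size_bytes`. The check cannot detect scanned-image PDFs or password protection, which still need a manual look.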
Web URLs
Best for: Websites, blog posts, online documentation, public articles
How to add:
- Go to the “Training” tab
- Click the “URLs” sub-tab
- Paste the full URL (starting with https://)
- Click “Add URL”
- Content is scraped and processed automatically
What gets extracted:
- Main text content from the page
- Headings and structure
- Some metadata (title, author if available)
- Not extracted: Images, videos, interactive elements
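To preview roughly what a text-only extraction keeps from a page, you can run something like the sketch below on saved HTML. It uses Python's standard `html.parser` and is only an approximation: which tags the platform's scraper actually skips is not documented here, so the `SKIP` set is an assumption.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the title, headings, and body text; drop scripts, styles, and media."""

    SKIP = {"script", "style", "nav", "footer"}  # assumed boilerplate tags
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.lines: list[str] = []
        self._skip_depth = 0
        self._current = ""

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        self._current = tag

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        self._current = ""

    def handle_data(self, data):
        text = " ".join(data.split())
        if not text or self._skip_depth:
            return  # whitespace only, or inside a skipped element
        if self._current == "title":
            self.title = text
        elif self._current in self.HEADINGS:
            self.lines.append(f"# {text}")  # mark headings so structure survives
        else:
            self.lines.append(text)
```

Create a `TextExtractor`, call `feed(html)` with the page source, then read `title` and `lines`. Images and embedded videos contribute nothing, which matches the "Not extracted" note above; headings that contain inline tags are not handled by this simple version.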
Google Docs:
- Paste the sharing link (make sure it’s accessible via link)
- Agent automatically exports to readable format
- Formatting is preserved
Google Sheets:
- Paste the sharing link
- Data is exported and indexed
- Useful for product catalogs, pricing, data tables
Pro tip: For documentation sites with many pages, add the most important/overview pages. You don’t need every single page - the agent will direct users based on what you’ve added.
YouTube Videos
Best for: Tutorials, presentations, interviews, educational content
How to add:
- Go to the “Training” tab
- Click the “YouTube Videos” sub-tab
- Paste the YouTube video URL
- Click “Add Video”
- Agent automatically extracts the transcript
What gets indexed:
- Full transcript of spoken words
- Video title and description
- Channel information
- Key moments/chapters (if available)
Requirements:
- Video must have captions/subtitles (auto-generated works)
- Videos without transcripts cannot be processed
- Transcript language is auto-detected
Use cases:
- Index your tutorial videos so the agent can answer “how-to” questions
- Add conference talks or presentations
- Include product demos or walkthroughs
- Reference expert interviews or talks
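If you are adding many videos, it can help to confirm each link is a recognizable YouTube URL before pasting it. The sketch below extracts the video ID from the common public URL shapes (`watch?v=`, `youtu.be/`, `/embed/`, `/shorts/`); which of these the platform itself accepts is an assumption, and `youtube_video_id` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def youtube_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if it is not recognized."""
    parts = urlparse(url.strip())
    host = (parts.hostname or "").lower()
    host = host.removeprefix("www.").removeprefix("m.")
    candidate = ""
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parts.path.lstrip("/").split("/")[0]
    elif host == "youtube.com":
        if parts.path == "/watch":
            candidate = parse_qs(parts.query).get("v", [""])[0]
        elif parts.path.startswith(("/embed/", "/shorts/")):
            candidate = parts.path.split("/")[2]
    return candidate or None
```

A valid ID only shows the link is well formed. Whether the video has captions, or is private or age-restricted, still has to be checked on YouTube itself.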
Twitter/X Posts
Best for: Twitter threads, announcements, thought leadership content
How to add:
- Go to the “Training” tab
- Click the “Twitter” sub-tab
- Enter a Twitter username (without @) or paste a tweet URL
- Click “Add”
What gets indexed:
- Tweet text content
- Thread structure (if it’s a thread)
- Author information
- Timestamps
Use cases:
- Add your own tweets to train the agent on your thinking
- Include industry expert threads
- Reference announcement tweets
- Capture Twitter-based discussions
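The input field accepts either a username (without @) or a tweet URL, so a small normalizer can tidy a list of sources before you enter them. This is a sketch under stated assumptions: handles are taken to be 1 to 15 letters, digits, or underscores, tweet links are taken to look like `https://x.com/<user>/status/<id>`, and `parse_twitter_input` is a hypothetical name.

```python
import re
from urllib.parse import urlparse

HANDLE = re.compile(r"[A-Za-z0-9_]{1,15}")  # assumed handle rules


def parse_twitter_input(value: str) -> dict:
    """Classify the input as a profile or a single tweet; raise ValueError otherwise."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        parts = urlparse(value)
        host = (parts.hostname or "").lower().removeprefix("www.")
        segments = [s for s in parts.path.split("/") if s]
        is_tweet = (
            host in {"twitter.com", "x.com"}
            and len(segments) >= 3
            and segments[1] == "status"
            and segments[2].isdigit()
        )
        if is_tweet:
            return {"kind": "tweet", "username": segments[0], "tweet_id": segments[2]}
        raise ValueError("not a tweet URL")
    username = value.lstrip("@")  # the field expects the name without @
    if HANDLE.fullmatch(username):
        return {"kind": "profile", "username": username}
    raise ValueError("not a valid username")
```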
LinkedIn Content
Best for: Professional profiles, thought leadership posts, company updates
How to add:
- Go to the “Training” tab
- Click the “LinkedIn” sub-tab
- Enter a LinkedIn profile URL or post URL
- Click “Add”
What gets indexed:
- Profile headline and about section
- Recent posts and articles
- Experience and background (for profiles)
- Post content and engagement
Use cases:
- Add your LinkedIn profile to train the agent on your expertise
- Include company LinkedIn posts
- Reference industry leader profiles
- Capture professional insights and articles
Managing Your Knowledge Base
Viewing Your Knowledge
In the Training tab, you’ll see all your knowledge sources listed with:
- Source name/title
- Type (file, URL, video, etc.)
- Upload date
- Processing status
- File size or length
Refreshing Content
For URLs, YouTube videos, and social media sources, you can refresh the content to get updates:
- Find the source in the list
- Click the refresh icon next to it
- Agent re-fetches and re-processes the content
- Updated content replaces the old version
When to refresh:
- Documentation has been updated
- YouTube video captions were improved
- Twitter thread was extended
- Website content changed
Files can’t be refreshed - you’ll need to delete and re-upload if you have a newer version.
Deleting Knowledge
To remove a knowledge source:
- Find it in the Training tab list
- Click the delete icon (trash can)
- Confirm deletion
- Source is removed immediately from knowledge base
Organizing Your Knowledge
While there’s no folder structure, you can organize by:
- Using clear, descriptive file names
- Adding related content in batches
- Keeping a separate document tracking what you’ve added
- Deleting outdated content regularly
Knowledge Base Best Practices
Quality Over Quantity
Don’t do this:
- Upload hundreds of barely relevant documents
- Add your entire website including footer text and navigation
- Include duplicate or very similar content
- Add content “just in case”
Do this instead:
- Curate high-quality, relevant sources
- Include core documentation and key resources
- Remove or don’t include boilerplate/duplicate content
- Think “What do users actually need to know?”
Keep Content Fresh
- Review quarterly: Check if knowledge is still accurate
- Update when things change: New product features, policy changes, etc.
- Remove outdated info: Delete deprecated content
- Refresh URLs: Re-fetch content from living documents
Structure Matters
Good knowledge sources:
- Well-organized with clear headings
- Use bullet points and lists
- Have logical flow
- Include examples and specifics
Poor knowledge sources:
- Wall of text with no structure
- Overly vague or general
- Lots of irrelevant tangents
- Poorly formatted (weird spacing, encoding issues)
Match Your Use Case
For Q&A agents:
- Add FAQs, help docs, policies
- Include troubleshooting guides
- Add product documentation
For research agents:
- Add research papers and reports
- Include industry analysis
- Add expert content and thought leadership
For process and operations agents:
- Add process documentation
- Include how-to guides
- Add standard operating procedures
Test Your Knowledge
After adding knowledge, test if the agent can retrieve it:
- Ask direct questions from the content
- Ask questions that require combining multiple sources
- Try edge cases or less obvious questions
- Check if sources are cited correctly
If the agent doesn’t retrieve the expected content:
- Question may not match terminology in knowledge
- Content may be too scattered or vague
- May need more (or different) context
- Try rephrasing the question
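If you repeat these checks after every knowledge update, a small script keeps them consistent. The sketch below is deliberately platform-agnostic: `ask` is a hypothetical callable that sends a question to your agent and returns its answer as text (how you obtain that answer, by copy-paste or otherwise, is up to you), and each case lists phrases the answer should contain.

```python
from typing import Callable


def run_retrieval_checks(ask: Callable[[str], str], cases: list[dict]) -> list[dict]:
    """Ask each question and report which expected phrases are missing."""
    results = []
    for case in cases:
        answer = ask(case["question"])
        missing = [
            phrase for phrase in case["expect"]
            if phrase.lower() not in answer.lower()
        ]
        results.append(
            {"question": case["question"], "passed": not missing, "missing": missing}
        )
    return results
```

A failed case is a prompt to try the fixes listed above, such as rephrasing the question or adding more targeted content, rather than proof that the knowledge is absent.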
Troubleshooting Knowledge Issues
Agent isn't using my knowledge
Symptoms: Agent gives generic answers instead of using uploaded content
Possible causes:
- Knowledge still processing (check for status indicator)
- Question doesn’t semantically match content
- System prompt doesn’t encourage knowledge use
- Content is too vague or poorly structured
Solutions:
- Wait for all files to finish processing
- Ask questions more directly related to your content
- Update system prompt: “Always search your knowledge base first”
- Restructure content with clear headings and sections
- Try asking: “What do you know about [topic from your knowledge]?”
Agent retrieves wrong or irrelevant knowledge
Symptoms: Agent cites sources but they’re not relevant to the question
Possible causes:
- Knowledge base has too much content
- Multiple sources with similar but different info
- Content lacks clear topic markers
- Semantic search matching wrong chunks
Solutions:
- Remove less relevant sources
- Add more specific/targeted knowledge
- Use clearer headings in source documents
- Be more specific in questions
- Consider splitting large documents into focused pieces
Upload fails or gets stuck
Symptoms: File upload never completes or shows an error
Possible causes:
- File too large (>25MB limit)
- File format not supported
- File is corrupted or password-protected
- Network connection issue
Solutions:
- Check file size (compress or split if too large)
- Convert to supported format (PDF, DOCX, TXT)
- Remove password protection
- Try uploading again with stable connection
- For large documents, try URL if available online
YouTube video transcript not extracting
Symptoms: Error when adding a YouTube video
Possible causes:
- Video doesn’t have captions/transcripts
- Video is private or age-restricted
- Captions are disabled by creator
- Invalid YouTube URL
Solutions:
- Check if video has captions (watch on YouTube first)
- Use public, unrestricted videos
- Ensure URL is correct YouTube format
- Try a different video if captions unavailable
Google Docs/Sheets not loading
Symptoms: Can’t add Google Workspace content
Possible causes:
- Sharing settings not set to “Anyone with the link”
- Document is private
- Requires authentication to access
- Invalid share link
Solutions:
- Change sharing to “Anyone with the link can view”
- Copy the full sharing URL (should have /edit or /view)
- Make sure document isn’t restricted to your organization
- Test link in incognito browser to verify public access
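The "full sharing URL" check can be partly automated. This sketch only validates the shape of the link, assuming the usual `https://docs.google.com/document/d/<id>/edit` and `.../spreadsheets/d/<id>/view` patterns; it cannot tell whether sharing is actually set to “Anyone with the link”, so the incognito test is still needed. `parse_google_link` is a hypothetical helper.

```python
import re
from typing import Optional
from urllib.parse import urlparse

_PATH = re.compile(r"^/(document|spreadsheets)/d/([A-Za-z0-9_-]+)/(edit|view)$")


def parse_google_link(url: str) -> Optional[dict]:
    """Return kind, id, and mode for a Docs/Sheets sharing link, else None."""
    parts = urlparse(url.strip())
    if parts.scheme != "https" or parts.hostname != "docs.google.com":
        return None
    match = _PATH.match(parts.path)
    if not match:
        return None  # for example, a link cut off before /edit or /view
    kind = "doc" if match.group(1) == "document" else "sheet"
    return {"kind": kind, "id": match.group(2), "mode": match.group(3)}
```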
How do I know which knowledge was used?
Answer: As a builder, when you test your knowledge agent, you can see knowledge retrieval:
- Look for [file search] indicator in responses
- Agent may cite sources in its answer
- Some responses show which documents were referenced
Knowledge Base Limits
Per agent:
- No hard limit on number of sources
- Recommended: 50-100 high-quality sources for best performance
- Each file limited to 25MB
Processing:
- Files process individually (can upload multiple at once)
- URLs are processed on-demand
- Large knowledge bases may have slightly slower retrieval
Storage:
- Knowledge is stored in a vector database
- Counts toward your plan’s storage limits
- Deleted knowledge is removed from storage
Advanced Tips
Create a “master FAQ” document
Instead of uploading 20 separate PDFs, create one well-structured FAQ document with all common questions. Use clear headings like “## Pricing Questions” and “## Feature Questions”. This helps retrieval accuracy.
Use knowledge categories
Name your files descriptively and consider prefixes:
- “[POLICY] Refund Policy.pdf”
- “[GUIDE] Getting Started Guide.pdf”
- “[FAQ] Common Questions.pdf”
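Since there is no folder structure, a prefix convention like the one above is easiest to keep consistent if you can list your files by category. A minimal sketch, assuming prefixes are uppercase words in square brackets followed by a space; `group_by_category` is a hypothetical helper you would run on your own file list.

```python
import re
from collections import defaultdict

PREFIX = re.compile(r"^\[([A-Z]+)\]\s+(.+)$")  # e.g. "[POLICY] Refund Policy.pdf"


def group_by_category(filenames: list[str]) -> dict[str, list[str]]:
    """Group file names by their [PREFIX]; unprefixed names go to UNCATEGORIZED."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in filenames:
        match = PREFIX.match(name)
        category = match.group(1) if match else "UNCATEGORIZED"
        title = match.group(2) if match else name
        groups[category].append(title)
    return dict(groups)
```

Anything that lands in `UNCATEGORIZED` is a candidate for renaming before upload.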
Test with “knowledge audit” questions
After adding knowledge, ask: “What do you know about [topic]?” or “What information do you have about [subject]?” This shows you what the agent can access.
Combine sources for depth
For comprehensive topics, add multiple source types:
- Documentation (files)
- Tutorial video (YouTube)
- FAQ page (URL)
- Expert thread (Twitter)
Keep a knowledge changelog
Track what you’ve added and when. This helps you:
- Remember what’s in the knowledge base
- Know when content was last updated
- Identify gaps in coverage
- Plan future additions
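A changelog can be as simple as a CSV file you append to whenever you add, refresh, or delete a source. The column names below are an assumption, not a format the platform reads; `log_change` is a hypothetical helper that writes a header the first time and one row per change after that.

```python
import csv
from datetime import date
from pathlib import Path
from typing import Optional

FIELDS = ["date", "action", "source_type", "name", "note"]  # assumed columns


def log_change(path: str, action: str, source_type: str, name: str,
               note: str = "", when: Optional[date] = None) -> None:
    """Append one row to the changelog, creating the file with a header if needed."""
    target = Path(path)
    is_new = not target.exists() or target.stat().st_size == 0
    with target.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "date": (when or date.today()).isoformat(),
            "action": action,
            "source_type": source_type,
            "name": name,
            "note": note,
        })
```

Sorting the file by `date` shows which sources have gone longest without an update, which fits the quarterly review suggested earlier.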
Next Steps
Now that you understand the knowledge base system:
- Configure Your Agent: Write system prompts that encourage knowledge use
- Add Tools: Combine knowledge with action-taking capabilities
- Best Practices: Learn optimization strategies for knowledge bases
- Troubleshooting: Solve common knowledge retrieval issues
Remember: Your knowledge base is living and evolving. Start with core content, test with real questions, and continuously refine based on what works. Quality, relevance, and organization matter more than quantity.

