Most RAG tutorials assume you have a team of ML engineers and a $50K infrastructure budget. You don't. You need your company's knowledge searchable by AI, you've got a modest budget, and maybe one technical person (or you're that person). Here's how to build a production RAG pipeline using tools that don't require distributed systems expertise.

This tutorial covers ingesting documents from Google Drive, chunking them intelligently, generating embeddings, storing them in a vector database, and querying with natural language - all without writing complex backend code.
What you're building
A system where someone asks "What's our return policy for enterprise customers?" and gets an accurate answer based on your actual policy documents, not generic AI responses.

The pipeline: Google Drive documents → n8n workflow → text chunking → OpenAI embeddings → Supabase vector storage → query interface → GPT-4 with retrieved context → accurate answer.

Cost: $50-150 per month for moderate usage. One person can build and maintain this.
Why this stack works for small teams
n8n: Visual workflow builder. No coding required for most steps. Self-host for free or use the cloud version for $20 per month. Handles scheduling, error handling, and orchestration.

Supabase: PostgreSQL with the vector extension. A familiar database plus vector search. The free tier handles 500MB; paid tiers start at $25 per month. Way cheaper than Pinecone or Weaviate.

OpenAI APIs: Embeddings and GPT-4, pay per use. Text-embedding-3-small costs $0.0001 per 1K tokens. GPT-4-turbo costs $0.01 per 1K input tokens.

Google Drive: Where your documents already live. Native integration with n8n, so no data migration is needed.

Total: Free to start (Supabase free tier, self-hosted n8n), or $50-150 per month for production use.
Part 1: Setting up infrastructure
Get your accounts ready
Supabase: Create a free project at supabase.com. Note your project URL and service role key from Settings → API.

OpenAI: Get an API key from platform.openai.com. Add $10 of credit to start.

n8n: Either self-host using Docker (free) or use n8n Cloud (free trial, then $20 per month). We'll assume n8n Cloud for simplicity.

Google Drive: You already have this. You just need to authorize n8n to access it.
Configure Supabase for vector search
In the Supabase SQL Editor, enable the vector extension, then create a documents table with id, source, content, metadata as JSON, embedding as a 1536-dimension vector, and a created_at timestamp.

Next, create a function called match_documents that takes query_embedding as a vector, match_threshold as a float, and match_count as an integer. The function returns id, source, content, metadata, and a similarity score, filtering by the similarity threshold and ordering by relevance.

Finally, add an index on the embedding column using ivfflat with vector_cosine_ops and a lists parameter of 100.

This gives you a vector database that can handle hundreds of thousands of documents efficiently.
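A sketch of that SQL follows. The table, function, and column names match the prose above; the similarity expression uses pgvector's cosine distance operator, and the lists value is a starting point you should tune to your row count.

```sql
-- Enable pgvector
create extension if not exists vector;

-- Documents table: one row per chunk
create table documents (
  id bigserial primary key,
  source text,
  content text,
  metadata jsonb,
  embedding vector(1536),
  created_at timestamptz default now()
);

-- Similarity search function, called via the Supabase RPC endpoint
create or replace function match_documents(
  query_embedding vector(1536),
  match_threshold float,
  match_count int
)
returns table (id bigint, source text, content text, metadata jsonb, similarity float)
language sql stable
as $$
  select d.id, d.source, d.content, d.metadata,
         1 - (d.embedding <=> query_embedding) as similarity
  from documents d
  where 1 - (d.embedding <=> query_embedding) > match_threshold
  order by d.embedding <=> query_embedding
  limit match_count;
$$;

-- Approximate index for fast search; tune lists as the table grows
create index on documents using ivfflat (embedding vector_cosine_ops)
  with (lists = 100);
```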
Part 2: Document ingestion workflow
This n8n workflow pulls documents from Google Drive, processes them, and stores embeddings in Supabase.
Workflow overview
The workflow has nine steps across eight nodes (cleaning and chunking share one Code node): schedule trigger (runs daily), Google Drive file list, filter for supported formats, download file content, clean and chunk text, generate embeddings, format for Supabase, and store in database.
Node 1: Schedule trigger
Add a Schedule Trigger node set to run daily at 2am (outside working hours). This keeps your knowledge base up to date as documents change.

For initial testing, use a Manual Trigger to run on demand.
Node 2: Google Drive - List files
Add a Google Drive node with the action "Get Many Files". Connect your Google account when prompted, and configure it to search the specific folder where your knowledge base documents live.

Set a filter to only fetch files modified in the last 24 hours (for daily updates). For the initial ingestion, remove this filter.
Node 3: Filter supported formats
Add an IF node to filter for document types you can process. Supported: Google Docs, Microsoft Word, PDF, plain text. Skip: images, spreadsheets (they need different handling), and videos.

Check whether mimeType contains "document", "pdf", or "text".
Node 4: Google Drive - Download file
Add another Google Drive node with the action "Download File". Set File ID to reference the file from the previous node. This gets the actual content.

Google Docs can be exported as plain text. PDFs need text extraction (we'll cover this).
Node 5: Clean and extract text
Add a Code node that takes the file content and extracts clean text. For Google Docs and text files this is straightforward; for PDFs you might need a library like pdf-parse (which n8n Cloud supports).

The code should remove extra whitespace, normalize line breaks, strip formatting codes, and handle encoding issues.
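A minimal cleanup function for the Code node might look like this. It covers the cases listed above for plain text; how the raw content reaches the node depends on your Download node's output, so treat the input field as an assumption.

```javascript
// Minimal text cleanup for an n8n Code node (a sketch, not exhaustive).
function cleanText(raw) {
  return raw
    .replace(/\r\n/g, '\n')       // normalize Windows line breaks
    .replace(/[ \t]+/g, ' ')      // collapse runs of spaces and tabs
    .replace(/\n{3,}/g, '\n\n')   // collapse runs of blank lines
    .replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F]/g, '') // strip control chars
    .trim();
}
```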
Node 6: Chunk the text
Still in the Code node, chunk the cleaned text into 400-word segments with a 50-word overlap. This keeps each chunk meaningful while fitting within embedding context limits.

Return an array where each item contains the chunk text, source document name, chunk index, and original metadata.
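The overlapping chunker can be sketched as follows. The 400/50 numbers come from the text above; the commented n8n return shape at the end is an assumption about your item structure.

```javascript
// Word-based chunking with overlap for an n8n Code node (a sketch).
function chunkText(text, chunkSize = 400, overlap = 50) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  const step = chunkSize - overlap; // advance 350 words per chunk
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(' '));
    if (start + chunkSize >= words.length) break; // final chunk reached
  }
  return chunks;
}

// In n8n you would return one item per chunk, e.g.:
// return chunkText(cleanedText).map((content, i) => ({
//   json: { content, source: fileName, chunk_index: i }
// }));
```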
Node 7: Generate embeddings
Add an HTTP Request node calling the OpenAI embeddings API: method POST, URL set to the embeddings endpoint, authenticated with an Authorization: Bearer header containing your API key.

The body includes the chunk content and the model text-embedding-3-small. Send all chunks in one call (batching is more efficient than one request per chunk).

The response contains an array of embeddings matching your input chunks.
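If you'd rather make the call from a Code node than an HTTP Request node, a sketch looks like this. The endpoint and model name are OpenAI's real ones; passing the key as a function argument is an assumption (in n8n you'd normally use a stored credential).

```javascript
// Build the batched embeddings request body.
function buildEmbeddingRequest(chunks) {
  return {
    url: 'https://api.openai.com/v1/embeddings',
    // Passing an array as `input` embeds every chunk in one request.
    body: { model: 'text-embedding-3-small', input: chunks },
  };
}

// Send it and return one embedding per input chunk, in order.
async function embedChunks(chunks, apiKey) {
  const { url, body } = buildEmbeddingRequest(chunks);
  const res = await fetch(url, {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`OpenAI error ${res.status}`);
  const { data } = await res.json();
  return data.map(d => d.embedding); // data[i] lines up with chunks[i]
}
```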
Node 8: Format for Supabase
Add a Code node to combine chunks with their embeddings. Map through the results, pairing each chunk with its corresponding embedding vector.

Output format: source (document name), content (chunk text), metadata (JSON with chunk index, document type, and date), and embedding (the vector array).
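The pairing step can be sketched as below. Column names match the documents table from Part 1; the metadata fields are examples, not a fixed schema.

```javascript
// Pair each chunk with its embedding into a Supabase-ready row.
function toRows(chunks, embeddings, sourceName) {
  return chunks.map((content, i) => ({
    source: sourceName,
    content,
    metadata: { chunk_index: i, ingested_at: new Date().toISOString() },
    embedding: embeddings[i], // order matches the embeddings API response
  }));
}
```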
Node 9: Store in Supabase
Add an HTTP Request node calling the Supabase REST API: method POST, URL set to your Supabase project URL followed by /rest/v1/documents.

Headers: apikey with your service role key, Authorization: Bearer with the same key, Content-Type: application/json, and Prefer: return=minimal.

The body contains the formatted data from the previous node.

Activate the workflow and run it. Check Supabase - you should see documents with embeddings stored.
Part 3: Query workflow
This workflow takes a natural language question and returns an answer based on your documents.
Workflow overview
Six nodes: Webhook trigger (receives question), Generate query embedding, Search Supabase vectors, Format retrieved context, Call GPT-4 with context, and Return answer.
Node 1: Webhook
Add a Webhook node with the path /ask-question and method POST. It expects a JSON body with a question field.

This creates an endpoint you can call from your app, Slack bot, or website.
Node 2: Generate query embedding
HTTP Request node calling the OpenAI embeddings API - the same configuration as in ingestion, but the input is the user's question from the webhook body.

This converts "What's our return policy?" into the same vector space as your documents.
Node 3: Search Supabase
HTTP Request node calling your match_documents function via the Supabase RPC endpoint. The URL is your project URL followed by /rest/v1/rpc/match_documents.

The body contains query_embedding (the vector from the previous node), a match_threshold of 0.5, and a match_count of 5.

This returns the 5 most relevant document chunks.
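The request can be sketched as follows. The parameter names match the match_documents function from Part 1; the threshold and count values are the starting points suggested in this tutorial, not fixed requirements.

```javascript
// Build the vector-search request against the match_documents RPC.
function buildSearchRequest(supabaseUrl, queryEmbedding) {
  return {
    url: `${supabaseUrl}/rest/v1/rpc/match_documents`,
    body: {
      query_embedding: queryEmbedding,
      match_threshold: 0.5, // lower for more recall, raise for precision
      match_count: 5,       // number of chunks to retrieve
    },
  };
}
```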
Node 4: Format context
Code node that takes the search results and formats them as context for GPT-4. Combine the content of all retrieved chunks, separated by line breaks, and include source document names.

Also extract the unique sources so you can cite them in the response.
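A sketch of that formatting step, assuming each match has the source and content fields returned by match_documents:

```javascript
// Combine retrieved chunks into one context string and collect
// unique source names for citations.
function formatContext(matches) {
  const context = matches
    .map(m => `[${m.source}]\n${m.content}`)
    .join('\n\n');
  const sources = [...new Set(matches.map(m => m.source))];
  return { context, sources };
}
```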
Node 5: Call GPT-4
HTTP Request node calling the OpenAI chat completions API. Model: gpt-4-turbo-preview. Temperature: 0.3 for consistent, factual responses.

System message: "You are a helpful assistant answering questions about company policies and procedures. Base your answers on the provided context. If the context doesn't contain relevant information, say so. Always cite which documents you're referencing."

User message: "Context from company documents: [insert formatted context]. Question: [insert original question]."

This generates an answer grounded in your actual documents.
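The payload described above can be sketched as a small builder. The system prompt, model, and temperature are taken directly from the text; only the exact interpolation format is an assumption.

```javascript
// Build the chat-completions request body with retrieved context.
function buildChatRequest(context, question) {
  return {
    model: 'gpt-4-turbo-preview',
    temperature: 0.3, // low temperature for consistent factual answers
    messages: [
      {
        role: 'system',
        content:
          "You are a helpful assistant answering questions about company policies and procedures. " +
          "Base your answers on the provided context. If the context doesn't contain relevant " +
          "information, say so. Always cite which documents you're referencing.",
      },
      {
        role: 'user',
        content: `Context from company documents:\n${context}\n\nQuestion: ${question}`,
      },
    ],
  };
}
```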
Node 6: Return response
Code node that formats the response with the answer, cited sources, and confidence level (based on similarity scores). Return this via the webhook response.
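One way to sketch that final node is below. The confidence thresholds (0.8 / 0.65) are arbitrary starting points, not values from this tutorial - calibrate them against your own similarity scores.

```javascript
// Shape the webhook response: answer, cited sources, and a rough
// confidence level derived from average similarity.
function buildResponse(answer, matches) {
  const sources = [...new Set(matches.map(m => m.source))];
  const avg = matches.length
    ? matches.reduce((sum, m) => sum + m.similarity, 0) / matches.length
    : 0;
  const confidence = avg > 0.8 ? 'high' : avg > 0.65 ? 'medium' : 'low';
  return { answer, sources, confidence };
}
```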
Part 4: Making it production-ready
The basic pipeline works but needs polish for real use.
Handle PDF text extraction properly
PDFs need special handling. Add a Switch node after the download step that routes PDFs to a dedicated extraction step using a library like pdf-parse or an external service like the Adobe PDF Extract API.

For scanned PDFs (images of text), you need OCR. Consider Google Cloud Vision API or AWS Textract via HTTP Request nodes.
Implement incremental updates
Instead of reprocessing everything daily, track what's changed. Store a last_updated timestamp in Supabase and only process documents modified since that timestamp.

This dramatically reduces embedding costs and processing time.
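The filter itself is small. modifiedTime mirrors the field name Google Drive returns for files; the stored cutoff timestamp is whatever you keep in Supabase.

```javascript
// Keep only files changed since the stored high-water mark (a sketch).
function changedSince(files, lastUpdatedIso) {
  const cutoff = new Date(lastUpdatedIso).getTime();
  return files.filter(f => new Date(f.modifiedTime).getTime() > cutoff);
}
```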
Add metadata filtering
Enhance your match_documents function to filter by metadata. This lets you search within specific document types, date ranges, or departments.

Useful for multi-tenant systems or for restricting search scope based on user permissions.
Implement error handling
Add Error Trigger nodes that catch failures and send notifications via email or Slack. Common failures: API rate limits, malformed documents, and network timeouts.

Build in retry logic with exponential backoff for transient errors.
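Exponential backoff can be sketched as a small wrapper you can drop into a Code node around any API call. The attempt count and base delay are assumptions to tune.

```javascript
// Retry an async operation with exponential backoff: 500ms, 1s, 2s...
async function withRetry(fn, maxAttempts = 4, baseDelayMs = 500) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === maxAttempts) throw err; // out of retries
      const delay = baseDelayMs * 2 ** (attempt - 1);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}
```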
Set up monitoring
Log all queries to a separate table: the question asked, documents retrieved, answer generated, and response time. This helps you identify gaps in the knowledge base and improve the system.

Track embedding costs daily to avoid surprises, and set up alerts if costs exceed your thresholds.
Create a simple frontend
Build a basic web interface using HTML and JavaScript that calls your webhook: a text input for questions, and a display area for answers with source citations.

Host it on Vercel or Netlify for free. No complex framework needed.
Part 5: Optimizing for cost and quality
Reduce embedding costs
Use text-embedding-3-small instead of the larger models unless quality suffers. Batch embed multiple chunks in single API calls (cheaper than individual calls). Cache embeddings for documents that don't change.
Improve answer quality
If answers are generic or miss key information, try increasing match_count from 5 to 8-10 chunks (more context). Adjust match_threshold - lower it if you're missing relevant results, raise it if you're getting irrelevant ones. Experiment with chunk size - smaller chunks (300 words) are more precise, larger chunks (600 words) preserve more context.
Handle different document types
Different sources need different handling. For Google Sheets, convert to CSV then process rows as individual chunks. For presentation slides, extract text per slide and treat each slide as a chunk. For images with text, use OCR then process like documents. For code repositories, treat each file as a document with language-specific chunking.
Implement hybrid search
Pure vector search sometimes misses exact keyword matches. Combine it with PostgreSQL full-text search for better recall. Search using both methods, merge results, and deduplicate.
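The merge-and-deduplicate step might look like this. It assumes both result sets have first been normalized to a common { id, content, score } shape - full-text rank and cosine similarity aren't directly comparable without that normalization step.

```javascript
// Merge vector and full-text hits, deduplicating by row id and
// keeping the best score seen for each document.
function mergeResults(vectorHits, textHits) {
  const byId = new Map();
  for (const hit of [...vectorHits, ...textHits]) {
    const existing = byId.get(hit.id);
    if (!existing || hit.score > existing.score) byId.set(hit.id, hit);
  }
  return [...byId.values()].sort((a, b) => b.score - a.score);
}
```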
Real-world example: legal knowledge base
We built this for a Melbourne law firm with 200 precedent documents, case notes, and internal memos.

Setup time: 3 days (1 day for infrastructure, 2 days for workflow building and testing).

Documents: 180 Word documents, 20 PDFs. 2.5M words in total.

Embedding cost: $2.50 for the initial ingestion, then $0.30 per month for updates.

Query volume: 50-100 queries per day from 8 lawyers.

Monthly cost: $45 (Supabase Pro, OpenAI API, n8n Cloud).

Results: Lawyers find relevant precedents in seconds instead of 15-30 minutes of searching. Accuracy is 85-90% (they validate answers before using them). It's not perfect, but it's faster than manual search and gets smarter as they add more documents.
Common mistakes to avoid
Chunking too large: Chunks over 600 words dilute relevance. You retrieve documents that contain the answer somewhere but bury it in noise.

Not cleaning text: If you don't strip headers, footers, and formatting artifacts, your embeddings are polluted with garbage.

Ignoring metadata: Storing just content, without the source document name, date, or category, makes debugging impossible and limits filtering options.

No monitoring: You won't know whether it's working without logging queries and reviewing results. Assume nothing, validate everything.

Treating embeddings as permanent: Documents change, so your embeddings need to update when documents update. Build this in from day one.
When to upgrade beyond this stack
This architecture comfortably handles up to roughly 50,000 documents and 1,000 queries per day.

Upgrade signals: query latency consistently exceeds 3-5 seconds; Supabase hits storage or compute limits; you need real-time updates (the current approach is batch); you're spending more than $500 per month on APIs and want to optimize; or you need advanced features like semantic caching or query decomposition.

At that point, consider moving to a dedicated vector database like Qdrant or Weaviate, implementing custom embedding models, building a proper application layer with caching, or hiring engineering help.

But honestly, most businesses never hit these limits. The n8n plus Supabase stack scales further than you think.
Getting help when stuck
n8n community forum: active community, lots of RAG examples, responsive help.

Supabase Discord: great for vector search questions and pgvector optimization.

OpenAI documentation: comprehensive guides on embeddings and best practices.

Stack Overflow: search for "pgvector" and "RAG" - most questions have already been answered.

If you're genuinely stuck, find a consultant who's actually built RAG systems (not just read about them). A few hours of expert help beats days of trial and error.
Building a RAG system for your business data and want experienced help? We've built dozens of these systems and can guide you through implementation or build it for you.
[Talk to us about RAG implementation]
About ThinkSwift
We're a creative software agency in Melbourne building AI-powered knowledge systems for Australian businesses. We use the n8n plus Supabase stack for most RAG implementations because it's maintainable by small teams, costs 10x less than enterprise solutions, and actually works in production. This tutorial reflects what we actually build for clients.


