Building a RAG pipeline that answers from real content, not hallucinations
Generic AI chatbots give confident wrong answers. We built a production RAG pipeline with semantic search, BM25 hybrid retrieval, and anti-hallucination guardrails that answers only from what it actually knows.
Client: SaaS platform (MyQRGuide)
Retrieval hit rate (relevant source in top 5)
Questions answered from grounded context
Empty context when data exists
Average RAG confidence score
The challenge
- Hallucinated product names, prices, and URLs that did not exist
- No source grounding: answers were not traceable to actual content
- Generic fallback answers when questions got specific
- No confidence scoring: no way to detect when the AI was guessing
- Pinecone retrieval returning irrelevant chunks with no quality threshold
What we built
- MySQL vector store (guidy_knowledge_chunks) with cosine similarity search, replacing Pinecone for read-heavy workloads
- OpenAI text-embedding-3-large (3072 dimensions) for higher semantic accuracy on domain-specific content
- Sentence-aware chunking (~2,800 chars, 350 overlap) preserving context across chunk boundaries
- Hybrid BM25 + semantic retrieval: dense vectors for paraphrased queries, BM25 for exact product names
- MMR reranking: the top 50 candidates are reranked down to 8-10, balancing relevance against diversity
- Anti-hallucination prompt rules: answer only from context, cite sources, return "not in knowledge base" below 0.45 confidence
- RAG temperature set to 0.1 for near-deterministic factual answers
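The MMR reranking and confidence cutoff listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the production code: the function names, the `lambda_` weight of 0.7, and the choice to treat the best cosine similarity as the "confidence" score are all assumptions made for the example. Only the 0.45 threshold and the 50-to-8 reranking shape come from the list above.

```python
import math

CONFIDENCE_THRESHOLD = 0.45  # cutoff quoted in the prompt rules above


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_rerank(query_vec, candidates, k=8, lambda_=0.7):
    """Greedy Maximal Marginal Relevance selection.

    candidates: list of (chunk_id, vector) pairs, e.g. the top 50 from retrieval.
    lambda_: 1.0 means pure relevance, 0.0 means pure diversity
             (0.7 is an assumed default, not a value from the case study).
    Returns up to k (chunk_id, relevance) pairs in selection order.
    """
    vectors = dict(candidates)
    relevance = {cid: cosine(query_vec, vec) for cid, vec in candidates}
    remaining = [cid for cid, _ in candidates]
    selected = []

    def mmr_score(cid):
        # Penalize chunks that closely resemble something already selected.
        redundancy = max(
            (cosine(vectors[cid], vectors[s]) for s in selected), default=0.0
        )
        return lambda_ * relevance[cid] - (1 - lambda_) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [(cid, relevance[cid]) for cid in selected]


def grounded_context(reranked, threshold=CONFIDENCE_THRESHOLD):
    """Return chunk ids to answer from, or None to trigger the
    "not in knowledge base" reply.

    Assumption: confidence is the best relevance score among the chunks.
    """
    if not reranked:
        return None
    confidence = max(score for _, score in reranked)
    if confidence < threshold:
        return None
    return [cid for cid, _ in reranked]
```

With `lambda_=1.0` the function reduces to a plain relevance sort. Lowering it pushes near-duplicate chunks out of the final set, which is the point of reranking 50 candidates down to a handful.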
The results
Retrieval hit rate (relevant source in top 5)
RAG path vs general fallback
Empty context when data exists
Average RAG confidence score
All four production metrics met their targets. The hallucination rate dropped meaningfully against the baseline generic AI implementation.
The outcome
Guidy went from a generic chatbot that hallucinated confidently to a grounded knowledge assistant that answers accurately from real content and says "I don't know" when the answer is not there. The pipeline architecture (chunking, embedding, MySQL vector store, and hybrid retrieval) is now a repeatable internal capability that can be deployed for healthcare document Q&A, enterprise knowledge bases, and e-commerce product assistants.
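As an illustration of the chunking stage named above, here is a minimal sentence-aware chunker in Python. It is a sketch under stated assumptions rather than the code that runs in production: the function name `chunk_text`, the regex-based sentence splitter, and the reading of "350 overlap" as 350 characters are all assumptions.

```python
import re


def chunk_text(text, max_chars=2800, overlap=350):
    """Split text into chunks of roughly max_chars, breaking only at sentence ends.

    Each new chunk begins with the trailing sentences of the previous chunk
    (as many as fit in `overlap` characters) so context carries across the
    boundary. A single sentence longer than max_chars is kept whole, so
    max_chars is a target rather than a hard limit.
    """
    # Naive splitter: break after ., ! or ? followed by whitespace (an assumption).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], []
    for sentence in sentences:
        if current and len(" ".join(current + [sentence])) > max_chars:
            chunks.append(" ".join(current))
            # Carry over trailing sentences that fit within the overlap budget.
            carried, size = [], 0
            for prev in reversed(current):
                if size + len(prev) + 1 > overlap:
                    break
                carried.insert(0, prev)
                size += len(prev) + 1
            current = carried
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Splitting on sentence boundaries rather than at a fixed character offset keeps each embedded chunk readable on its own, and the overlap means a fact that straddles a boundary still appears complete in at least one chunk.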
Related work
MyQRGuide: Post-Sale Customer Platform