AI Engineering · RAG · Knowledge Retrieval

Building a RAG pipeline that answers from real content, not hallucinations

Generic AI chatbots give confident wrong answers. We built a production RAG pipeline with semantic search, BM25 hybrid retrieval, and anti-hallucination guardrails that answers only from what it actually knows.

Client: SaaS platform (MyQRGuide)

Stack: Laravel · PHP · MySQL · OpenAI text-embedding-3-large · GPT-4 · BM25 · Pinecone

  • Retrieval hit rate (relevant source in top 5): >80%
  • Questions answered from grounded context: >70%
  • Empty context when data exists: <5%
  • Average RAG confidence score: >0.50

The challenge

  • Hallucinated product names, prices, and URLs that did not exist
  • No source grounding: answers were not traceable to actual content
  • Generic fallback answers when questions got specific
  • No confidence scoring: no way to detect when AI was guessing
  • Pinecone retrieval returning irrelevant chunks with no quality threshold

What we built

  • MySQL vector store (guidy_knowledge_chunks) with cosine similarity search, replacing Pinecone for read-heavy workloads
  • OpenAI text-embedding-3-large (3072 dimensions) for higher semantic accuracy on domain-specific content
  • Sentence-aware chunking (~2,800 chars, 350 overlap) preserving context across chunk boundaries
  • Hybrid BM25 + semantic retrieval: dense vectors for paraphrased queries, BM25 for exact product names
  • MMR (maximal marginal relevance) reranking: the top 50 candidates are reranked down to 8-10, trading off relevance against diversity
  • Anti-hallucination prompt rules: answer only from context, cite sources, return "not in knowledge base" below 0.45 confidence
  • RAG temperature set to 0.1 for near-deterministic factual answers
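
The chunking step above can be sketched roughly as follows. This is a minimal illustration in Python, not the production Laravel/PHP code: the function name chunk_text, the regex sentence splitter, and the way trailing sentences are carried forward are all assumptions. Only the ~2,800-character chunk size and the 350-character overlap come from the pipeline described above.

```python
import re


def chunk_text(text, max_chars=2800, overlap=350):
    """Split text into chunks of roughly max_chars, breaking only between
    sentences and repeating up to `overlap` trailing characters (whole
    sentences) at the start of the next chunk.

    Assumption: sentences end in '.', '!' or '?' followed by whitespace.
    A single sentence longer than max_chars becomes its own oversized chunk.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks = []
    current = []      # sentences in the chunk being built
    current_len = 0   # length of " ".join(current)

    for sentence in sentences:
        added = len(sentence) + (1 if current else 0)
        if current and current_len + added > max_chars:
            chunks.append(" ".join(current))
            # Carry whole trailing sentences forward as overlap.
            carried = []
            carried_len = 0
            for prev in reversed(current):
                extra = len(prev) + (1 if carried else 0)
                if carried_len + extra > overlap:
                    break
                carried.insert(0, prev)
                carried_len += extra
            current = carried
            current_len = carried_len
            added = len(sentence) + (1 if current else 0)
        current.append(sentence)
        current_len += added

    if current:
        chunks.append(" ".join(current))
    return chunks
```

Carrying whole sentences rather than a raw character window means a chunk never starts mid-sentence, at the cost of an overlap that is usually somewhat shorter than 350 characters.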

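The retrieval stages listed above (hybrid BM25 + semantic scoring, MMR reranking, and the 0.45 confidence cutoff) can be sketched in plain Python as shown here. This is a hedged illustration, not the production implementation. Assumptions that are not in the case study: all function names, the BM25 parameters k1 and b, min-max score fusion with weight alpha, the MMR weight lam, and the definition of confidence as the best cosine similarity among the selected chunks. Embeddings are passed in as plain lists of floats, so no embedding API is called.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document for the query (exact-term signal)."""
    tokenized = [tokenize(doc) for doc in docs]
    n = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n if n else 0.0
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = 1 - b + b * len(toks) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so dense and BM25 scores are comparable."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_scores(dense, lexical, alpha=0.6):
    """Weighted sum of normalized dense and BM25 scores (alpha is assumed)."""
    return [alpha * d + (1 - alpha) * s
            for d, s in zip(min_max(dense), min_max(lexical))]


def mmr_rerank(relevance, vectors, k=8, lam=0.7):
    """Pick up to k candidate indices by maximal marginal relevance:
    each step takes the candidate with the best
    lam * relevance - (1 - lam) * (max similarity to already selected)."""
    remaining = list(range(len(vectors)))
    selected = []
    while remaining and len(selected) < k:
        def marginal(i):
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(remaining, key=marginal)
        selected.append(best)
        remaining.remove(best)
    return selected


def confidence_gate(dense_scores, selected, threshold=0.45):
    """ASSUMPTION: confidence = best cosine similarity among selected chunks.
    Returns (confidence, grounded); grounded is False below the threshold,
    in which case the assistant should reply "not in knowledge base"."""
    confidence = max((dense_scores[i] for i in selected), default=0.0)
    return confidence, confidence >= threshold
```

In this sketch the 50 highest hybrid scores would be passed to mmr_rerank with k set between 8 and 10, and the language model would be called only when confidence_gate reports the context as grounded.
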
The results

  • Retrieval hit rate (relevant source in top 5): >80%
  • RAG path vs. general fallback: >70%
  • Empty context when data exists: <5%
  • Average RAG confidence score: >0.50

All four production targets were met. The hallucination rate was meaningfully lower than with the baseline generic AI implementation.
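
For readers who want to reproduce the first metric, retrieval hit rate in the top 5 can be computed along these lines. This is a minimal sketch: the function name, the data layout (ranked source IDs per question plus a hand-labeled set of relevant IDs), and the toy values in the usage note are assumptions, not the evaluation harness used on this project.

```python
def hit_rate_at_k(ranked_results, relevant_sources, k=5):
    """Fraction of questions with at least one relevant source in the top k.

    ranked_results:   dict mapping question -> list of source ids, best first
    relevant_sources: dict mapping question -> set of ids judged relevant
    """
    if not ranked_results:
        return 0.0
    hits = 0
    for question, ranked in ranked_results.items():
        relevant = relevant_sources.get(question, set())
        if relevant & set(ranked[:k]):
            hits += 1
    return hits / len(ranked_results)
```

With two toy questions, one whose relevant source is ranked third and one whose relevant source is ranked sixth, the function returns 0.5, because only the first counts as a hit at k=5.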

The outcome

Guidy went from a generic chatbot that hallucinated confidently to a grounded knowledge assistant that answers from real content and says "I don't know" when the answer is not in its knowledge base. The pipeline architecture (chunking, embedding, MySQL vector store, and hybrid retrieval) is now a repeatable internal capability that can be deployed for healthcare document Q&A, enterprise knowledge bases, and e-commerce product assistants.

