Price: $1499.00
Rating: 0.0/5 (0 reviews)
Sold by: Adam Smith
Category: E-books
Most RAG systems that fail in production don't fail because of the model. They fail because of chunking decisions made in a notebook, retrieval pipelines that looked fine on twenty documents, or indexes that quietly go stale. Building with Retrieval works through the full stack: embedding models, vector stores, hybrid search, reranking, and the prompt patterns that keep answers grounded in what was actually retrieved. Marcus Hale draws on real deployment decisions: comparing costs in dollars at production query volumes across Pinecone, pgvector, and Qdrant, weighing LlamaIndex against LangChain for orchestration, and treating provenance as a first-class concern through the Anthropic citations API. The book also tells you, with specifics, when RAG is the wrong solution entirely.
This book is for backend and ML engineers who have a RAG prototype working and are now facing the problems tutorials skip: retrieval precision at scale, keeping the index current, measuring whether the system is actually good, and deciding when long context or fine-tuning would serve better than retrieval.