Use RAG when facts change and citations matter
If your users need answers grounded in manuals, policies, tickets, or contracts, RAG is the default in 2026 because freshness and provenance beat memorization. The work is not “vector database in a diagram.” The work is chunking, access control at retrieval time, re-ranking, evaluation sets grounded in real operator questions, and policies for refusal when evidence is weak.
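The retrieval-time concerns above can be sketched in a few lines. This is a minimal illustration, not a real pipeline: `Passage`, its `acl` field, the score threshold, and the group names are all hypothetical stand-ins for whatever your retriever and re-ranker actually return.

```python
from dataclasses import dataclass, field

@dataclass
class Passage:
    text: str
    score: float                 # re-ranker relevance, assumed normalized to 0..1
    acl: set = field(default_factory=set)  # groups allowed to read this passage

def retrieve_for_user(passages, user_groups, min_score=0.6, k=3):
    """Apply access control at retrieval time, then refuse when evidence is weak."""
    # Filter BEFORE ranking: the user never sees passages they cannot read.
    visible = [p for p in passages if p.acl & user_groups]
    visible.sort(key=lambda p: p.score, reverse=True)
    top = visible[:k]
    # Refusal policy: no passages, or best passage below threshold -> no answer.
    if not top or top[0].score < min_score:
        return None  # caller should say "not enough evidence" instead of guessing
    return top
```

The point of the `None` branch is the policy the paragraph names: a grounded system must be allowed to decline rather than compose an answer over thin evidence.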
Use fine-tuning when you need a stable behavioral prior
Fine-tuning still earns budget when you have high-quality labeled pairs that teach tone, format, or a narrow classification boundary, and when those patterns do not change weekly. It is weaker as a substitute for a living document corpus. The model cannot cite what it memorized, and updates require retraining discipline.
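What "high-quality labeled pairs" look like in practice is often just a JSONL file of prompt/completion records. A minimal sketch, assuming a generic prompt/completion schema; the field names and the triage labels here are hypothetical, so match whatever schema your training stack actually expects.

```python
import json

# Hypothetical labeled pairs teaching a narrow, stable classification boundary.
pairs = [
    {"prompt": "Customer cannot reset password via email link.",
     "completion": "category: account_access"},
    {"prompt": "Invoice shows a duplicate charge for March.",
     "completion": "category: billing"},
]

# One JSON object per line: the common JSONL shape for supervised pairs.
jsonl = "\n".join(json.dumps(p) for p in pairs)
```

Note what is absent: no document text, no citations. The tuned model learns the boundary, not the evidence, which is exactly why it cannot cite what it memorized.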
Use agents when actions are part of the product
Agents matter when the system must update tickets, issue refunds within policy, query internal APIs, or orchestrate multi-step workflows with retries and compensating steps. Agents are not “RAG plus confidence.” They are policy-bound tool use with explicit failure handling. If your workflow is mostly read-only Q&A, an agent buys complexity you do not need.
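"Policy-bound tool use with explicit failure handling" can be made concrete with a small sketch. Everything here is illustrative: the allow-list, the refund cap, and the helper names are hypothetical, standing in for your real policy matrix and APIs.

```python
REFUND_CAP = 100.00                      # hypothetical policy limit
ALLOWED_TOOLS = {"update_ticket", "issue_refund"}  # explicit tool permissions

class PolicyViolation(Exception):
    pass

def run_tool(name, call, compensate=None, retries=2):
    """Execute a tool only if permitted, with retries and a compensating step."""
    if name not in ALLOWED_TOOLS:
        raise PolicyViolation(f"tool {name!r} not permitted")
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception:
            if attempt == retries:
                if compensate:
                    compensate()  # undo partial effects before surfacing failure
                raise

def issue_refund(amount, api_call):
    """Policy check happens before the tool call, not after."""
    if amount > REFUND_CAP:
        raise PolicyViolation("refund exceeds policy cap; escalate to a human")
    return run_tool("issue_refund", lambda: api_call(amount))
```

The structure is the argument: permissions and caps are code-level gates in front of every write, not a confidence score bolted onto a Q&A loop.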
Composition patterns that actually ship
- RAG for evidence + deterministic rules for calculations and invariants.
- Fine-tuned classifiers for cheap triage + RAG for answer composition on the surviving slice.
- Agents with narrow tool matrices + RAG for knowledge retrieval + humans for high-stakes writes.
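The second pattern above, cheap triage in front of retrieval, can be sketched as follows. The keyword classifier is a deliberate stand-in for a fine-tuned model, and the `retrieve`/`compose` callables and label names are hypothetical placeholders for your RAG layer.

```python
def triage(question):
    """Stand-in for a cheap fine-tuned classifier with stable labels."""
    q = question.lower()
    if "invoice" in q or "charge" in q:
        return "billing"
    if "password" in q or "login" in q:
        return "account"
    return "out_of_scope"

def handle(question, retrieve, compose):
    """Triage first; only the surviving slice pays for retrieval + composition."""
    label = triage(question)
    if label == "out_of_scope":
        return "Out of scope for this assistant."
    passages = retrieve(question, label)   # RAG, scoped to the triaged domain
    return compose(question, passages)     # answer composition over evidence
```

The economics are the point: the expensive retrieval-and-compose path runs only on the slice the classifier lets through.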
Opinion: ignore vendor diagrams that show everything connected to everything
The winning architectures we see in 2026 are boring on purpose: explicit states, explicit tool permissions, explicit eval gates. The flashy demo is a single graph node labeled “AI” with arrows to sixteen systems. Production is smaller graphs with logs.
How to decide in one workshop
Ask: what is the user decision at the end? Approve, deny, post, route, or only understand? If it is only understanding, start with RAG. If it is classification with stable labels, consider fine-tuning. If it is action under constraints, design the agent policy matrix first, then decide whether retrieval is even in scope.
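The workshop question reduces to a small decision table. A sketch only, and a rule of thumb rather than gospel; the category strings are hypothetical labels for the decisions named above.

```python
def recommend(decision_kind, labels_stable=False):
    """Map the end-user decision to a starting architecture (rule of thumb)."""
    if decision_kind == "understand":
        return "RAG"
    if decision_kind == "classify" and labels_stable:
        return "fine-tuning"
    if decision_kind in {"approve", "deny", "post", "route"}:
        return "agent: design the policy matrix first"
    return "unclear: run a POC"
```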
Proof beats philosophy
If you are stuck in a debate, run a Rapid POC that implements two architectures on the same acceptance tests. The spreadsheet beats the whiteboard. We routinely recommend the simpler architecture when it clears the bar, because simpler is what your team will operate at 2 a.m. on a Sunday.
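The "same acceptance tests" discipline is the whole trick, and it fits in a few lines. A minimal sketch under one assumption: candidates are ordered simplest first, so the first one to clear the bar wins. The function names and the bar value are illustrative.

```python
def score(architecture, cases):
    """Run one candidate against shared acceptance cases; every candidate sees the same cases."""
    passed = sum(1 for question, expected in cases if architecture(question) == expected)
    return passed / len(cases)

def pick(candidates, cases, bar=0.9):
    """Prefer the simplest candidate that clears the bar (candidates ordered simple -> complex)."""
    for name, architecture in candidates:
        if score(architecture, cases) >= bar:
            return name
    return "none cleared the bar"
```

Ordering the candidates simplest-first encodes the recommendation in the paragraph: when the simpler architecture clears the bar, it wins by construction.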