# Snippet Size: how much of each source the model gets to see

> Snippet Size controls how much of each source paragraph is forwarded as context — balancing completeness against token efficiency.

**Category:** Retrieval Grounding
**Author:** NeuralSeek Team · **Published:** June 9, 2026
**Canonical:** https://neuralseek.ai/ai-grounded/snippet-size
**Section index:** https://neuralseek.ai/ai-grounded

Once the right documents are chosen and capped, there is a second, subtler question: how much of each one to actually send. Forward too little and you clip the answer mid-thought, cutting off the sentence that contained the real explanation. Forward too much and you pay for paragraphs of surrounding text that have nothing to do with the question. Snippet Size is the dial that gets this balance right, and across millions of calls it is one of the most direct levers you have on both quality and cost.

## What it actually does

For every retrieved source, Snippet Size determines how large a slice of surrounding text accompanies the matched passage. Smaller snippets are lean and cheap, sending just the matched region. Larger snippets preserve more of the context around the match, which matters when an answer depends on the full paragraph — a procedure where the steps build on each other, or a clause whose meaning hinges on the sentence before it. The setting decides how generous that window of surrounding text is.
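To make the idea of a "window" concrete, here is a minimal sketch of how a snippet could be cut around a matched passage. It is an illustration under stated assumptions, not NeuralSeek's implementation: the `extract_snippet` helper is hypothetical, and it measures size in characters for simplicity, whereas a real system may count tokens or sentences.

```python
def extract_snippet(text: str, match_start: int, match_end: int, snippet_size: int) -> str:
    """Return the matched span plus surrounding text, up to snippet_size characters.

    Hypothetical helper for illustration only; not a NeuralSeek API.
    """
    match_len = match_end - match_start
    # Never cut into the match itself, even if the budget is smaller than the match.
    window = max(snippet_size, match_len)
    # Split the spare budget roughly evenly around the match.
    left_pad = (window - match_len) // 2
    start = max(match_start - left_pad, 0)
    end = min(start + window, len(text))
    # If the window ran past the end of the text, slide it left to use the budget.
    if end - start < window:
        start = max(end - window, 0)
    return text[start:end]


text = ("Open the valve. Wait ten seconds. "
        "Close the valve only after the gauge reads zero.")
match = "Wait ten seconds."
start = text.index(match)
end = start + len(match)

lean = extract_snippet(text, start, end, snippet_size=17)       # just the matched region
generous = extract_snippet(text, start, end, snippet_size=200)  # the whole paragraph
```

With the small budget the model sees only the matched sentence; with the large one it also sees the closing step and its condition, which is the kind of qualifying text a procedure depends on.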

## Why business teams care

This is a direct lever on both accuracy and spend. Right-sized snippets give the model exactly enough to answer completely without paying for filler. Too-small snippets cause incomplete or subtly wrong answers because the model never saw the qualifying sentence; too-large snippets inflate token costs and can even hurt accuracy by surrounding the relevant text with distractions. At high volume, trimming snippet bloat saves real, recurring money, and tighter context often improves answers by keeping the model focused on the words that matter.
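The cost side is simple arithmetic, sketched below. Every number here is a placeholder chosen to keep the math easy to follow, not a measurement or a real price: the call volume, documents per call, tokens per snippet, and price per million tokens are all assumptions, and `context_cost` is a hypothetical helper.

```python
def context_cost(calls: int, docs_per_call: int, tokens_per_snippet: int,
                 price_per_million_tokens: float) -> float:
    """Estimate spend on retrieved context alone (excludes prompt and output tokens)."""
    total_tokens = calls * docs_per_call * tokens_per_snippet
    return total_tokens * price_per_million_tokens / 1_000_000


# Placeholder inputs: 1,000,000 calls, 5 sources per call, 1.00 per million tokens.
bloated = context_cost(1_000_000, 5, tokens_per_snippet=400, price_per_million_tokens=1.0)
trimmed = context_cost(1_000_000, 5, tokens_per_snippet=150, price_per_million_tokens=1.0)
savings = bloated - trimmed
```

Because snippet size multiplies with both the number of sources and the number of calls, a modest reduction per snippet compounds across the whole workload.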

## How to tune it in practice

Let the structure of your content guide you. If your knowledge base is full of self-contained sentences and short facts, lean toward smaller snippets. If it's full of multi-step procedures, legal clauses, or arguments that unfold across a paragraph, go larger so you never clip the part that carries the meaning. The test is simple: are answers ever incomplete in a way that a little more surrounding text would have fixed? If so, increase the size. If answers are already complete but token costs are higher than you want, trim it.
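One way to turn that test into a repeatable check is to run a fixed set of questions at several snippet sizes, have a reviewer mark each answer complete or incomplete, and pick the smallest size that passes. The sketch below assumes you already have those labels; `smallest_complete_size`, the sizes, and the labels are all hypothetical.

```python
from typing import Dict, List, Optional


def smallest_complete_size(results: Dict[int, List[bool]],
                           required_rate: float = 1.0) -> Optional[int]:
    """Return the smallest snippet size whose answers meet the completeness bar.

    results maps a snippet size to per-question completeness labels
    (True = the answer was complete). Returns None if no size qualifies.
    """
    for size in sorted(results):
        labels = results[size]
        if labels and sum(labels) / len(labels) >= required_rate:
            return size
    return None


# Placeholder review labels for three questions at three candidate sizes.
review = {
    100: [True, False, True],   # one answer clipped: too small
    300: [True, True, True],    # all complete
    600: [True, True, True],    # also complete, but costs more
}
chosen = smallest_complete_size(review)
```

Choosing the smallest passing size follows the rule in the text: grow the snippet until answers stop being clipped, then stop paying for more.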

## Common failure modes it prevents

The two failures sit at opposite extremes. 'Clipped context' happens when snippets are too small and the model answers from a fragment, confidently missing the caveat in the next sentence. 'Context bloat' happens when snippets are too large and the model's attention is diluted across irrelevant text while you pay for every token. A well-tuned Snippet Size steers between the two, delivering complete answers at an efficient price.

## Where it fits in the stack

Snippet Size is the third member of the context-budgeting trio, working alongside Max Docs and Re-Rank. Re-Rank chooses the best sources, Max Docs limits how many of them are sent, and Snippet Size controls how much of each one travels along. Together they decide the precise shape of the evidence the model receives — and therefore the ceiling on how good and how affordable the answer can be.
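The division of labor among the three settings can be sketched as a small pipeline. This is a simplified illustration, not NeuralSeek's code: the `Hit` type and `build_context` function are hypothetical, sorting by a precomputed score stands in for Re-Rank, and snippet size is counted in characters.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    text: str         # full source paragraph
    match_start: int  # where the matched passage begins
    match_end: int    # where the matched passage ends
    score: float      # relevance score (assumed to come from a re-ranker)


def build_context(hits: List[Hit], max_docs: int, snippet_size: int) -> List[str]:
    # Re-Rank (stand-in): order sources from most to least relevant.
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    # Max Docs: cap how many sources are forwarded.
    kept = ranked[:max_docs]
    # Snippet Size: decide how much of each kept source travels along.
    snippets = []
    for hit in kept:
        pad = max(snippet_size - (hit.match_end - hit.match_start), 0) // 2
        start = max(hit.match_start - pad, 0)
        end = min(hit.match_end + pad, len(hit.text))
        snippets.append(hit.text[start:end])
    return snippets


hits = [
    Hit("aaa KEY bbb", 4, 7, score=0.2),
    Hit("ccc KEY ddd", 4, 7, score=0.9),
    Hit("eee KEY fff", 4, 7, score=0.5),
]
tight = build_context(hits, max_docs=2, snippet_size=3)
wide = build_context(hits, max_docs=2, snippet_size=11)
```

The total context sent to the model is bounded by roughly `max_docs` times `snippet_size`, which is why the two settings are usually tuned together.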

## Tuned per use case

Procedural or legal content often needs generous snippets so steps and clauses stay intact and nothing critical is severed; short factual lookups thrive on tight ones that keep latency and cost low. The setting flexes to the shape of your content rather than forcing every workflow into the same mold.

> Context is a budget, not a buffet. Spend it on the words that change the answer.

## The takeaway

Snippet Size tunes how much of each source the model sees — enough to answer fully, never so much that you pay for noise — making it one of the highest-leverage settings in the grounding pipeline.

---

From NeuralSeek's AI Grounded — practical, web-verified guidance on building governed, grounded enterprise AI. NeuralSeek is the model-agnostic, governed AI platform you own: any LLM (swap with no rebuild), your data in your own tenant (cloud or on-prem), 118 guardrails enforced before any action, one container that runs anywhere.