# Source Jump Penalty: stop answers stitched from unrelated docs

> Source Jump Penalty penalizes answers stitched together across unrelated documents — a classic recipe for confident nonsense.

**Category:** Hallucination Prevention
**Author:** NeuralSeek Team · **Published:** June 9, 2026
**Canonical:** https://neuralseek.ai/ai-grounded/source-jump-penalty
**Section index:** https://neuralseek.ai/ai-grounded

One of the subtlest and most dangerous forms of hallucination is the 'frankenstein' answer: a single sentence whose first half comes from one document and whose second half comes from another, entirely unrelated one. Each fragment is real and individually traceable, which is exactly why this failure is so treacherous: it passes naive grounding checks while asserting something that no single source actually supports. Source Jump Penalty targets precisely this kind of unsupported splicing.

## What it actually does

The guardrail detects when an answer hops between documents that don't genuinely belong together and penalizes that stitching. Rather than letting the model freely combine fragments from across the candidate set, it biases the system toward answers that hold together coherently within a related group of sources. An answer grounded in one document, or in a set of clearly related documents, scores well; an answer assembled from disconnected pieces is penalized for the jump.
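To make the idea concrete, here is a minimal sketch of how a jump penalty could be computed. It is an illustration under stated assumptions, not NeuralSeek's implementation: the function `jump_penalty`, the `strength` parameter, the document ids, and the representation of relatedness as a set of document pairs are all hypothetical.

```python
def jump_penalty(fragment_sources, related, strength=1.0):
    """Return a penalty in [0, strength] for hops between unrelated documents.

    fragment_sources: document id for each answer fragment, in answer order.
    related: set of frozenset({doc_a, doc_b}) pairs treated as related.
    All names here are hypothetical, chosen for illustration only.
    """
    # A "hop" is each transition from one fragment's source to the next.
    hops = list(zip(fragment_sources, fragment_sources[1:]))
    if not hops:
        return 0.0
    # A hop is a bad jump when it changes document and the pair is not related.
    bad = sum(
        1 for a, b in hops
        if a != b and frozenset((a, b)) not in related
    )
    return strength * bad / len(hops)


# Hypothetical document ids.
related = {frozenset(("hr-policy", "hr-faq"))}

single_source = jump_penalty(["hr-policy", "hr-policy"], related)
related_sources = jump_penalty(["hr-policy", "hr-faq"], related)
stitched = jump_penalty(["hr-policy", "pricing-sheet"], related)
```

In this sketch, the single-source answer and the answer drawn from two related documents incur no penalty, while the answer that hops from `hr-policy` to the unrelated `pricing-sheet` takes the full penalty.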

## Why business teams care

Stitched answers are uniquely hazardous because every individual piece is true and traceable to a real document, so they sail through grounding checks that only verify whether each fragment exists somewhere. The problem is the combination, which asserts a relationship or conclusion that none of the sources actually makes. Penalizing source jumps closes this loophole, protecting against confident answers that are technically 'sourced' yet entirely manufactured.

## How to tune it in practice

Tune the penalty against how interconnected your knowledge base really is. In a base where each document is self-contained and authoritative on its topic, a strong penalty enforces single-source coherence and prevents risky splicing. In a base where answers legitimately require synthesizing several closely related documents, a more moderate setting allows coherent multi-source answers while still penalizing jumps across genuinely unrelated material. The goal is to permit sensible synthesis while forbidding frankensteining.
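The trade-off between a strong and a moderate setting can be sketched with a toy scoring function. Everything here is an assumption made for illustration: the `strength` knob, the document-similarity values, the grounding score, and the pass cutoff `PASS_SCORE` are hypothetical and do not reflect actual NeuralSeek parameters.

```python
PASS_SCORE = 0.7  # hypothetical cutoff an answer must reach to be returned


def answer_score(grounding, hop_similarities, strength):
    """Subtract a penalty for each hop between documents.

    grounding: base grounding score in [0, 1] (hypothetical).
    hop_similarities: similarity in [0, 1] between the two documents at
        each hop; 1.0 means closely related, 0.0 means unrelated.
    strength: how harshly each hop is penalized.
    """
    penalty = sum(strength * (1.0 - s) for s in hop_similarities)
    return max(0.0, grounding - penalty)


# A synthesis across three closely related documents versus one unrelated jump.
synthesis_hops = [0.85, 0.85, 0.85]
unrelated_hop = [0.1]

strong, moderate = 1.0, 0.4

strong_allows_synthesis = answer_score(0.9, synthesis_hops, strong) >= PASS_SCORE
moderate_allows_synthesis = answer_score(0.9, synthesis_hops, moderate) >= PASS_SCORE
strong_allows_jump = answer_score(0.9, unrelated_hop, strong) >= PASS_SCORE
moderate_allows_jump = answer_score(0.9, unrelated_hop, moderate) >= PASS_SCORE
```

Under these assumed numbers, the strong setting rejects even the related multi-document synthesis, which suits a base of self-contained documents. The moderate setting lets that synthesis through while still rejecting the jump to unrelated material.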

## Common failure modes it prevents

The defining failure is the 'plausible composite' — a fluent statement built from two unrelated true fragments that together imply something false. A related failure is 'topic bleed,' where the model pulls a detail from a tangential document because it happened to be in context, attaching it to an answer where it doesn't belong. Source Jump Penalty discourages both by rewarding answers that stay within a coherent evidentiary neighborhood.

## Where it fits in the stack

Source Jump Penalty complements the semantic threshold and coverage controls by judging not just whether an answer is supported, but whether it's supported coherently. The threshold can be satisfied by an answer whose pieces each match something; this control adds the further requirement that those pieces actually belong together. It's the difference between 'every part is grounded' and 'the whole is grounded.'
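The distinction between 'every part is grounded' and 'the whole is grounded' can be sketched as two separate checks. This is a simplified illustration, not the product's logic: the fragment structure, the `match` scores, the threshold value, and the grouping of documents into related sets are all hypothetical.

```python
def every_part_grounded(fragments, threshold):
    """True when each fragment individually matches some source well enough."""
    return all(f["match"] >= threshold for f in fragments)


def whole_grounded(fragments, threshold, groups):
    """True when every part is grounded AND all sources share one related group."""
    if not every_part_grounded(fragments, threshold):
        return False
    sources = {f["source"] for f in fragments}
    # The answer is coherent only if a single group contains every source used.
    return any(sources <= group for group in groups)


# Hypothetical groups of related documents.
groups = [
    {"warranty-terms", "warranty-faq"},
    {"press-release-2019"},
]

# Both fragments match their sources strongly, but the sources are unrelated.
stitched = [
    {"source": "warranty-terms", "match": 0.92},
    {"source": "press-release-2019", "match": 0.88},
]

parts_ok = every_part_grounded(stitched, threshold=0.8)
whole_ok = whole_grounded(stitched, threshold=0.8, groups=groups)
```

Here the stitched answer passes the per-fragment check but fails the whole-answer check, which is the gap the coherence requirement is meant to close.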

## Coherence as a control

By rewarding answers grounded in a coherent set of related material, the system produces responses that don't merely cite sources but cite them in a way that genuinely makes sense together. Coherence stops being an accident of how the model happened to combine fragments and becomes an enforced property of every answer.

> Two true fragments from two unrelated documents can combine into one confident lie.

## The takeaway

Source Jump Penalty blocks the frankenstein answer — keeping responses coherent within related sources instead of spliced across unrelated ones, and closing a loophole that naive grounding checks miss entirely.

---

From NeuralSeek's AI Grounded — practical, web-verified guidance on building governed, grounded enterprise AI. NeuralSeek is the model-agnostic, governed AI platform you own: any LLM (swap with no rebuild), your data in your own tenant (cloud or on-prem), 118 guardrails enforced before any action, one container that runs anywhere.
