| Author(s) | Year | Key Finding | Cited In |
|---|---|---|---|
| Liu et al. | 2024 | “Lost in the Middle” — 30%+ accuracy drop for information placed in mid-context positions | P2, P10 |
| Wu et al. (MIT) | 2025 | U-shaped attention curve caused by causal masking and RoPE — architectural, not patchable by prompting | P2 |
| Hong et al. | 2023 | MetaGPT: structured artifacts reduce errors by ~40% vs. free dialogue in multi-agent systems | P4, P7 |
| Voyce | 2025 | Prompt structure accounts for up to 40% of performance variance independent of content | P3, P4 |
| Ranjan et al. | 2024 | LLM vocabulary acts as a routing signal in embedding space, activating domain-specific clusters | P6, P10 |
| Zamfirescu-Pereira et al. (CHI) | 2023 | “Why Johnny Can’t Prompt” — positive example + negative example + reason is the strongest prompt structure | P5 |
| LangChain | 2024 | 3 well-chosen examples match 9 in effectiveness for few-shot prompting | P3 |
| PRISM | 2024 | Brief personas (<50 tokens) outperform elaborate ones; flattery actively degrades output quality | P6, P8, P10 |
| DeepMind | 2025 | Multi-agent scaling: effectiveness saturates at 3–4 agents; with 7+ agents, performance degrades below the 4-agent level | P9 |
| Captain Agent | 2024 | Adaptive team composition outperforms static composition by 15–25% across benchmarks | P9 |
| MAST Framework | 2024–2025 | 14 failure modes catalogued across communication (4), coordination (5), and quality (5) categories | P7, P8 |
| Anthropic | 2026 | Self-evaluation fails — a generator shares its evaluator’s biases, requiring separate reviewer agents | P6 |
| Anthropic Skill Creator | 2025 | “Explain why things are important in lieu of heavy-handed MUSTs” — BECAUSE clauses outperform imperatives | P5, P10 |
| Vaarta Analytics | 2026 | Structured atomic checks reduce false negatives vs. free-text review at scale | P10 |
| Rasheed et al. | 2024 | “Can LLMs Replace Data Scientists?” — a single expert agent outperforms panels for analytics tasks | P9 |
| Du et al. | 2024 | Multi-agent debate improves reasoning accuracy on structured problems with verifiable answers | P6 |
| Anthropic Harness Design | 2026 | Separation of generator and evaluator prevents shared-bias failures in quality assurance workflows | P6 |
Research Citation Index
Every principle in this series traces to published research. This index collects all 17 sources cited across the 10 articles.