| Author(s) | Year | Key Finding | Cited In |
|---|---|---|---|
| Liu et al. | 2024 | “Lost in the Middle”: 30%+ accuracy drop for information placed in mid-context positions | P2, P10 |
| Wu et al. (MIT) | 2025 | U-shaped attention curve caused by causal masking and RoPE; architectural, not patchable by prompting | P2 |
| Hong et al. | 2023 | MetaGPT: structured artifacts reduce errors ~40% vs. free dialogue in multi-agent systems | P4, P7 |
| Voyce | 2025 | Prompt structure accounts for up to 40% of performance variance independent of content | P3, P4 |
| Ranjan et al. | 2024 | LLM vocabulary acts as a routing signal in embedding space, activating domain-specific clusters | P6, P10 |
| Zamfirescu-Pereira et al. (CHI) | 2023 | “Why Johnny Can’t Prompt”: positive example + negative example + reason is the strongest prompt structure | P5 |
| LangChain | 2024 | 3 well-chosen examples match 9 in effectiveness for few-shot prompting | P3 |
| PRISM | 2024 | Brief personas (<50 tokens) outperform elaborate ones; flattery actively degrades output quality | P6, P8, P10 |
| DeepMind | 2025 | Multi-agent scaling: effectiveness saturates at 3–4 agents; with 7+ agents, performance degrades below the 4-agent level | P9 |
| Captain Agent | 2024 | Adaptive team composition outperforms static composition by 15–25% across benchmarks | P9 |
| MAST Framework | 2024–2025 | 14 failure modes catalogued across communication (4), coordination (5), and quality (5) categories | P7, P8 |
| Anthropic | 2026 | Self-evaluation fails: a generator shares its evaluator’s biases, requiring separate reviewer agents | P6 |
| Anthropic Skill Creator | 2025 | “Explain why things are important in lieu of heavy-handed MUSTs”: BECAUSE clauses outperform imperatives | P5, P10 |
| Vaarta Analytics | 2026 | Structured atomic checks reduce false negatives vs. free-text review at scale | P10 |
| Rasheed et al. | 2024 | “Can LLMs Replace Data Scientists?”: a single expert agent outperforms panels for analytics tasks | P9 |
| Du et al. | 2024 | Multi-agent debate improves reasoning accuracy on structured problems with verifiable answers | P6 |
| Anthropic Harness Design | 2026 | Separation of generator and evaluator prevents shared-bias failures in quality assurance workflows | P6 |