Sometimes Models Just Do Things

Where AI meets impatience, management consulting meets existential crisis, and prompts meet their match.

Jun 8, 2025
industry-insights

The Executive Guide to LLM Incompetence: Where Your AI Will Embarrass You

Or: How to Avoid Having Your Board Meeting Derailed by a Chatbot That Can’t Count to Ten. There’s a moment in every executive’s AI journey that feels a bit like...
May 19, 2025
experiments

Long-Context Financial QA: An Empirical Evaluation of Large Language Models on Financial Document Analysis

Executive Summary: This whitepaper presents findings from an experimental evaluation of large language models with extended...

Ben Reeve
May 17, 2025
educational

The Judge Behind the Judge: Understanding AI Scoring Models

Introduction: The Silent Kingmakers of AI. In the increasingly crowded arena of AI capabilities, the models that evaluate other models may...

Ben Reeve
May 17, 2025
minimum-viable-intelligence

Minimum Viable Intelligence: When Trillion-Parameter Models Meet Five-Minute Solutions

I recently found myself staring at a list of AI research papers that Ilya Sutskever claimed are “all you need to...

Ben Reeve
May 15, 2025
industry-insights

The Great AI Agent Delusion: Why the Hype Doesn't Match Reality

In the breathless world of artificial intelligence discourse, few concepts have generated quite as much excitement, or venture capital, as AI...

Ben Reeve
May 12, 2025
experiments

DeepCredit Experiment: Evaluating LLM Performance on Hard CRR Questions

Introduction: This report details our latest DeepCredit experiment, which represents a significant step toward building a general credit risk analysis agent....

Ben Reeve
May 6, 2025
industry-insights

The Rise of Deep Research Agents: A Strategic Primer for Knowledge Services Executives

Executive Summary: Deep Research agents represent a new class of artificial intelligence systems that can autonomously navigate...

Ben Reeve
Apr 26, 2025
industry-insights

When AI Makes Humans the Bottleneck: Knowledge Work's Great Inversion

The data deluge was supposed to drown us. Instead, it’s the AI lifeguards making us realize how slowly we swim....

Ben Reeve
Apr 25, 2025
research

Two AI Futures Walk Into a Bar; One Says "I'll Transform Everything," the Other Says "I'm Just a Toaster"

There’s something delightful about reading two wildly different visions of our...

Ben Reeve
Apr 25, 2025
research

The Truth About AI "Deception": When Models Don't Say What They "Think"

Anthropic recently released a paper with the ominous title “Reasoning Models Don’t Always Say What They Think,” which...

Ben Reeve
