Sometimes Models Just Do Things
Where AI meets impatience, management consulting meets existential crisis, and prompts meet their match.
industry-insights
The Executive Guide to LLM Incompetence: Where Your AI Will Embarrass You
Or: How to Avoid Having Your Board Meeting Derailed by a Chatbot That Can’t Count to Ten. There’s a moment in every executive’s AI journey that feels a bit like...
experiments
Long-Context Financial QA: An Empirical Evaluation of Large Language Models on Financial Document Analysis
Executive Summary: This whitepaper presents findings from an experimental evaluation of large language models with extended...
educational
The Judge Behind the Judge: Understanding AI Scoring Models
Introduction: The Silent Kingmakers of AI. In the increasingly crowded arena of AI capabilities, the models that evaluate other models may...
minimum-viable-intelligence
Minimum Viable Intelligence: When Trillion-Parameter Models Meet Five-Minute Solutions
I recently found myself staring at a list of AI research papers that Ilya Sutskever claimed are “all you need to...
industry-insights
The Great AI Agent Delusion: Why the Hype Doesn't Match Reality
In the breathless world of artificial intelligence discourse, few concepts have generated quite as much excitement, or venture capital, as AI...
experiments
DeepCredit Experiment: Evaluating LLM Performance on Hard CRR Questions
Introduction: This report details our latest DeepCredit experiment, which represents a significant step toward building a general credit risk analysis agent....
industry-insights
The Rise of Deep Research Agents: A Strategic Primer for Knowledge Services Executives
Executive Summary: Deep Research agents represent a new class of artificial intelligence systems that can autonomously navigate...
industry-insights
When AI Makes Humans the Bottleneck: Knowledge Work's Great Inversion
The data deluge was supposed to drown us. Instead, it’s the AI lifeguards making us realize how slowly we swim....
research
Two AI Futures Walk Into a Bar; One Says "I'll Transform Everything," the Other Says "I'm Just a Toaster"
There’s something delightful about reading two wildly different visions of our...
research
The Truth About AI "Deception": When Models Don't Say What They "Think"
Anthropic recently released a paper with the ominous title “Reasoning Models Don’t Always Say What They Think,” which...