Laude Institute launches Slingshots grants for AI evaluation

The Laude Institute just dropped its first batch of Slingshots grants, targeting one of AI's thorniest problems: how to actually measure what these systems can do. The accelerator program is backing 15 projects focused on AI evaluation, offering the kind of compute power and engineering support that most academic researchers can only dream of.

The debut Slingshots program marks the Laude Institute's first major move into AI evaluation, and it lands at a critical moment. As AI capabilities expand across every sector, the industry still lacks a reliable answer to a fundamental question: how do you measure what these systems can actually do?

The institute announced 15 projects on Thursday, each tackling different pieces of the AI evaluation puzzle. Unlike traditional academic grants that leave researchers scrambling for compute resources, Slingshots offers the full package - funding, massive compute power, and dedicated engineering support that would make most university labs jealous.

The catch? Recipients need to deliver something concrete, whether that's a startup, open-source code, or another tangible artifact. It's a hybrid model that bridges the gap between academic research and Silicon Valley's move-fast mentality.

Several projects in the cohort should ring bells for anyone following AI development. Terminal Bench is back with its command-line coding benchmark, while the ARC-AGI project continues its long-running quest to create meaningful AGI tests.

But the really interesting action is happening with the newer approaches. Formula Code, a collaboration between Caltech and UT Austin researchers, is building evaluations specifically for AI agents' code optimization skills. Meanwhile, Columbia's BizBench wants to create comprehensive benchmarks for "white-collar AI agents" - the kind that might soon be handling your expense reports or client emails.

The star power extends beyond just the projects. SWE-Bench co-founder John Boda Yang is leading CodeClash, a dynamic competition-based framework that builds on his previous success in AI code evaluation. Yang's worried about something that should keep the entire industry up at night: benchmarks becoming proprietary company tools rather than shared scientific standards.

"I do think people continuing to evaluate on core third-party benchmarks drives progress," Yang told TechCrunch. "I'm a little bit worried about a future where benchmarks just become specific to companies."

That concern hits at the heart of why Slingshots matters. As OpenAI, Google, and Microsoft race to build increasingly capable AI systems, independent evaluation becomes crucial for understanding what these models can actually do - and more importantly, what they can't.

The program's focus on AI evaluation isn't accidental. While flashy demos and headline benchmark scores grab attention, the hard work of rigorous, independent testing often gets overlooked. Yet without reliable evaluation methods, the industry is essentially flying blind, making claims about AI capabilities that may not hold up under scrutiny.

Other Slingshots projects are exploring equally critical areas. Some are diving into reinforcement learning structures, while others tackle model compression - the art of making AI systems smaller and more efficient without sacrificing performance. Each represents a different bet on where AI development needs the most help.

The accelerator model itself signals a shift in how AI research gets funded and executed. Traditional academic timelines, measured in years, don't match the breakneck pace of AI development. By offering startup-level resources with academic rigor, Slingshots could become a template for bridging that gap.

The Laude Institute's Slingshots program arrives at a pivotal moment for AI development. As the technology races ahead, the ability to rigorously evaluate AI systems becomes more critical than ever. By funding 15 diverse projects with serious resources, the program could help ensure that AI evaluation keeps pace with AI development - preventing the industry from building systems we can't properly understand or measure.
