AI advances surge forward even as experts scramble to explain them
SAN DIEGO — For a week, scholars, startup founders, and researchers from global tech giants gathered in sunny San Diego for what’s considered the premier event in artificial intelligence. The Neural Information Processing Systems conference, known as NeurIPS, held its 39th edition this year and drew a record 26,000 attendees, roughly double the turnout from six years earlier.
Since its inception in 1987, NeurIPS has explored neural networks and the crossovers among computation, neurobiology, and physics. What began as a niche academic fixation on brain-inspired networks has become a cornerstone of modern AI, and NeurIPS has grown from a small hotel gathering into a massive event that shares a convention center with Comic-Con.
Even as the conference thrived alongside a booming AI industry and featured sessions on increasingly specialized topics such as AI-generated music, a persistent thread ran through the discussions: the mystery of how frontier AI systems actually work.
Most leading researchers and industry leaders freely acknowledge that they don’t fully understand today’s top AI models. A dedicated research field, interpretability, seeks to illuminate how these systems function.
Shriyash Upadhyay, an AI researcher and co-founder of the interpretability-focused startup Martian, described interpretability as a field still in its infancy: “There’s a lot of ferment and competing agendas, and people don’t yet share a single, clear picture of what the field should be.” He contrasted it with established sciences, where breakthroughs typically refine existing measurements; interpretability is still wrestling with foundational questions such as, “What does it mean for an AI to be interpretable?” To spur progress, Martian and NeurIPS unveiled a $1 million prize aimed at advancing the field.
Throughout the conference, interpretability work from major tech companies revealed a spectrum of approaches to decoding increasingly capable systems. Earlier in the week, Google signaled a shift away from exhaustive reverse-engineering toward more practical, real-world impact.
Neel Nanda, a leader of Google’s interpretability team, said the ambitious goal of fully reverse-engineering models remains distant, and that the team now wants its work to deliver tangible benefits within about a decade. He pointed to rapid progress in some areas and slower-than-expected advances in others as the reasons for reorienting the team’s strategy.
By contrast, OpenAI’s interpretability chief, Leo Gao, described a deeper, more ambitious effort to understand neural networks at a fundamental level.
Adam Gleave, co-founder of FAR.AI, a nonprofit focused on AI safety and education, was skeptical that large-scale neural networks will ever be explained in full. Still, he believes progress will come through practical, multi-layered understandings of model behavior that can make systems more reliable and trustworthy.
Gleave also welcomed the machine-learning community’s stronger focus on safety and alignment, though he pointed to the sheer size of the NeurIPS sessions devoted to capability-building as a sign that the field’s priorities still skew toward making models more powerful.
Beyond interpretability, researchers are questioning how to measure AI systems’ current and future capabilities. Sanmi Koyejo, a Stanford professor and director of the Trustworthy AI Research Lab, argued that existing benchmarks were built for earlier, narrower problems and fail to capture broad general intelligence or reasoning, underscoring the need for tools and tests that meaningfully assess AI behavior across diverse tasks.
The same measurement challenge applies to AI tools used in biology and other sciences. Ziv Bar-Joseph of Carnegie Mellon, who leads GenBio AI, emphasized that biology-focused evaluations are at a very early stage: the community is still working out the right evaluation framework, and even what should be evaluated in the first place.
Despite what remains unclear about how these systems operate and how to quantify their progress, researchers see rapid strides in AI’s capacity to accelerate scientific discovery. Upadhyay likened it to inventing tools before fully understanding the underlying physics: progress can outpace complete comprehension and still create real-world impact.
For the fourth consecutive year, NeurIPS hosted a dedicated workshop on AI for science, exploring how AI can drive breakthroughs in biology, chemistry, materials science, and physics. Organizers, including Harvard Ph.D. student Ada Fang, called the event a resounding success. Fang noted that while frontier AI for science spans multiple disciplines, the core challenges and ideas are shared, and the aim is to discuss both the breakthroughs and the limits.
Jeff Clune, a pioneer in applying AI to scientific research and a professor at the University of British Columbia, described an explosion of interest in AI-driven learning and discovery for science. The level of engagement today is unprecedented, he said, with longtime researchers expressing renewed enthusiasm for using AI to tackle some of humanity’s most pressing problems. Clune called the shift heartwarming and hopeful, noting that AI’s capabilities have reached a point where taking on big problems finally feels within reach.
Jared Perlo reports on AI for NBC News, supported by the Tarbell Center for AI Journalism.