
The State of Large-Scale Foundation Models in Robotics

Our third Zurich Robotics Roundtable included Google DeepMind and Mimic Robotics.

Written by

Yves Albers

Published

January 28, 2026

Yesterday, Roboto, Founderful, and Google hosted the third Zurich Robotics Roundtable. The event builds on the success of our previous roundtable, which featured Raffaello D’Andrea, founder of Kiva Systems (now Amazon Robotics) and Verity and Professor of Dynamic Systems and Control at ETH Zurich.

This time, over 100 roboticists, researchers, and founders from across Zurich gathered for a lively discussion on foundation models in robotics.

I had the privilege of moderating a fireside chat with Michael Neunert (Google DeepMind) and Elvis Nava (mimic), exploring where large-scale, end-to-end models are already delivering and where reality still lags behind the hype.

A few clear themes emerged. Real-world data remains king: the highest-performing Vision-Language-Action (VLA) models still rely heavily on data collected from actual robots. That approach works, but it’s expensive, slow, and hard to scale. Video-action models are starting to change that. Leveraging pre-trained video models, such as NVIDIA’s Cosmos-Predict2, can simplify data collection and begin to encode some of the structure of the physical world.

Even with these advances, models still struggle with physics. Forces, friction, inertia, and even simple spatial reasoning, such as determining whether a glass is to the left or right of another object, remain challenging. Robust world models and deeper physical grounding are still missing.

The energy in the room was palpable. Someone remarked, “I’ve never had to queue to get into a robotics event before. This feels more like a party.” Whether robotics is having its “ChatGPT moment” is up for debate, but it’s definitely having its Zurich moment.

For Roboto, with our teams split between Seattle and Zurich, it’s exciting to see both cities emerging as hotbeds for robotics. These roundtables aren’t just about sharing ideas; they’re about building a community and ecosystem that can drive robotics forward.

A huge thank you to Edo Treccani (Founderful) and Gianmaria Sbetta (Google) for co-organizing and sponsoring the event. We’re already looking forward to the next one.