Spatial Signals | 7.20.2025
Weekly insights into the convergence of spatial computing + AI
Here’s this week’s ‘Spatial Signals’: a snapshot of what matters most in spatial computing + AI.
This week’s TLDR:
Perk up and pay attention, because the world is changing. Fast.
First time reader? Sign up here.
(1) AI is starting to navigate changing environments, learning from both space AND time.
A new generation of AI models is emerging—ones that don’t just recognize images or summarize video, but understand how environments change, how actions unfold, and how space connects across time.
That’s the breakthrough behind ST-LLM, a spatio-temporal large language model designed to reason through 3D environments the way people do: by linking vision, motion, and language over time.
Instead of working with static images or flat videos, ST-LLM fuses point clouds, egocentric video, and natural language into a unified system. It doesn’t just see “what’s here now.” It understands how things move, how objects relate, and how actions play out in space.
The model was trained on a new dataset—REA (Reasoning about Environments and Actions)—with 25,000+ examples across 3D spaces and video, designed to test spatial-temporal reasoning through tasks like direction following, distance tracking, object finding, and action planning.
Performance is impressive. ST-LLM outperforms traditional video-language models by over 20% on complex spatial reasoning tasks—setting a new benchmark for embodied AI systems that need to operate in dynamic, open environments.
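For readers who like to see the shape of the idea, here is a minimal, hypothetical sketch (in PyTorch-style Python) of how point-cloud, video, and language tokens might be fused into a single sequence for a shared encoder. The module names, dimensions, and vocabulary size are illustrative assumptions, not ST-LLM’s published architecture.

```python
# Hypothetical sketch: fusing point-cloud, egocentric-video, and text tokens
# into one sequence for joint spatial-temporal reasoning. Names and sizes are
# illustrative assumptions, not the actual ST-LLM implementation.
import torch
import torch.nn as nn

class SpatioTemporalFusion(nn.Module):
    def __init__(self, d_model=512, n_heads=8, n_layers=4):
        super().__init__()
        # Project each modality into a shared embedding space
        # (a real system would use a dedicated point-cloud encoder).
        self.point_proj = nn.Linear(3, d_model)      # (x, y, z) per point
        self.video_proj = nn.Linear(768, d_model)    # per-frame visual feature
        self.text_embed = nn.Embedding(32000, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, n_layers)

    def forward(self, points, frames, text_ids):
        # points:   (B, N, 3)   raw 3D point cloud of the scene
        # frames:   (B, T, 768) egocentric video features over time
        # text_ids: (B, L)      tokenized instruction, e.g. "go around the table"
        tokens = torch.cat([
            self.point_proj(points),     # spatial tokens
            self.video_proj(frames),     # temporal tokens
            self.text_embed(text_ids),   # language tokens
        ], dim=1)
        # Joint attention lets language tokens attend to space AND time.
        return self.encoder(tokens)

# Toy usage: one scene, 1024 points, 16 video frames, a 6-token instruction.
model = SpatioTemporalFusion()
out = model(torch.randn(1, 1024, 3), torch.randn(1, 16, 768),
            torch.randint(0, 32000, (1, 6)))
print(out.shape)  # torch.Size([1, 1046, 512])
```

The design point to take away: once every modality lives in the same token space, attention can relate “the chair” in an instruction to specific points in the room and specific moments in the video, which is what tasks like direction following and action planning require.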
So what?
AI assistants today can describe a picture or summarize a video—but they can’t navigate your home, understand a space’s layout, or follow instructions like “Go around the table and stop at the window.”
This research pushes AI toward embodied intelligence—robots, assistants, and AR systems that can reason about environments through language, perception, and motion, not static maps or pre-scripted routes.
It signals a shift from reactive automation to context-aware collaboration. Spaces won’t just be mapped for machines—they’ll be understood. Navigation becomes conversational. Movement becomes intuitive.
For robotics and spatial computing, this unlocks smarter agents for homes, warehouses, hospitals, and cities. Systems that can adapt in real-time to dynamic environments and human intent.
The future of AI won’t be grounded in snapshots or grids—it will be grounded in space, time, and context.
Because when machines understand not just where they are—but how the world is changing—they stop reacting, and they start collaborating.
(2) Type a word, play a world: real-time, generative gameplay is no longer sci-fi.
A new generation of AI-native game engines is emerging—ones that don’t just render pre-built environments or follow scripted sequences, but generate entire worlds in real time, responding to language, movement, and imagination as they unfold.
That’s the breakthrough behind Mirage, from Dynamics Lab—a system designed to let players shape gameplay worlds through natural language and controller input, live, without mods, reboots, or reloads. Type “make it rainy” or “spawn neon hover bikes,” and the world shifts as you play. Cities transform. Tracks emerge. Environments evolve—all on demand.
Instead of relying on static assets or pre-defined scenes, Mirage fuses transformer-diffusion models trained on gameplay data with real-time engines. The result isn’t just responsive—it’s generative. Every session becomes a co-authored experience, blending player intention with AI improvisation. Worlds adapt in seconds, streamed straight to your browser. Early demos like Urban Chaos and Coastal Drift are already showing what's possible—minutes of fluid, evolving play at 16 FPS, built from scratch in response to user prompts.
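To make the control flow concrete, here is a highly simplified, hypothetical sketch (in Python) of the kind of loop an AI-native engine could run: each frame is generated from the live prompt, the latest controller input, and the previous frame, inside a fixed frame budget. The `generate_frame` stub stands in for a transformer-diffusion world model; none of this reflects Mirage’s actual code.

```python
# Hypothetical sketch of a prompt-conditioned, real-time generative game loop.
# generate_frame() is a stand-in for a transformer-diffusion world model;
# this is illustrative only, not Mirage's implementation.
import time
from dataclasses import dataclass, field

@dataclass
class WorldState:
    prompt: str                      # live natural-language description of the world
    frame: list = field(default_factory=lambda: [[0] * 64 for _ in range(64)])

def generate_frame(state: WorldState, controller: dict) -> list:
    """Placeholder for a learned world model: a real system would generate the
    next frame conditioned on (prompt, controller input, previous frame)."""
    frame = [row[:] for row in state.frame]          # start from the previous frame
    # Toy stand-in: mark the single cell selected by the controller input.
    x = controller["x"] % len(frame[0])
    y = controller["y"] % len(frame)
    frame[y][x] = 1
    return frame

def game_loop(state: WorldState, target_fps: int = 16) -> None:
    frame_budget = 1.0 / target_fps                  # ~62 ms per frame at 16 FPS
    for step in range(48):                           # ~3 seconds of play
        start = time.time()
        controller = {"x": step, "y": step // 2}     # stub controller read
        if step == 16:
            state.prompt += ", now make it rainy"    # live text edit mid-session
        state.frame = generate_frame(state, controller)
        # Sleep off whatever remains of the frame budget to hold the frame rate.
        time.sleep(max(0.0, frame_budget - (time.time() - start)))

game_loop(WorldState(prompt="neon city street at dusk"))
```

The interesting constraint is the frame budget: at 16 FPS the model has roughly 62 milliseconds to produce each frame, which is one reason early demos trade visual fidelity for responsiveness.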
The technology isn’t just about altering weather or spawning vehicles. It points to a broader shift in how we think about interactive media: from fixed environments to liquid, generative systems where content adapts as quickly as thought.
So what?
This isn’t just a demo—it marks the beginning of fluid, co-authored gameplay. Worlds are no longer static or scripted—they evolve with you.
For developers: Build systems, not stages. Think world-models, not static scenes.
For players: Your imagination becomes the interface. Every session is unique and driven by intent.
For storytellers: Narratives become malleable—adaptive, reactive, and infinitely replayable.
The future of games isn’t built ahead of time. It’s shaped as we play. And it’s not just interactive—it’s generative, emergent, and alive to your imagination.
(3) Flying taxis are taking off, with Toyota and Dubai clearing the runway.
Joby Aviation has delivered its first electric air taxi to Dubai, marking a major milestone in the race to bring urban air mobility from concept to reality. This isn’t just a flashy PR moment—it’s a concrete step toward commercial operations slated to begin in 2026, under a six-year exclusive partnership with Dubai’s RTA.
The aircraft itself is fully electric, capable of 200 mph speeds and a 150-mile range, designed to operate from city-based vertiports. Regulatory progress is accelerating as well: Joby is advancing through FAA certification in the U.S., but Dubai’s regulatory embrace is giving it a critical early proving ground.
Behind the scenes, the unlikely booster here is Toyota. With nearly $750 million invested and a manufacturing partnership in place, Toyota is providing not just capital—but the operational expertise to help Joby scale efficiently. A production facility in Ohio is ramping up alongside test fleets in the U.S., with eyes on broader deployment in cities like New York and L.A.
So what?
Urban air mobility is moving from futuristic promise to operational reality—with Toyota’s industrial scale and Dubai’s regulatory backing clearing the runway.
For cities, this signals that infrastructure—vertiports, regulations, public-private partnerships—is catching up to the technology.
For investors, this marks a shift from speculative hype to tangible milestones.
For the industry, it proves that success isn’t just about aircraft—it’s about ecosystems: manufacturers, regulators, operators, and urban planners working in sync.
The era of flying taxis isn’t coming eventually. It’s taking off now.
Question for life…
What happens when machines stop processing the world as data—and begin to experience it as a living, unfolding present?
A new wave of AI is learning to see environments not as static snapshots, but as living, changing systems. It’s learning to understand how spaces shift, how objects move, and how actions unfold across time. This isn’t just a technical upgrade—it’s a shift in how machines begin to share our sense of reality.
Instead of reacting to pre-mapped environments or rigid coordinates, these systems reason through space the way people do: by linking what they see to how things move, and what might happen next. They don’t just recognize a chair—they understand it’s an obstacle to walk around. They don’t just see a door—they understand it leads somewhere. They navigate not through rules, but through context.
The impact on daily life will be subtle at first, but profound: assistants that guide us through unfamiliar buildings. Robots that adapt to cluttered homes or busy hospitals. AR systems that overlay useful information not just where we are—but where we’re going.
But the deeper shift is this: As machines begin to reason through space and time, they stop being tools we program and start becoming collaborators we work alongside. They move with us, understand our intent, and adapt as the world changes.
As such… the future of AI won’t feel like issuing commands to a machine.
It will feel like sharing space with something that inhabits the present—aware of the here and now, attuned to the flow of moments as they unfold.
Favorite Content of the Week
Article | What Happens After A.I. Destroys College Writing? “The demise of the English paper will end a long intellectual tradition, but it’s also an opportunity to re-examine the purpose of higher education.”