Physical Intelligence Just Introduced Robots That Learn on the Job

Physical Intelligence describes how its new π*0.6 model adapts on hardware using an RL method called Recap

A robot runs an espresso machine for thirteen hours straight in a lab. It folds mixed laundry at a rate of ~3 minutes per item, and assembles boxes at ~2.5 minutes per unit. These runs were posted by Physical Intelligence (PI) as evidence of its new model, π*0.6, which the company says “more than doubles throughput over a base model trained without RL”.

If accurate, this is among the clearest claims yet of robots that learn and improve through real-world experience rather than being pre-programmed. The effort matches PI’s longstanding aim of “general purpose robots doing general purpose tasks in the real world,” as co-founder Brian Ichter put it at the Robotics Fellows Forum 2025.

This update builds on two years of work at the company and sits at the center of its effort to develop general-purpose robotic intelligence.

How PI Collected Its First Real-World Data

Physical Intelligence was founded in early 2024 by researchers including Brian Ichter, Karol Hausman, Chelsea Finn and Sergey Levine. The company describes its mission as building foundation models and learning algorithms “to bring general-purpose AI into the physical world.”

In November 2024 PI announced a US$400 million funding round at a US$2 billion valuation, with backers including Jeff Bezos and OpenAI.

Early work focused on collecting large volumes of data and constructing a generalist robot policy architecture. As Ichter described: “We started with a large open-source dataset and then we collected a lot of data through teleoperation.”

Through that work the company built its initial model, π₀, and then π₀.5, which could perform tasks in new homes. That capability progression set the stage for π*0.6’s claim of self-improvement.

Inside the π*0.6 Release

The company’s central claim is that π*0.6 uses a reinforcement-learning method dubbed Recap, in which a value function trained on all previous data scores candidate actions; the robot executes higher-scored actions, collects further on-robot experience, and incorporates teleoperator corrections for major mistakes.
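In rough outline, that loop resembles value-filtered action selection with human overrides. The sketch below is a toy illustration of that pattern only; every name and interface is hypothetical, and none of it reflects PI’s actual code or API.

```python
import random

# Toy sketch of a Recap-style loop: a value function trained on prior data
# scores candidate actions, the robot executes the highest-scored one,
# teleoperator corrections override major mistakes, and everything is
# logged as new training data. All names here are illustrative.

def value(state, action):
    # Stand-in value function (in practice, trained offline on all prior data).
    return -abs(state - action)

def propose_actions(policy, state, n=8):
    # Sample n candidate actions around the base policy's output.
    return [policy(state) + random.gauss(0, 0.5) for _ in range(n)]

def recap_step(policy, state, dataset, operator_fix=None):
    candidates = propose_actions(policy, state)
    best = max(candidates, key=lambda a: value(state, a))  # value-filtered action
    if operator_fix is not None:     # teleoperator correction for a major mistake
        best = operator_fix
    dataset.append((state, best))    # experience feeds the next training round
    return best

base_policy = lambda s: 0.9 * s
```

The key design point PI emphasises is the last line of the loop: executed actions and operator corrections go back into the dataset, so each training round starts from more (and better-filtered) experience than the last.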

PI says this loop “more than doubles throughput” on its hardest tasks and “roughly halves the task failure rate”. Videos show an espresso-machine run of ~13 hours, ~3 hours of laundry folding at ~3 minutes per item, and ~1 hour of box assembly at ~2.5 minutes per box.

In the keynote, Ichter said: “We’re still a bit away from these things really working in useful ways in the real world… performance is a big issue here, safety is a big issue.”

PI did not publish detailed teleoperator-intervention logs, hardware diversity metrics or full failure-case disclosures. These omissions matter because the claims rest on throughput and generalisation, both of which depend heavily on environment, hardware and supervision.

The company has also addressed execution issues before: in June 2025 it published work on Real-Time Chunking (RTC), a method for maintaining smooth robot motion under inference latency by overlapping action chunks.

That work complements the RL loop by addressing real-world constraints of latency and control.
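The scheduling idea behind overlapping chunks can be sketched in a few lines. This is a simplified illustration under assumed parameters (chunk length, latency), not PI’s implementation: RTC itself treats already-executed actions as frozen context and “inpaints” the rest of the new chunk for consistency, whereas this toy simply discards the stale overlap.

```python
def infer_chunk(t0, n=8):
    # Stand-in for a slow policy call: n future actions for steps t0 .. t0+n-1.
    return [float(t0 + i) for i in range(n)]

def rtc_like_loop(horizon=24, chunk_len=8, latency=3):
    """Overlap action chunks so the robot never idles during inference.

    Each inference 'arrives' `latency` control steps after it is triggered,
    so the robot keeps executing the current chunk in the meantime. The new
    chunk is predicted from the state at trigger time, so its first
    `latency` actions overlap steps that were already executed.
    """
    executed = []
    t = 0
    chunk = infer_chunk(t, chunk_len)
    while len(executed) < horizon:
        # Commit the next `latency` actions while the new chunk is in flight.
        executed.extend(chunk[:latency])
        new_chunk = infer_chunk(t, chunk_len)
        chunk = new_chunk[latency:]   # drop the overlapping (stale) prefix
        t += latency
    return executed[:horizon]
```

The point of the overlap is continuity: the robot always has committed actions to execute, so inference latency never forces it to pause mid-motion.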

The Broader Robotics Field

The broader robotics industry is pursuing similar ambitions. Skild AI, backed by Amazon and SoftBank, has announced a general-purpose model for robots trained on simulation and real data.

Figure AI and Covariant have disclosed generalist robotics efforts. But most public efforts emphasise supervised or imitation learning rather than reinforcement learning on physical robots with published throughput gains.

PI’s claims differ in emphasising on-robot learning and measured throughput improvements. But the question is how well that holds up across unseen hardware, unfamiliar settings and commercial deployment.

In his keynote, Ichter said: “There’s just so much diversity in the world… even a single home you realise how much complexity there really is.”

The company’s leadership has detailed remaining constraints. In their paper on RTC they wrote: “Modern AI systems increasingly require real-time performance. However, the high latency of state-of-the-art generalist models… poses a significant challenge.” 

“We want robots that perceive the world and put some action out… we want to be at a point where we actually understand the physical embodiment of each one… and imbue intelligence to each of these,” said Ichter.


Mukundan Sivaraj
Mukundan covers enterprise AI and the AI startup ecosystem for AIM Media House. Reach out to him at mukundan.sivaraj@aimmediahouse.com or Signal at mukundan.42.