Tootfinder

Opt-in global Mastodon full-text search.

No exact results. Similar results found.
@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:45:31

Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs
Yining Hong, Huang Huang, Manling Li, Li Fei-Fei, Jiajun Wu, Yejin Choi
arxiv.org/abs/2602.21198 arxiv.org/pdf/2602.21198 arxiv.org/html/2602.21198
arXiv:2602.21198v1 Announce Type: new
Abstract: Embodied LLMs endow robots with high-level task reasoning, but they cannot reflect on what went wrong or why, turning deployment into a sequence of independent trials where mistakes repeat rather than accumulate into experience. Drawing upon human reflective practitioners, we introduce Reflective Test-Time Planning, which integrates two modes of reflection: reflection-in-action, where the agent uses test-time scaling to generate and score multiple candidate actions using internal reflections before execution; and reflection-on-action, which uses test-time training to update both its internal reflection model and its action policy based on external reflections after execution. We also include retrospective reflection, allowing the agent to re-evaluate earlier decisions and perform model updates with hindsight for proper long-horizon credit assignment. Experiments on our newly-designed Long-Horizon Household benchmark and MuJoCo Cupboard Fitting benchmark show significant gains over baseline models, with ablative studies validating the complementary roles of reflection-in-action and reflection-on-action. Qualitative analyses, including real-robot trials, highlight behavioral correction through reflection.

@Dragofix@veganism.social
2026-02-25 23:40:53

Study reveals hidden climate impact of digital industries #climate

@UP8@mastodon.social
2026-02-25 16:01:41

🪤 France Bets on Carbon Capture as North Sea Rivals Surge Ahead
oilprice.com/Energy/Energy-Gen

@NFL@darktundra.xyz
2026-03-25 19:31:50

NFL proposes rule to assist possible replacement officials for 2026 season nytimes.com/athletic/7147157/2

@nemobis@mamot.fr
2026-03-25 14:25:32

The EU is looking for short fixes to reduce gas demand and has failed to find any.
euronews.com/my-europe/2026/03
China is ex…

@AprilTeachy@bildung.social
2026-02-26 15:15:00

Ooh, digital independence is gaining momentum elsewhere too!
Leave big tech behind! How to replace Amazon, Google, X, Meta, Apple – and more
theguardian.com/technology/202

@ubuntourist@mastodon.social
2026-02-26 18:37:38

Leave big tech behind! How to replace Amazon, Google, X, Meta, Apple – and more;
A handful of companies monopolize the web, with unprecedented access to our data. But there are many more ethical – and often distinctively European – alternatives
theg…

@NFL@darktundra.xyz
2026-03-25 18:16:25

NFL, referees union inch toward showdown as replacement officials loom

cbssports.com/nfl/news/nfl-ref

@NFL@darktundra.xyz
2026-03-26 19:10:28

Tom Brady reveals he previously looked into potential NFL comeback nfl.com/news/tom-brady-reveals

@NFL@darktundra.xyz
2026-03-26 18:02:12

Lavonte David's remarkable run, plus questions for Maxx Crosby, Puka Nacua and replacement refs?! nytimes.com/athletic/7149599/2