Tootfinder

Opt-in global Mastodon full text search. Join the index!

@brian_gettler@mas.to
2026-01-09 13:16:19

Here's a good chunk of the assigned books for my grad seminar on Canadian history - a few golden oldies, but mostly newer stuff. I've got a good group and am looking forward to a fine term.
#histodons

10 books standing upright on a wooden desk, spines toward the camera. From left to right: Loo, Moved by the State; Luby, Dammed; Wright, Donald Creighton: A Life in History; Wickwire, At the Bridge; Berger, The Sense of Power; Owram, Promise of Eden; Hamon, The Audacity of His Enterprise; Lewis, Nerbas, Shaw, McGill in History; Young, Patrician Families and the Making of Quebec; and Dechêne, Power and Subsistence.
@Techmeme@techhub.social
2026-02-25 23:46:03

New York's AG sues Valve over its use of loot boxes, accusing the game developer of violating state gambling laws and threatening to addict children to gambling (Jonathan Stempel/Reuters)
reuters.com/legal/government/n

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:37:21

Probing Dec-POMDP Reasoning in Cooperative MARL
Kale-ab Tessera, Leonard Hinckeldey, Riccardo Zamboni, David Abel, Amos Storkey
arxiv.org/abs/2602.20804 arxiv.org/pdf/2602.20804 arxiv.org/html/2602.20804
arXiv:2602.20804v1 Announce Type: new
Abstract: Cooperative multi-agent reinforcement learning (MARL) is typically framed as a decentralised partially observable Markov decision process (Dec-POMDP), a setting whose hardness stems from two key challenges: partial observability and decentralised coordination. Genuinely solving such tasks requires Dec-POMDP reasoning, where agents use history to infer hidden states and coordinate based on local information. Yet it remains unclear whether popular benchmarks actually demand this reasoning or permit success via simpler strategies. We introduce a diagnostic suite combining statistically grounded performance comparisons and information-theoretic probes to audit the behavioural complexity of baseline policies (IPPO and MAPPO) across 37 scenarios spanning MPE, SMAX, Overcooked, Hanabi, and MaBrax. Our diagnostics reveal that success on these benchmarks rarely requires genuine Dec-POMDP reasoning. Reactive policies match the performance of memory-based agents in over half the scenarios, and emergent coordination frequently relies on brittle, synchronous action coupling rather than robust temporal influence. These findings suggest that some widely used benchmarks may not adequately test core Dec-POMDP assumptions under current training paradigms, potentially leading to over-optimistic assessments of progress. We release our diagnostic tooling to support more rigorous environment design and evaluation in cooperative MARL.
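The abstract's key diagnostic idea, that a benchmark only tests Dec-POMDP reasoning if reactive (current-observation-only) policies underperform memory-based ones, can be illustrated with a toy probe. This is not the paper's diagnostic suite; it is a hypothetical single-agent sketch where the reward-relevant signal appears one step before it must be acted on, so only a history-conditioned policy can score above chance:

```python
import random

# Toy memory probe (illustrative assumption, not the paper's tooling):
# at t=0 the agent observes a hidden bit; at t=1 the observation is
# blank and reward is 1 only if the action matches the bit. A reactive
# policy is at chance on this task; a history-based policy solves it.

def run_episode(policy):
    bit = random.randint(0, 1)
    history = [bit]                     # t=0: the signal is observable
    policy(bit, history)                # t=0 action does not matter
    history.append(None)                # t=1: blank observation
    action = policy(None, history)      # t=1 action is scored
    return 1 if action == bit else 0

def reactive(obs, history):
    # Conditions only on the current observation; must guess when blank.
    return obs if obs is not None else random.randint(0, 1)

def memory_based(obs, history):
    # Recovers the hidden bit from the first observation in history.
    return history[0]

random.seed(0)
n = 10_000
r_score = sum(run_episode(reactive) for _ in range(n)) / n
m_score = sum(run_episode(memory_based) for _ in range(n)) / n
print(f"reactive ~= {r_score:.2f}, memory-based = {m_score:.2f}")
```

If a benchmark scenario shows no such gap between the two policy classes, as the paper reports for over half of its 37 scenarios, success there plausibly does not require Dec-POMDP reasoning.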