Probing Dec-POMDP Reasoning in Cooperative MARL
Kale-ab Tessera, Leonard Hinckeldey, Riccardo Zamboni, David Abel, Amos Storkey
https://arxiv.org/abs/2602.20804 https://arxiv.org/pdf/2602.20804 https://arxiv.org/html/2602.20804
arXiv:2602.20804v1 Announce Type: new
Abstract: Cooperative multi-agent reinforcement learning (MARL) is typically framed as a decentralised partially observable Markov decision process (Dec-POMDP), a setting whose hardness stems from two key challenges: partial observability and decentralised coordination. Genuinely solving such tasks requires Dec-POMDP reasoning, where agents use history to infer hidden states and coordinate based on local information. Yet it remains unclear whether popular benchmarks actually demand this reasoning or permit success via simpler strategies. We introduce a diagnostic suite combining statistically grounded performance comparisons and information-theoretic probes to audit the behavioural complexity of baseline policies (IPPO and MAPPO) across 37 scenarios spanning MPE, SMAX, Overcooked, Hanabi, and MaBrax. Our diagnostics reveal that success on these benchmarks rarely requires genuine Dec-POMDP reasoning. Reactive policies match the performance of memory-based agents in over half the scenarios, and emergent coordination frequently relies on brittle, synchronous action coupling rather than robust temporal influence. These findings suggest that some widely used benchmarks may not adequately test core Dec-POMDP assumptions under current training paradigms, potentially leading to over-optimistic assessments of progress. We release our diagnostic tooling to support more rigorous environment design and evaluation in cooperative MARL.
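The abstract does not say how its information-theoretic probes are computed, so what follows is only an illustrative sketch and not the authors' released tooling. It assumes discrete actions and uses a plain plug-in estimate of mutual information between two agents' actions taken at the same step, which is one simple way to quantify the "synchronous action coupling" the abstract mentions. The function name and the toy action logs are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical action logs (assumption: binary actions, 8 time steps).
agent_a = [0, 1, 0, 1, 0, 1, 0, 1]
agent_b_coupled = [0, 1, 0, 1, 0, 1, 0, 1]      # mirrors agent A at each step
agent_b_independent = [0, 0, 1, 1, 0, 0, 1, 1]  # statistically unrelated to A

coupled = mutual_information(agent_a, agent_b_coupled)
independent = mutual_information(agent_a, agent_b_independent)
print(f"coupled: {coupled:.3f} bits, independent: {independent:.3f} bits")
```

Shifting one sequence by a step before calling the function (for example `agent_a[:-1]` against `agent_b[1:]`) would give a crude lagged variant, closer in spirit to the "temporal influence" the abstract contrasts with synchronous coupling. Plug-in estimates are biased upward on short logs, so a real audit would need many more samples or a bias correction.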
"This is one aspect of what campaigners call “process as punishment”, an approach that now dominates the treatment of protest groups. Even if you are never convicted of a crime, your life is made hell if you dare, visibly and publicly, to dissent." -- @…
I can verify from the experience o…
“My father warned us, ‘When evil men plot, good men must plan. When evil men burn and bomb, good men must build and bind,’” Bernice King, the daughter of the Rev. Dr. Martin Luther King Jr., wrote of Pretti’s murder. “What we are witnessing now (masked raids, people taken without due process, vigilante, Gestapo, and slave patrol-like tactics, normalized under the color of law) is a moral crisis.”
#Trump