Today at #CHR2025, I will be presenting our work on the evaluation of the historical adequacy of masked language models (MLMs) for #Latin. There are several models like this, and they represent the current state of the art for a number of downstream tasks, like semantic change and text reuse detection. However, a h…
Documents: OpenAI is asking contractors to upload their work from current or previous jobs to evaluate its models, leaving it to them to scrub confidential info (Wired)
https://www.wired.com/story/openai-contractor-upload-real-work-documents-ai-agents/
Beyond Revenue and Welfare: Counterfactual Analysis of Spectrum Auctions with Application to Canada's 3800MHz Allocation
Sara Jalili Shani, Kris Joseph, Michael B. McNally, James R. Wright
https://arxiv.org/abs/2512.08106 https://arxiv.org/pdf/2512.08106 https://arxiv.org/html/2512.08106
arXiv:2512.08106v1 Announce Type: new
Abstract: Spectrum auctions are the primary mechanism through which governments allocate scarce radio frequencies, with outcomes that shape competition, coverage, and innovation in telecommunications markets. While traditional models of spectrum auctions often rely on strong equilibrium assumptions, we take a more parsimonious approach by modeling bidders as myopic and straightforward: in each round, firms simply demand the bundle that maximizes their utility given current prices. Despite its simplicity, this model proves effective in predicting the outcomes of Canada's 2023 auction of 3800 MHz spectrum licenses. Using detailed round-by-round bidding data, we estimate bidders' valuations through a linear programming framework and validate that our model reproduces key features of the observed allocation and price evolution. We then use these estimated valuations to simulate a counterfactual auction under an alternative mechanism that incentivizes deployment in rural and remote regions, aligning with one of the key objectives set out in the Canadian Telecommunications Act. The results show that the proposed mechanism substantially improves population coverage in underserved areas. These findings demonstrate that a behavioral model with minimal assumptions is sufficient to generate reliable counterfactual predictions, making it a practical tool for policymakers to evaluate how alternative auction designs may influence future outcomes. In particular, our study demonstrates a method for counterfactual mechanism design, providing a framework to evaluate how alternative auction rules could advance policy goals such as equitable deployment across Canada.
toXiv_bot_toot
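A minimal sketch of the "myopic and straightforward" bidder described in the abstract above: in each round, every bidder demands the bundle that maximizes value minus cost at current prices, and prices rise on over-demanded items. The sketch assumes a simplified ascending-price auction with one unit per item. The bidder names, items, valuations, and the `increment` parameter are all hypothetical and are not taken from the paper. It does not reproduce the authors' valuation estimation or their counterfactual mechanism.

```python
def best_bundle(valuation, prices):
    """Return the bundle with the highest utility (value minus total price).

    `valuation` maps frozenset bundles to values. The empty bundle (utility 0)
    is the fallback, and ties keep the earlier-listed bundle.
    """
    best, best_utility = frozenset(), 0.0
    for bundle, value in valuation.items():
        utility = value - sum(prices[item] for item in bundle)
        if utility > best_utility:
            best, best_utility = bundle, utility
    return best


def run_auction(valuations, items, increment=1.0, max_rounds=1000):
    """Raise prices on over-demanded items until no item has excess demand."""
    prices = {item: 0.0 for item in items}
    for _ in range(max_rounds):
        # Myopic step: each bidder looks only at the current prices.
        demands = {name: best_bundle(v, prices) for name, v in valuations.items()}
        over_demanded = [
            item for item in items
            if sum(item in bundle for bundle in demands.values()) > 1  # supply is 1
        ]
        if not over_demanded:
            return demands, prices
        for item in over_demanded:
            prices[item] += increment
    raise RuntimeError("auction did not terminate within max_rounds")


# Hypothetical example: two bidders, two licences, single-item bundles only.
valuations = {
    "A": {frozenset({"urban"}): 10.0, frozenset({"rural"}): 4.0},
    "B": {frozenset({"urban"}): 7.0, frozenset({"rural"}): 3.0},
}
demands, prices = run_auction(valuations, ["urban", "rural"])
print(demands, prices)
```

Because bidders never anticipate future price changes, the whole simulation is a loop over best responses to posted prices, with no equilibrium computation.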
LMArena, which runs a leaderboard ranking AI models based on their performance, raised $150M at a $1.7B post-money valuation, taking its total funding to $250M (The Information)
https://www.theinformation.com/articles/ai-evaluation-…
🎲 TextBandit: Evaluating Probabilistic Reasoning in LLMs Through Language-Only Decision Tasks
#llm
Can you guess which country's EV charging strategy prioritises market-based price signals to guide charging and discharging?
#V2G
The evolution of influence operations from crude Russian troll farms to sophisticated AI systems using large language models;
the discovery of GoLaxy documents revealing a "Smart Propaganda System" that collects millions of data points daily, builds psychological profiles, and generates resilient personas;
the fundamental challenges of measuring effectiveness;
GoLaxy's ties to Chinese intelligence agencies;
operations targeting Hong Kong's…
A brilliant start to #FF2025 - Rachel's keynote, Megan and Amy's work on AI with the Bodleian, and now Lucie Termignon and Simon Zilinskas on https://comparia.beta.gouv.fr/ - evaluate AI models by com…
Replaced article(s) found for cs.GT. https://arxiv.org/list/cs.GT/new
[1/1]:
- Cumulative Games: Who is the current player?
Urban Larsson, Reshef Meir, Yair Zick
https://arxiv.org/abs/2005.06326
- Contest Design with Threshold Objectives
Edith Elkind, Abheek Ghosh, Paul W. Goldberg
https://arxiv.org/abs/2109.03179
- Deep Learning Meets Mechanism Design: Key Results and Some Novel Applications
V. Udaya Sankar, Vishisht Srihari Rao, Y. Narahari
https://arxiv.org/abs/2401.05683 https://mastoxiv.page/@arXiv_csGT_bot/111741115483021453
- Charting the Shapes of Stories with Game Theory
Daskalakis, Gemp, Jiang, Leme, Papadimitriou, Piliouras
https://arxiv.org/abs/2412.05747 https://mastoxiv.page/@arXiv_csGT_bot/113627246220336424
- Computing Evolutionarily Stable Strategies in Multiplayer Games
Sam Ganzfried
https://arxiv.org/abs/2511.20859 https://mastoxiv.page/@arXiv_csGT_bot/115620508246637361
- Autodeleveraging: Impossibilities and Optimization
Tarun Chitra
https://arxiv.org/abs/2512.01112 https://mastoxiv.page/@arXiv_csGT_bot/115649040881525135
- Static Pricing Guarantees for Queueing Systems
Jacob Bergquist, Adam N. Elmachtoub
https://arxiv.org/abs/2305.09168 https://mastoxiv.page/@arXiv_csDS_bot/110382625621173269
- Game of arrivals at a two queue network with heterogeneous customer routes
Agniv Bandyopadhyay, Sandeep Juneja
https://arxiv.org/abs/2310.18149 https://mastoxiv.page/@arXiv_csPF_bot/111322112226936579
- Characterization of Priority-Neutral Matching Lattices
Clayton Thomas
https://arxiv.org/abs/2404.02142 https://mastoxiv.page/@arXiv_econTH_bot/112205968984928881
- Seven kinds of equivalent models for generalized coalition logics
Zixuan Chen, Fengkui Ju
https://arxiv.org/abs/2501.05466 https://mastoxiv.page/@arXiv_csLO_bot/113819715349259373
- Matching Markets Meet LLMs: Algorithmic Reasoning with Ranked Preferences
Hadi Hosseini, Samarth Khanna, Ronak Singh
https://arxiv.org/abs/2506.04478 https://mastoxiv.page/@arXiv_csAI_bot/114635186215388479
This white paper provides a comprehensive analysis of modern warfare through five interconnected characteristics that have been prominently displayed throughout the Ukraine conflict:
- The rise of autonomous systems and their impact on force architecture
- The information domain as a critical battleground
- Electronic warfare and spectrum superiority
- The challenges of sustaining logistics in contested environments
- The evolution of air defense strategy
…