2025-10-01 11:33:07
AI Playing Business Games: Benchmarking Large Language Models on Managerial Decision-Making in Dynamic Simulations
Berdymyrat Ovezmyradov
https://arxiv.org/abs/2509.26331 https:…
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[3/5]:
- RS-OOD: A Vision-Language Augmented Framework for Out-of-Distribution Detection in Remote Sensing
Chenhao Wang, Yingrui Ji, Yu Meng, Yunjian Zhang, Yao Zhu
Convergence and Divergence of Language Models under Different Random Seeds
Finlay Fehlauer (ETH Zurich), Kyle Mahowald (University of Texas at Austin), Tiago Pimentel (ETH Zurich)
https://arxiv.org/abs/2509.26643
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[5/5]:
- HAMLET: Switch your Vision-Language-Action Model into a History-Aware Policy
Myungkyu Koo, Daewon Choi, Taeyoung Kim, Kyungmin Lee, Changyeon Kim, Younggyo Seo, Jinwoo Shin
"Very sad day for crow enthusiasts" says #pastagang
SPADE: A Large Language Model Framework for Soil Moisture Pattern Recognition and Anomaly Detection in Precision Agriculture
Yeonju Lee, Rui Qi Chen, Joseph Oboamah, Po Nien Su, Wei-zhen Liang, Yeyin Shi, Lu Gan, Yongsheng Chen, Xin Qiao, Jing Li
https://arxiv.org/abs/2509.18123
Synergizing Static Analysis with Large Language Models for Vulnerability Discovery and beyond
Vaibhav Agrawal, Kiarash Ahi
https://arxiv.org/abs/2509.15433 https://
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[5/7]:
- CoFFT: Chain of Foresight-Focus Thought for Visual Language Models
Zhang, Dong, Zhang, Jia, Dang, Fernando, Liu, Shou
I am really interested in the linked.art initiative, but I’m finding it hard to wrap my head around it. What strikes me as particularly odd:
* The types-of-types pattern, which creates JSON structures that are very unlike usual JSON properties.
* The AATization of everything, including things like language tags, for which perfectly fine native RDF patterns exist.
Has anyone worked with it? Are there good Getting Started guides?
Man I hate the new language for a task one doesn't want to do: "aversive".
Primary reason is that it externalizes the problem. It's fixed mindset framing, that something _else_ has to change to make the situation better, that it's the task's fault for being aversive, instead of our own for not working on the aversion to the needed task.
We have a bad habit lately of attributing relational traits to the party in a relationship, rather than to the relationship itself and it weakens our thinking about things. Noticing this pattern really changes how you think, because you can start attributing things better and noticing the contexts — and contexts are often things that can be changed!
My brilliant (almost) 13-year-old is learning about verbal nouns and adjectives in his language studies class
This morning I discovered that he pronounces "gerund" with the stress pattern of "Gerard"* not "Jared"
And now I'm doubting *my own* pronunciation of that word
* His choice is almost certainly influenced by his current favorite rocker Gerard Way
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[2/7]:
- LFTR: Learning-Free Token Reduction for Multimodal Large Language Models
Zihui Zhao, Yingxin Li, Yang Li
TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics
Yi Han, Cheng Chi, Enshen Zhou, Shanyu Rong, Jingkun An, Pengwei Wang, Zhongyuan Wang, Lu Sheng, Shanghang Zhang
https://arxiv.org/abs/2510.07181
Automating Code Generation for Semiconductor Equipment Control from Developer Utterances with LLMs
Youngkyoung Kim, Sanghyeok Park, Misoo Kim, Gangho Yoon, Eunseok Lee, Simon S. Woo
https://arxiv.org/abs/2509.13055
this should keep yous out of trouble for a while :luna_moth: 🎹
#liveCoding
Finally, what Xia & Lindell call a "separation problem" is, in our view, a feature of our approach and not a bug.
If, e.g., all languages in a family are polysynthetic (or none are), that’s not a statistical artefact – it’s the signal. The outcome is well associated with genealogy, showing that family membership captures something genuinely informative about the process. When the model finds that family explains a large share of the variance, that's not a failure – it's evidence that phylogenetic structure dominates the pattern.
So while Xia & Lindell insist that "autocorrelation due to relationships and distance cannot be captured in family or regional-level analyses", we see that as an empirical question – and we treated it as one.
The real test is whether a mixed model that explicitly represents phylogeny and geography performs worse than their alternative, where the entire shared history of languages and environments is effectively collapsed into a single dimension (an eigenvector).
In other words: we model relationships – Xia & Lindell summarise them into one number per language.
Language-Instructed Reasoning for Group Activity Detection via Multimodal Large Language Model
Jihua Peng, Qianxiong Xu, Yichen Liu, Chenxi Liu, Cheng Long, Rui Zhao, Ziyue Li
https://arxiv.org/abs/2509.16054
Reasoning Pattern Matters: Learning to Reason without Human Rationales
Chaoxu Pang, Yixuan Cao, Ping Luo
https://arxiv.org/abs/2510.12643 https://arxiv.org…
Crosslisted article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[1/2]:
- SGAligner : Cross-Modal Language-Aided 3D Scene Graph Alignment
Binod Singh, Sayan Deb Sarkar, Iro Armeni
Poseidon: A OneGraph Engine
Brad Bebee, Ümit V. Çatalyürek, Olaf Hartig, Ankesh Khandelwal, Simone Rondelli, Michael Schmidt, Lefteris Sidirourgos, Bryan Thompson
https://arxiv.org/abs/2510.11166
Talking with Oompa Loompas: A novel framework for evaluating linguistic acquisition of LLM agents
Sankalp Tattwadarshi Swain, Anshika Krishnatray, Dhruv Kumar, Jagat Sesh Challa
https://arxiv.org/abs/2509.07389
i made some nice satdee morning tune for yous
Crosslisted article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[2/2]:
- CaTS-Bench: Can Language Models Describe Numeric Time Series?
Luca Zhou, Pratham Yashwante, Marshall Fisher, Alessio Sampieri, Zihao Zhou, Fabio Galasso, Rose Yu
Architecting Resilient LLM Agents: A Guide to Secure Plan-then-Execute Implementations
Ron F. Del Rosario, Klaudia Krawiecka, Christian Schroeder de Witt
https://arxiv.org/abs/2509.08646
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[4/5]:
- Investigating Traffic Accident Detection Using Multimodal Large Language Models
Ilhan Skender, Kailin Tong, Selim Solmaz, Daniel Watzenig
Understanding Economic Tradeoffs Between Human and AI Agents in Bargaining Games
Crystal Qian, Kehang Zhu, John Horton, Benjamin S. Manning, Vivian Tsai, James Wexler, Nithum Thain
https://arxiv.org/abs/2509.09071
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[2/8]:
- SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlea...
Chen, Deng, Zheng, Yan, Liu, Wu, Jiang, Liu, Hu
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[8/8]:
- The Better You Learn, The Smarter You Prune: Towards Efficient Vision-language-action Models via ...
Jiang, Jiang, Ma, Wen, Li, Zhan, Jia, Liu, Sun, Lang
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[4/4]:
- Evaluating the Robustness of Open-Source Vision-Language Models to Domain Shift in Object Captioning
Federico Tavella, Amber Drinkwater, Angelo Cangelosi
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[1/5]:
- Video-based Sign Language Recognition without Temporal Segmentation
Jie Huang, Wengang Zhou, Qilin Zhang, Houqiang Li, Weiping Li
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[5/5]:
- TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics
Han, Chi, Zhou, Rong, An, Wang, Wang, Sheng, Zhang
Crosslisted article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[1/1]:
- Sample-efficient Integration of New Modalities into Large Language Models
Osman Batur İnce, André F. T. Martins, Oisin Mac Aodha, Edoardo M. Ponti
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[3/7]:
- AutoDrive-QA: A Multiple-Choice Benchmark for Vision-Language Evaluation in Urban Autonomous Driving
Boshra Khalili, Andrew W. Smyth