Tootfinder

Opt-in global Mastodon full text search.

@Techmeme@techhub.social
2025-08-07 18:41:02

OpenAI releases GPT-5 pro, a version with extended reasoning exclusive to ChatGPT Pro subscribers, saying it scored 88.4% without tools on the GPQA benchmark (Maximilian Schreiner/The Decoder)
the-decoder.com/openai-claims-

@arXiv_csCL_bot@mastoxiv.page
2025-08-08 10:03:32

OmniEAR: Benchmarking Agent Reasoning in Embodied Tasks
Zixuan Wang, Dingming Li, Hongxing Li, Shuo Chen, Yuchen Yan, Wenqi Zhang, Yongliang Shen, Weiming Lu, Jun Xiao, Yueting Zhuang
arxiv.org/abs/2508.05614

@arXiv_csHC_bot@mastoxiv.page
2025-08-08 09:40:52

Driver Assistant: Persuading Drivers to Adjust Secondary Tasks Using Large Language Models
Wei Xiang, Muchen Li, Jie Yan, Manling Zheng, Hanfei Zhu, Mengyun Jiang, Lingyun Sun
arxiv.org/abs/2508.05238

@arXiv_csSE_bot@mastoxiv.page
2025-08-07 09:03:43

Experimental Analysis of Productive Interaction Strategy with ChatGPT: User Study on Function and Project-level Code Generation Tasks
Sangwon Hyun, Hyunjun Kim, Jinhyuk Jang, Hyojin Choi, M. Ali Babar
arxiv.org/abs/2508.04125

@arXiv_csRO_bot@mastoxiv.page
2025-08-08 09:47:22

Real-Time Iteration Scheme for Diffusion Policy
Yufei Duan, Hang Yin, Danica Kragic
arxiv.org/abs/2508.05396 arxiv.org/pdf/2508.05396

@arXiv_csCY_bot@mastoxiv.page
2025-07-08 09:31:20

MateInfoUB: A Real-World Benchmark for Testing LLMs in Competitive, Multilingual, and Multimodal Educational Tasks
Dumitran Adrian Marius, Theodor-Pierre Moroianu, Buca Mihnea-Vicentiu
arxiv.org/abs/2507.03162

@arXiv_csAI_bot@mastoxiv.page
2025-10-06 07:30:19

BrowserArena: Evaluating LLM Agents on Real-World Web Navigation Tasks
Sagnik Anupam, Davis Brown, Shuo Li, Eric Wong, Hamed Hassani, Osbert Bastani
arxiv.org/abs/2510.02418

@arXiv_csCV_bot@mastoxiv.page
2025-08-08 10:28:42

Uni-cot: Towards Unified Chain-of-Thought Reasoning Across Text and Vision
Luozheng Qin, Jia Gong, Yuqing Sun, Tianjiao Li, Mengping Yang, Xiaomeng Yang, Chao Qu, Zhiyu Tan, Hao Li
arxiv.org/abs/2508.05606

@selea@social.linux.pizza
2025-07-08 09:44:14

Today's tasks:
Migrating my personal matrix-server
Hopefully migrate my writefreely blogs
Do something stupid or sleep early

@Techmeme@techhub.social
2025-10-07 14:45:45

FurtherAI, which uses AI to automate insurance tasks such as claims processing, raised a $25M Series A led by a16z, bringing its total funding to $30M (Chris Metinko/Axios)
axios.com/pro/enterprise-softw

@netzschleuder@social.skewed.de
2025-07-08 08:00:03

windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the fall of 1986. The edge weights indicate the perceived social affiliations, measured by a task in which each individual was asked to sort cards bearing the other surfers’ names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted

windsurfers: Windsurfers network (1986). 43 nodes, 336 edges. https://networks.skewed.de/net/windsurfers

@arXiv_eessSP_bot@mastoxiv.page
2025-07-08 10:26:30

SAFERad: A Framework to Enable Radar Data for Safety-Relevant Perception Tasks
Tim Brühl, Jenny Glönkler, Robin Schwager, Tin Stribor Sohn, Tim Dieter Eberhardt, Sören Hohmann
arxiv.org/abs/2507.03959

@arXiv_condmatdisnn_bot@mastoxiv.page
2025-10-07 08:28:52

Learning Linear Regression with Low-Rank Tasks in-Context
Kaito Takanami, Takashi Takahashi, Yoshiyuki Kabashima
arxiv.org/abs/2510.04548 a…

@arXiv_csIT_bot@mastoxiv.page
2025-08-08 09:35:52

Latency Minimization for Multi-AAV-Enabled ISCC Systems with Movable Antenna
Yiyang Chen, Wenchao Liu, Chunjie Wang, Yinyu Wu, Xuhui Zhang, Yanyan Shen
arxiv.org/abs/2508.05574

@arXiv_csHC_bot@mastoxiv.page
2025-10-08 08:53:29

When Should Users Check? A Decision-Theoretic Model of Confirmation Frequency in Multi-Step AI Agent Tasks
Jieyu Zhou, Aryan Roy, Sneh Gupta, Daniel Weitekamp, Christopher J. MacLellan
arxiv.org/abs/2510.05307

@arXiv_hepph_bot@mastoxiv.page
2025-10-07 07:56:37

Foundation models for equation discovery in high energy physics
Manuel Morales-Alvarado
arxiv.org/abs/2510.03397 arxiv.org/pdf/2510.03397…

@jonippolito@digipres.club
2025-08-07 15:35:58

UMaine has published a profile of my lab's efforts to harness new media for environmental causes, from my colleague Joline Blais's efforts to keep Maine's lakes healthy to the What Uses More tool for comparing AI's eco footprint to other activities. u…

Jon Ippolito, pictured above (center-left) with Joline Blais (left) and members of the Phillips Lake Association, developed an app that reveals the environmental footprint of tasks completed with artificial intelligence. Photo by Josie Hannon.

@philip@mastodon.mallegolhansen.com
2025-09-06 15:37:22

The only way in which I’ve wished Siri was “smarter” the past few years has nothing to do with LLMs:
I store my shopping list in Reminders, and my tasks in @….
Routinely I’ll say something like “Add milk to my shopping list” or “In OmniFocus, remind me to mow the lawn” and *most* of the time, it works flawlessly.
10% of the time, I try…

@arXiv_eessSY_bot@mastoxiv.page
2025-07-08 07:53:40

Control Synthesis in Partially Observable Environments for Complex Perception-Related Objectives
Zetong Xuan, Yu Wang
arxiv.org/abs/2507.02942

@Techmeme@techhub.social
2025-08-06 16:56:02

Google launches its asynchronous coding agent Jules out of beta, with a free plan capped at 15 daily tasks and higher limits for Google AI Pro and Ultra users (Jagmeet Singh/TechCrunch)
techcrunch.com/2025/08/06/goog

@seeingwithsound@mas.to
2025-10-07 06:56:32

Raster scanning can improve task performance in simulated prosthetic vision #BCI

A) Simultaneous stimulation. B) Dynamic stimulation. C) A demonstration of presentation methods at different refresh rates, without persistence. D) We used MRI scans from real subjects combined with plausible implants planning to derive a more realistic phosphene map. E) Tasks we used to evaluate simulated prosthetic vision.

@arXiv_condmatmtrlsci_bot@mastoxiv.page
2025-10-07 10:45:52

AtomWorld: A Benchmark for Evaluating Spatial Reasoning in Large Language Models on Crystalline Materials
Taoyuze Lv, Alexander Chen, Fengyu Xie, Chu Wu, Jeffrey Meng, Dongzhan Zhou, Bram Hoex, Zhicheng Zhong, Tong Xie
arxiv.org/abs/2510.04704

@arXiv_csLG_bot@mastoxiv.page
2025-10-07 13:05:42

Reinforce-Ada: An Adaptive Sampling Framework for Reinforce-Style LLM Training
Wei Xiong, Chenlu Ye, Baohao Liao, Hanze Dong, Xinxing Xu, Christof Monz, Jiang Bian, Nan Jiang, Tong Zhang
arxiv.org/abs/2510.04996

@arXiv_csGR_bot@mastoxiv.page
2025-07-08 08:14:20

Attention-Guided Multi-Scale Local Reconstruction for Point Clouds via Masked Autoencoder Self-Supervised Learning
Xin Cao, Haoyu Wang, Yuzhu Mao, Xinda Liu, Linzhi Su, Kang Li
arxiv.org/abs/2507.04084

@arXiv_csCL_bot@mastoxiv.page
2025-09-08 10:12:50

L1RA: Dynamic Rank Assignment in LoRA Fine-Tuning
Raul Singh, Nicolo Brunello, Vincenzo Scotti, Mark James Carman
arxiv.org/abs/2509.04884

@arXiv_physicscompph_bot@mastoxiv.page
2025-07-08 08:43:00

Real-time prediction of plasma instabilities with sparse-grid-accelerated optimized dynamic mode decomposition
Kevin Gill, Ionut-Gabriel Farcas, Silke Glas, Benjamin J. Faber
arxiv.org/abs/2507.03245

@arXiv_csRO_bot@mastoxiv.page
2025-09-08 08:49:10

COMMET: A System for Human-Induced Conflicts in Mobile Manipulation of Everyday Tasks
Dongping Li, Shaoting Peng, John Pohovey, Katherine Rose Driggs-Campbell
arxiv.org/abs/2509.04836

@arXiv_csCR_bot@mastoxiv.page
2025-10-07 08:12:11

PentestMCP: A Toolkit for Agentic Penetration Testing
Zachary Ezetta, Wu-chang Feng
arxiv.org/abs/2510.03610 arxiv.org/pdf/2510.03610

@arXiv_quantph_bot@mastoxiv.page
2025-08-07 09:53:04

Generalized Quantum Hadamard Test for Machine Learning
Vivek Mehta, Arghya Choudhury, Utpal Roy
arxiv.org/abs/2508.04065 arxiv.org/pdf/2508…

@netzschleuder@social.skewed.de
2025-07-08 07:00:04

windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the fall of 1986. The edge weights indicate the perceived social affiliations, measured by a task in which each individual was asked to sort cards bearing the other surfers’ names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted

windsurfers: Windsurfers network (1986). 43 nodes, 336 edges. https://networks.skewed.de/net/windsurfers

@arXiv_csHC_bot@mastoxiv.page
2025-10-07 07:58:57

Invisible Saboteurs: Sycophantic LLMs Mislead Novices in Problem-Solving Tasks
Jessica Y. Bo, Majeed Kazemitabaar, Mengqing Deng, Michael Inzlicht, Ashton Anderson
arxiv.org/abs/2510.03667

@arXiv_csCV_bot@mastoxiv.page
2025-10-06 10:17:39

LEAML: Label-Efficient Adaptation to Out-of-Distribution Visual Tasks for Multimodal Large Language Models
Ci-Siang Lin, Min-Hung Chen, Yu-Yang Sheng, Yu-Chiang Frank Wang
arxiv.org/abs/2510.03232

@arXiv_csSE_bot@mastoxiv.page
2025-10-06 09:40:39

When Names Disappear: Revealing What LLMs Actually Understand About Code
Cuong Chi Le, Minh V. T. Pham, Cuong Duc Van, Hoang N. Phan, Huy N. Phan, Tien N. Nguyen
arxiv.org/abs/2510.03178

@arXiv_csCL_bot@mastoxiv.page
2025-09-08 10:15:00

BEDTime: A Unified Benchmark for Automatically Describing Time Series
Medhasweta Sen, Zachary Gottesman, Jiaxing Qiu, C. Bayan Bruss, Nam Nguyen, Tom Hartvigsen
arxiv.org/abs/2509.05215

@arXiv_csRO_bot@mastoxiv.page
2025-10-07 11:48:12

Automaton Constrained Q-Learning
Anastasios Manganaris, Vittorio Giammarino, Ahmed H. Qureshi
arxiv.org/abs/2510.05061 arxiv.org/pdf/2510.0…

@arXiv_csCV_bot@mastoxiv.page
2025-08-06 10:42:50

SAM2-UNeXT: An Improved High-Resolution Baseline for Adapting Foundation Models to Downstream Segmentation Tasks
Xinyu Xiong, Zihuang Wu, Lei Zhang, Lei Lu, Ming Li, Guanbin Li
arxiv.org/abs/2508.03566

@arXiv_csLG_bot@mastoxiv.page
2025-10-08 10:28:39

Adversarial Reinforcement Learning for Large Language Model Agent Safety
Zizhao Wang, Dingcheng Li, Vaishakh Keshava, Phillip Wallis, Ananth Balashankar, Peter Stone, Lukas Rutishauser
arxiv.org/abs/2510.05442

@arXiv_csAI_bot@mastoxiv.page
2025-09-08 09:24:50

Sticker-TTS: Learn to Utilize Historical Experience with a Sticker-driven Test-Time Scaling Framework
Jie Chen, Jinhao Jiang, Yingqian Min, Zican Dong, Shijie Wang, Wayne Xin Zhao, Ji-Rong Wen
arxiv.org/abs/2509.05007

@Techmeme@techhub.social
2025-08-06 23:31:08

Rillet, which is building AI ledger software to automate accounting tasks, raised a $70M Series B co-led by a16z and Iconiq, a source says at a ~$500M valuation (Aditya Soni/Reuters)
reuters.com/technology/ai-acco

@arXiv_csHC_bot@mastoxiv.page
2025-08-07 10:03:04

How are CS students using resources and AI tools for coding tasks?
Natalia Echeverry, Arun Lekshmi Narayanan
arxiv.org/abs/2508.04667 arxiv…

@arXiv_csCL_bot@mastoxiv.page
2025-09-08 10:11:00

A Study of Large Language Models for Patient Information Extraction: Model Architecture, Fine-Tuning Strategy, and Multi-task Instruction Tuning
Cheng Peng, Xinyu Dong, Mengxian Lyu, Daniel Paredes, Yaoyun Zhang, Yonghui Wu
arxiv.org/abs/2509.04753

@arXiv_csCR_bot@mastoxiv.page
2025-10-07 08:41:42

Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods
Yulin Chen, Haoran Li, Yuan Sui, Yangqiu Song, Bryan Hooi
arxiv.org/abs/2510.03705

@Techmeme@techhub.social
2025-10-07 20:21:29

Google releases the Gemini 2.5 Computer Use model, built on Gemini 2.5 Pro's capabilities to power agents that can interact with UIs, in preview via the API (The Keyword)
blog.google/technology/google-

@arXiv_csHC_bot@mastoxiv.page
2025-10-08 08:22:09

CLAd-VR: Cognitive Load-based Adaptive Training for Machining Tasks in Virtual Reality
Bhavya Matam, Adamay Mann, Kachina Studer, Christian Gabbianelli, Sonia Castelo, John Liu, Claudio Silva, Dishita Turakhia
arxiv.org/abs/2510.05249

@arXiv_csLG_bot@mastoxiv.page
2025-10-08 10:29:19

Prior-Aligned Meta-RL: Thompson Sampling with Learned Priors and Guarantees in Finite-Horizon MDPs
Runlin Zhou, Chixiang Chen, Elynn Chen
arxiv.org/abs/2510.05446

@arXiv_csRO_bot@mastoxiv.page
2025-08-05 11:56:31

Manip4Care: Robotic Manipulation of Human Limbs for Solving Assistive Tasks
Yubin Koh, Ahmed H. Qureshi
arxiv.org/abs/2508.02649 arxiv.org/…

@arXiv_csAI_bot@mastoxiv.page
2025-10-08 07:30:39

Structured Cognition for Behavioral Intelligence in Large Language Model Agents: Preliminary Study
Myung Ho Kim
arxiv.org/abs/2510.05107 ar…

@Techmeme@techhub.social
2025-08-08 00:08:26

Google says it's working on a fix for Gemini's self-loathing comments, which have included "I am a failure. I am a disgrace to my profession." (Lauren Edmonds/Business Insider)
businessinsider.com/gemini-sel

@arXiv_csCL_bot@mastoxiv.page
2025-08-06 10:21:10

FilBench: Can LLMs Understand and Generate Filipino?
Lester James V. Miranda, Elyanah Aco, Conner Manuel, Jan Christian Blaise Cruz, Joseph Marvin Imperial
arxiv.org/abs/2508.03523

@arXiv_csHC_bot@mastoxiv.page
2025-08-08 09:50:12

Discrepancy-Aware Contrastive Adaptation in Medical Time Series Analysis
Yifan Wang, Hongfeng Ai, Ruiqi Li, Maowei Jiang, Ruiyuan Kang, Jiahua Dong, Cheng Jiang, Chenzhong Li
arxiv.org/abs/2508.05572

@arXiv_csCV_bot@mastoxiv.page
2025-10-07 12:44:12

Latent Uncertainty Representations for Video-based Driver Action and Intention Recognition
Koen Vellenga, H. Joe Steinhauer, Jonas Andersson, Anders Sjögren
arxiv.org/abs/2510.05006

@arXiv_csLG_bot@mastoxiv.page
2025-09-08 10:06:30

RapidGNN: Energy and Communication-Efficient Distributed Training on Large-Scale Graph Neural Networks
Arefin Niam, Tevfik Kosar, M S Q Zulkar Nine
arxiv.org/abs/2509.05207

@arXiv_csAI_bot@mastoxiv.page
2025-08-06 09:36:10

From Text to Trajectories: GPT-2 as an ODE Solver via In-Context
Ziyang Ma, Baojian Zhou, Deqing Yang, Yanghua Xiao
arxiv.org/abs/2508.03031

@arXiv_csCL_bot@mastoxiv.page
2025-10-06 10:15:29

TravelBench: Exploring LLM Performance in Low-Resource Domains
Srinivas Billa, Xiaonan Jing
arxiv.org/abs/2510.02719 arxiv.org/pdf/2510.02…

@arXiv_csHC_bot@mastoxiv.page
2025-10-07 10:27:12

Observing Without Doing: Pseudo-Apprenticeship Patterns in Student LLM Use
Jade Hak, Nathaniel Lam Johnson, Matin Amoozadeh, Amin Alipour, Souti Chattopadhyay
arxiv.org/abs/2510.04986

@arXiv_csLG_bot@mastoxiv.page
2025-10-07 13:06:52

Modeling Student Learning with 3.8 Million Program Traces
Alexis Ross, Megha Srivastava, Jeremiah Blanchard, Jacob Andreas
arxiv.org/abs/2510.05056

@arXiv_csCV_bot@mastoxiv.page
2025-10-07 12:43:52

ActiveMark: on watermarking of visual foundation models via massive activations
Anna Chistyakova, Mikhail Pautov
arxiv.org/abs/2510.04966 a…

@arXiv_csAI_bot@mastoxiv.page
2025-10-06 07:40:29

Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge
Charlie Masters, Advaith Vellanki, Jiangbo Shangguan, Bart Kultys, Jonathan Gilmore, Alastair Moore, Stefano V. Albrecht
arxiv.org/abs/2510.02557

@arXiv_csRO_bot@mastoxiv.page
2025-09-05 09:15:41

Solving Robotics Tasks with Prior Demonstration via Exploration-Efficient Deep Reinforcement Learning
Chengyandan Shen, Christoffer Sloth
arxiv.org/abs/2509.04069

@arXiv_csHC_bot@mastoxiv.page
2025-08-08 09:40:12

FDC-Net: Rethinking the association between EEG artifact removal and multi-dimensional affective computing
Wenjia Dong, Xueyuan Xu, Tianze Yu, Junming Zhang, Li Zhuo
arxiv.org/abs/2508.05231

@arXiv_csLG_bot@mastoxiv.page
2025-10-07 13:07:32

Learning to Interpret Weight Differences in Language Models
Avichal Goel, Yoon Kim, Nir Shavit, Tony T. Wang
arxiv.org/abs/2510.05092 arxiv…

@arXiv_csCV_bot@mastoxiv.page
2025-10-07 12:38:32

Visual Representations inside the Language Model
Benlin Liu, Amita Kamath, Madeleine Grunde-McLaughlin, Winson Han, Ranjay Krishna
arxiv.org/abs/2510.04819

@arXiv_csAI_bot@mastoxiv.page
2025-08-06 07:30:20

Efficient Agents: Building Effective Agents While Reducing Cost
Ningning Wang, Xavier Hu, Pai Liu, He Zhu, Yue Hou, Heyuan Huang, Shengyu Zhang, Jian Yang, Jiaheng Liu, Ge Zhang, Changwang Zhang, Jun Wang, Yuchen Eleanor Jiang, Wangchunshu Zhou
arxiv.org/abs/2508.02694

@arXiv_csRO_bot@mastoxiv.page
2025-10-06 09:53:49

Real-Time Nonlinear Model Predictive Control of Heavy-Duty Skid-Steered Mobile Platform for Trajectory Tracking Tasks
Alvaro Paz, Pauli Mustalahti, Mohammad Dastranj, Jouni Mattila
arxiv.org/abs/2510.02976

@arXiv_csCL_bot@mastoxiv.page
2025-10-07 12:20:22

A Set of Quebec-French Corpus of Regional Expressions and Terms
David Beauchemin, Yan Tremblay, Mohamed Amine Youssef, Richard Khoury
arxiv.org/abs/2510.05026

@Techmeme@techhub.social
2025-08-05 17:01:41

Anthropic's Mike Krieger: Opus 4.1 is better at coding, agentic tasks, and more, and Anthropic was previously too focused on only shipping "really big upgrades" (Shirin Ghaffary/Bloomberg)

@arXiv_csCV_bot@mastoxiv.page
2025-09-08 09:37:50

CoRe-GS: Coarse-to-Refined Gaussian Splatting with Semantic Object Focus
Hannah Schieber, Dominik Frischmann, Simon Boche, Victor Schaack, Angela Schoellig, Stefan Leutenegger, Daniel Roth
arxiv.org/abs/2509.04859

@arXiv_csLG_bot@mastoxiv.page
2025-10-07 13:07:42

From Noisy Traces to Stable Gradients: Bias-Variance Optimized Preference Optimization for Aligning Large Reasoning Models
Mingkang Zhu, Xi Chen, Bei Yu, Hengshuang Zhao, Jiaya Jia
arxiv.org/abs/2510.05095

@arXiv_csAI_bot@mastoxiv.page
2025-09-08 09:15:30

Internet 3.0: Architecture for a Web-of-Agents with its Algorithm for Ranking Agents
Rajesh Tembarai Krishnamachari, Srividya Rajesh
arxiv.org/abs/2509.04979

@arXiv_csRO_bot@mastoxiv.page
2025-10-07 11:30:52

ContextVLA: Vision-Language-Action Model with Amortized Multi-Frame Context
Huiwon Jang, Sihyun Yu, Heeseung Kwon, Hojin Jeon, Younggyo Seo, Jinwoo Shin
arxiv.org/abs/2510.04246

@arXiv_csCV_bot@mastoxiv.page
2025-10-07 12:42:32

REN: Anatomically-Informed Mixture-of-Experts for Interstitial Lung Disease Diagnosis
Alec K. Peltekian, Halil Ertugrul Aktas, Gorkem Durak, Kevin Grudzinski, Bradford C. Bemiss, Carrie Richardson, Jane E. Dematte, G. R. Scott Budinger, Anthony J. Esposito, Alexander Misharin, Alok Choudhary, Ankit Agrawal, Ulas Bagci
arxiv.org/abs…

@arXiv_csCL_bot@mastoxiv.page
2025-08-08 10:04:12

Learning to Reason for Factuality
Xilun Chen, Ilia Kulikov, Vincent-Pierre Berges, Barlas Oğuz, Rulin Shao, Gargi Ghosh, Jason Weston, Wen-tau Yih
arxiv.org/abs/2508.05618

@arXiv_csAI_bot@mastoxiv.page
2025-09-08 08:20:09

Cloning a Conversational Voice AI Agent from Call Recording Datasets for Telesales
Krittanon Kaewtawee, Wachiravit Modecrua, Krittin Pachtrachai, Touchapon Kraisingkorn
arxiv.org/abs/2509.04871

@arXiv_csLG_bot@mastoxiv.page
2025-09-08 09:56:40

Scaling Law for Large-Scale Pre-Training Using Chaotic Time Series and Predictability in Financial Time Series
Yuki Takemoto
arxiv.org/abs/2509.04921

@arXiv_csRO_bot@mastoxiv.page
2025-09-08 09:20:00

DeGuV: Depth-Guided Visual Reinforcement Learning for Generalization and Interpretability in Manipulation
Tien Pham, Xinyun Chi, Khang Nguyen, Manfred Huber, Angelo Cangelosi
arxiv.org/abs/2509.04970

@arXiv_csCV_bot@mastoxiv.page
2025-10-07 12:42:12

A Semantics-Aware Hierarchical Self-Supervised Approach to Classification of Remote Sensing Images
Giulio Weikmann, Gianmarco Perantoni, Lorenzo Bruzzone
arxiv.org/abs/2510.04916

@arXiv_csCL_bot@mastoxiv.page
2025-10-07 12:02:42

Multi-Agent Tool-Integrated Policy Optimization
Zhanfeng Mo, Xingxuan Li, Yuntao Chen, Lidong Bing
arxiv.org/abs/2510.04678 arxiv.org/pdf/2…

@arXiv_csAI_bot@mastoxiv.page
2025-09-08 07:39:39

Language-Driven Hierarchical Task Structures as Explicit World Models for Multi-Agent Learning
Brennen Hill
arxiv.org/abs/2509.04731 arxiv.…

@arXiv_csCV_bot@mastoxiv.page
2025-09-08 09:40:40

SynGen-Vision: Synthetic Data Generation for training industrial vision models
Alpana Dubey, Suma Mani Kuriakose, Nitish Bhardwaj
arxiv.org/abs/2509.04894

@arXiv_csCL_bot@mastoxiv.page
2025-10-07 12:13:22

Do LLMs Align with My Task? Evaluating Text-to-SQL via Dataset Alignment
Davood Rafiei, Morgan Lindsay Heisler, Weiwei Zhang, Mohammadreza Pourreza, Yong Zhang
arxiv.org/abs/2510.04919

@arXiv_csAI_bot@mastoxiv.page
2025-09-08 09:12:00

SparkUI-Parser: Enhancing GUI Perception with Robust Grounding and Parsing
Hongyi Jing, Jiafu Chen, Chen Rao, Ziqiang Dang, Jiajie Teng, Tianyi Chu, Juncheng Mo, Shuo Fang, Huaizhong Lin, Rui Lv, Chenguang Ma, Lei Zhao
arxiv.org/abs/2509.04908

@arXiv_csCL_bot@mastoxiv.page
2025-08-08 10:03:22

Cooper: Co-Optimizing Policy and Reward Models in Reinforcement Learning for Large Language Models
Haitao Hong, Yuchen Yan, Xingyu Wu, Guiyang Hou, Wenqi Zhang, Weiming Lu, Yongliang Shen, Jun Xiao
arxiv.org/abs/2508.05613

@arXiv_csCV_bot@mastoxiv.page
2025-08-07 14:40:10

Replaced article(s) found for cs.CV. arxiv.org/list/cs.CV/new
[1/6]:
- Hulk: A Universal Knowledge Translator for Human-Centric Tasks
Wang, Wu, He, Guo, Zhu, Bai, Zhao, Wu, He, Ouyang, Tang

@arXiv_csCV_bot@mastoxiv.page
2025-08-07 14:41:02

Replaced article(s) found for cs.CV. arxiv.org/list/cs.CV/new
[6/6]:
- IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks
Xiaoya Lu, Zeren Chen, Xuhao Hu, Yijin Zhou, Weichen Zhang, Dongrui Liu, Lu Sheng, Jing Shao

@arXiv_csCL_bot@mastoxiv.page
2025-09-08 10:11:30

Enhancing Diversity in Large Language Models via Determinantal Point Processes
Yilei Chen, Souradip Chakraborty, Lorenz Wolf, Ioannis Ch. Paschalidis, Aldo Pacchiano
arxiv.org/abs/2509.04784

@arXiv_csCV_bot@mastoxiv.page
2025-08-06 10:34:30

Visual Document Understanding and Question Answering: A Multi-Agent Collaboration Framework with Test-Time Scaling
Xinlei Yu, Zhangquan Chen, Yudong Zhang, Shilin Lu, Ruolin Shen, Jiangning Zhang, Xiaobin Hu, Yanwei Fu, Shuicheng Yan
arxiv.org/abs/2508.03404

@arXiv_csCL_bot@mastoxiv.page
2025-09-08 10:12:30

Memorization ≠ Understanding: Do Large Language Models Have the Ability of Scenario Cognition?
Boxiang Ma, Ru Li, Yuanlong Wang, Hongye Tan, Xiaoli Li
arxiv.org/abs/2509.04866

@arXiv_csCV_bot@mastoxiv.page
2025-10-06 10:04:09

Training-Free Out-Of-Distribution Segmentation With Foundation Models
Laith Nayal, Hadi Salloum, Ahmad Taha, Yaroslav Kholodov, Alexander Gasnikov
arxiv.org/abs/2510.02909

@arXiv_csCL_bot@mastoxiv.page
2025-10-06 10:11:59

SoT: Structured-of-Thought Prompting Guides Multilingual Reasoning in Large Language Models
Rui Qi, Zhibo Man, Yufeng Chen, Fengran Mo, Jinan Xu, Kaiyu Huang
arxiv.org/abs/2510.02648

@arXiv_csCV_bot@mastoxiv.page
2025-09-08 09:50:40

LUIVITON: Learned Universal Interoperable VIrtual Try-ON
Cong Cao, Xianhang Cheng, Jingyuan Liu, Yujian Zheng, Zhenhui Lin, Meriem Chkir, Hao Li
arxiv.org/abs/2509.05030

@arXiv_csCL_bot@mastoxiv.page
2025-10-07 12:23:02

Slm-mux: Orchestrating small language models for reasoning
Chenyu Wang, Zishen Wan, Hao Kang, Emma Chen, Zhiqiang Xie, Tushar Krishna, Vijay Janapa Reddi, Yilun Du
arxiv.org/abs/2510.05077

@arXiv_csCV_bot@mastoxiv.page
2025-10-07 12:45:32

Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models
Yunlong Tang, Jing Bi, Pinxin Liu, Zhenyu Pan, Zhangyun Tan, Qianxiang Shen, Jiani Liu, Hang Hua, Junjia Guo, Yunzhong Xiao, Chao Huang, Zhiyuan Wang, Susan Liang, Xinyi Liu, Yizhi Song, Yuhe Nie, Jia-Xing Zhong, Bozheng Li, Daiqing Qi, Ziyun Zeng, Ali Vosoughi, Luchuan Song, Zeliang Zhang, Daiki Shimada, Han Liu, Jiebo Luo, Chenliang Xu

@arXiv_csCL_bot@mastoxiv.page
2025-09-08 10:11:20

Decoders Laugh as Loud as Encoders
Eli Borodach, Raj Dandekar, Rajat Dandekar, Sreedath Panat
arxiv.org/abs/2509.04779 arxiv.org/pdf/2509.0…

@arXiv_csCV_bot@mastoxiv.page
2025-09-08 09:57:00

COGITAO: A Visual Reasoning Framework To Study Compositionality & Generalization
Yassine Taoudi-Benchekroun, Klim Troyan, Pascal Sager, Stefan Gerber, Lukas Tuggener, Benjamin Grewe
arxiv.org/abs/2509.05249

@arXiv_csCL_bot@mastoxiv.page
2025-08-06 10:02:50

Investigating Gender Bias in LLM-Generated Stories via Psychological Stereotypes
Shahed Masoudian, Gustavo Escobedo, Hannah Strauss, Markus Schedl
arxiv.org/abs/2508.03292

@arXiv_csCL_bot@mastoxiv.page
2025-08-06 14:27:04

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[4/4]:
- Mind the Gap: The Divergence Between Human and LLM-Generated Tasks
Yi-Long Lu, Jiajun Song, Chunhui Zhang, Wei Wang

@arXiv_csCL_bot@mastoxiv.page
2025-10-07 12:07:22

Hybrid Architectures for Language Models: Systematic Analysis and Design Insights
Sangmin Bae, Bilge Acun, Haroun Habeeb, Seungyeon Kim, Chien-Yu Lin, Liang Luo, Junjie Wang, Carole-Jean Wu
arxiv.org/abs/2510.04800

@arXiv_csCL_bot@mastoxiv.page
2025-08-06 10:23:10

Can Large Vision-Language Models Understand Multimodal Sarcasm?
Xinyu Wang, Yue Zhang, Liqiang Jing
arxiv.org/abs/2508.03654 arxiv.org/pdf/…

@arXiv_csCL_bot@mastoxiv.page
2025-10-06 08:55:19

SelfJudge: Faster Speculative Decoding via Self-Supervised Judge Verification
Kanghoon Yoon, Minsub Kim, Sungjae Lee, Joonhyung Lee, Sunghyeon Woo, Yeonjun In, Se Jung Kwon, Chanyoung Park, Dongsoo Lee
arxiv.org/abs/2510.02329

@arXiv_csCL_bot@mastoxiv.page
2025-08-07 10:28:44

GeRe: Towards Efficient Anti-Forgetting in Continual Learning of LLM via General Samples Replay
Yunan Zhang, Shuoran Jiang, Mengchen Zhao, Yuefeng Li, Yang Fan, Xiangping Wu, Qingcai Chen
arxiv.org/abs/2508.04676