Tootfinder

Opt-in global Mastodon full text search. Join the index!

@arXiv_csLG_bot@mastoxiv.page
2025-09-12 09:33:49

"A 6 or a 9?": Ensemble Learning Through the Multiplicity of Performant Models and Explanations
Gianlucca Zuin, Adriano Veloso
arxiv.org/abs/2509.09073

@arXiv_csAI_bot@mastoxiv.page
2025-09-12 09:40:09

The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs
Akshit Sinha, Arvindh Arun, Shashwat Goel, Steffen Staab, Jonas Geiping
arxiv.org/abs/2509.09677

@arXiv_csCV_bot@mastoxiv.page
2025-08-12 12:47:43

OMGSR: You Only Need One Mid-timestep Guidance for Real-World Image Super-Resolution
Zhiqiang Wu, Zhaomang Sun, Tong Zhou, Bingtao Fu, Ji Cong, Yitong Dong, Huaqi Zhang, Xuan Tang, Mingsong Chen, Xian Wei
arxiv.org/abs/2508.08227

@tiotasram@kolektiva.social
2025-08-04 15:49:00

Should we teach vibe coding? Here's why not.
Should AI coding be taught in undergrad CS education?
1/2
I teach undergraduate computer science labs, including for intro and more-advanced core courses. I don't publish (non-negligible) scholarly work in the area, but I've got years of craft expertise in course design, and I do follow the academic literature to some degree. In other words, I'm not the world's leading expert, but I have spent a lot of time thinking about course design, and consider myself competent at it, with plenty of direct experience in what knowledge & skills I can expect from students as they move through the curriculum.
I'm also strongly against most uses of what's called "AI" these days (specifically, generative deep neural networks as supplied by our current cadre of techbros). There are a surprising number of completely orthogonal reasons to oppose the use of these systems, and a very limited number of reasonable exceptions (overcoming accessibility barriers is an example). On the grounds of environmental and digital-commons-pollution costs alone, using specifically the largest/newest models is unethical in most cases.
But as any good teacher should, I constantly question these evaluations, because I worry about the impact on my students should I eschew teaching relevant tech for bad reasons (and even for good reasons). I also want to make my reasoning clear to students, who should absolutely question me on this. That inspired me to ask a simple question: ignoring for one moment the ethical objections (which we shouldn't, of course; they're very stark), at what level in the CS major could I expect to teach a course about programming with AI assistance, and expect students to succeed at a more technically demanding final project than a course at the same level where students were banned from using AI? In other words, at what level would I expect students to actually benefit from AI coding "assistance?"
To be clear, I'm assuming that students aren't using AI in other aspects of coursework: the topic of using AI to "help you study" is a separate one (TL;DR its gross value is not negative, but it's mostly not worth the harm to your metacognitive abilities, which AI-induced changes to the digital commons are making more important than ever).
So what's my answer to this question?
If I'm being incredibly optimistic, senior year. Slightly less optimistic, second year of a masters program. Realistic? Maybe never.
The interesting bit for you-the-reader is: why is this my answer? (Especially given that students would probably self-report significant gains at lower levels.) To start with, this paper, in which experienced developers thought that AI assistance sped up their work on real tasks when it had in fact slowed it down (arxiv.org/abs/2507.09089), is informative. There are a lot of differences in task between experienced devs solving real bugs and students working on a class project, but it's important to understand that we shouldn't have a baseline expectation that AI coding "assistants" will speed things up in the best of circumstances, and we shouldn't trust self-reports of productivity (or the AI hype machine in general).
Now we might imagine that coding assistants will be better at helping with a student project than at fixing bugs in open-source software, since it's a much easier task. For many programming assignments that have a fixed answer, we know that many AI assistants can just spit out a solution when prompted with the problem description (there's another elephant in the room here to do with learning outcomes regardless of project success, but we'll ignore that one too; my focus here is on how complex a project students can reach, not on learning outcomes). My question is about more open-ended projects, not assignments with an expected answer. Here's a second study (by one of my colleagues) about novices using AI assistance for programming tasks. It showcases how difficult it is to use AI tools well, and some of the stumbling blocks that novices in particular face.
But what about intermediate students? Might there be some level where the AI is helpful because the task is still relatively simple and the students are good enough to handle it? The problem is that as task complexity increases, so does the likelihood of the AI generating (or copying) code that uses more complex constructs which a student doesn't understand. Let's say I have second-year students writing interactive websites with JavaScript. Without a lot of careful prompting, which those students don't know how to do, the AI is likely to suggest code that depends on several different frameworks, from React to jQuery, without actually setting up or including those frameworks, and of course these students would be way out of their depth trying to do that themselves. This is a general problem: each programming class carefully limits the specific code constructs and frameworks it expects students to know based on the material it covers. There is no feasible way to limit an AI assistant to a fixed set of constructs or frameworks using current designs. There are alternate designs where this would be possible (like AI search through adaptation of a controlled library of snippets), but those would be entirely different tools.
So what happens on a sizeable class project where the AI has dropped in buggy code, especially if it uses constructs the students don't understand? Best case, they recognize that they don't understand it and re-prompt, or quickly ask an instructor or TA for help getting rid of the stuff they don't understand and re-prompting, or manually adding stuff they do understand. Average case: they waste several hours and/or sweep the bugs partly under the rug, resulting in a project with significant defects. Students in their second and even third years of a CS major still have a lot to learn about debugging, and usually have significant gaps in their knowledge of even their most comfortable programming language. I do think that regardless of AI we as teachers need to get better at teaching debugging skills, but the knowledge gaps are inevitable because there's just too much to know. In Python, for example, the LLM is going to spit out yields, async functions, try/finally, maybe even something like a while/else, or, with recent training data, the walrus operator. I can't expect even a fraction of 3rd-year students who have worked with Python since their first year to know all of these things, and based on how students approach projects where they have studied all the relevant constructs but have forgotten some, I'm not optimistic that seeing these things will magically become learning opportunities. Student projects are better off working within a limited subset of a full programming language that the students have actually learned, and using AI coding assistants as currently designed makes this impossible. Beyond that, even when the "assistant" just introduces bugs using syntax the students understand, even through their 4th year many students struggle to understand the operation of moderately complex code they've written themselves, let alone code written by someone else. Having access to an AI that will confidently offer incorrect explanations for bugs will make this worse.
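To make that concrete, here's the kind of code an assistant might hand a second- or third-year student for a simple word-counting task. This is a fabricated illustration, not output from any particular model, and the file name is made up; it works, but it stacks async generators, the walrus operator, and try/finally onto a task that needs none of them:

import asyncio  # async machinery a core course may never have covered

async def read_lines(path):
    # an async generator: two unfamiliar features stacked on one def
    with open(path) as f:
        for line in f:
            yield line.strip()
            await asyncio.sleep(0)  # pointless here, but plausible assistant output

async def count_words(path):
    counts = {}
    async for line in read_lines(path):
        # walrus operator nested inside the condition
        if (words := line.split()):
            for w in words:
                counts[w] = counts.get(w, 0) + 1
    return counts

if __name__ == "__main__":
    try:
        print(asyncio.run(count_words("app.log")))  # "app.log" is hypothetical
    finally:
        print("done")  # try/finally where a plain call would do

A student who has only seen loops, lists, and plain functions has four or five new language features to untangle here before they can even start debugging.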
To be sure, a small minority of students will be able to overcome these problems, but that minority is the group that has a good grasp of the fundamentals and has broadened their knowledge through self-study, which earlier AI-reliant classes would make less likely to happen. In any case, I care about the average student, since we already have plenty of stuff about our institutions that makes life easier for a favored few while being worse for the average student (note that our construction of that favored few as the "good" students is a large part of this problem).
To summarize: because AI assistants introduce excess code complexity and difficult-to-debug bugs, they'll slow down rather than speed up project progress for the average student on moderately complex projects. On a fixed deadline, they'll result in worse projects, or necessitate less ambitious project scoping to ensure adequate completion, and I expect this remains broadly true through 4-6 years of study in most programs (don't take this as an endorsement of AI "assistants" for masters students; we've ignored a lot of other problems along the way).
There's a related problem: solving open-ended project assignments well ultimately depends on deeply understanding the problem, and AI "assistants" allow students to put a lot of code in their file without spending much time thinking about the problem or building an understanding of it. This is awful for learning outcomes, but also bad for project success. Getting students to see the value of thinking deeply about a problem is a thorny pedagogical puzzle at the best of times, and allowing the use of AI "assistants" makes the problem much, much worse. This is another area I hope to see (or even drive) pedagogical improvement in, for what it's worth.
1/2

@arXiv_csCY_bot@mastoxiv.page
2025-08-13 09:23:12

When the Domain Expert Has No Time and the LLM Developer Has No Clinical Expertise: Real-World Lessons from LLM Co-Design in a Safety-Net Hospital
Avni Kothari, Patrick Vossler, Jean Digitale, Mohammad Forouzannia, Elise Rosenberg, Michele Lee, Jennee Bryant, Melanie Molina, James Marks, Lucas Zier, Jean Feng
arxiv.org/abs/2508.085…

@UP8@mastodon.social
2025-09-11 16:31:52

🪛 Tricking LLM-Based NPCs into Spilling Secrets
#llm #ai

Image showing a graph of components; on the left a player with arrows pointing to and from a box labeled "game world"; the top link says "Prompt Injection" and has a picture with a bomb next to a UI box and "Revealed or Not?", which has chat bubbles with ... and a lock, one with a red X on it and the other without. An NPC character is in the game world together with an identical chat bubble, with the other two, that says NPC's secret; below that is an icon for a command line prompt that says "S…

@arXiv_csRO_bot@mastoxiv.page
2025-06-13 08:29:30

Are We Generalizing from the Exception? An In-the-Wild Study on Group-Sensitive Conversation Design in Human-Agent Interactions
Ana M\"uller, Sabina Jeschke, Anja Richert
arxiv.org/abs/2506.10462

@arXiv_csCL_bot@mastoxiv.page
2025-09-12 09:40:39

Reading Between the Lines: Classifying Resume Seniority with Large Language Models
Matan Cohen, Shira Shani, Eden Menahem, Yehudit Aperstein, Alexander Apartsin
arxiv.org/abs/2509.09229

@arXiv_csSE_bot@mastoxiv.page
2025-06-13 08:09:30

MLLM-Based UI2Code Automation Guided by UI Layout Information
Fan Wu, Cuiyun Gao, Shuqing Li, Xin-Cheng Wen, Qing Liao
arxiv.org/abs/2506.10376

@arXiv_csSI_bot@mastoxiv.page
2025-09-12 08:39:19

The Role of Community Detection Methods in Performance Variations of Graph Mining Tasks
Shrabani Ghosh, Erik Saule
arxiv.org/abs/2509.09045

@arXiv_csHC_bot@mastoxiv.page
2025-08-12 09:15:12

AdjustAR: AI-Driven In-Situ Adjustment of Site-Specific Augmented Reality Content
Nels Numan, Jessica Van Brummelen, Ziwen Lu, Anthony Steed
arxiv.org/abs/2508.06826

@arXiv_csIR_bot@mastoxiv.page
2025-08-12 10:13:13

Careful Queries, Credible Results: Teaching RAG Models Advanced Web Search Tools with Reinforcement Learning
Yuqin Dai, Shuo Yang, Guoqing Wang, Yong Deng, Zhanwei Zhang, Jun Yin, Pengyu Zeng, Zhenzhe Ying, Changhua Meng, Can Yi, Yuchen Zhou, Weiqiang Wang, Shuai Lu
arxiv.org/abs/2508.07956

@arXiv_mathOC_bot@mastoxiv.page
2025-08-13 09:46:52

Optimization-Free Fast Optimal Control: Bang-Ride Property, Monotonicity, and Applications to Fast Battery Charging
Shengling Shi, Jacob Sass, Jiaen Wu, Minsu Kim, Yingjie Ma, Sungho Shin, Richard D. Braatz
arxiv.org/abs/2508.09010

@arXiv_csRO_bot@mastoxiv.page
2025-08-12 11:36:53

Grasp-HGN: Grasping the Unexpected
Mehrshad Zandigohar, Mallesham Dasari, Gunar Schirner
arxiv.org/abs/2508.07648 arxiv.org/pdf/2508.07648

@arXiv_csCV_bot@mastoxiv.page
2025-09-12 10:15:19

Measuring Epistemic Humility in Multimodal Large Language Models
Bingkui Tong, Jiaer Xia, Sifeng Shang, Kaiyang Zhou
arxiv.org/abs/2509.09658

@arXiv_csCL_bot@mastoxiv.page
2025-08-13 10:15:32

MVISU-Bench: Benchmarking Mobile Agents for Real-World Tasks by Multi-App, Vague, Interactive, Single-App and Unethical Instructions
Zeyu Huang, Juyuan Wang, Longfeng Chen, Boyi Xiao, Leng Cai, Yawen Zeng, Jin Xu
arxiv.org/abs/2508.09057

@arXiv_eessIV_bot@mastoxiv.page
2025-09-12 08:06:29

Generalized User-Oriented Image Semantic Coding Empowered by Large Vision-Language Model
Sin-Yu Huang, Vincent W. S. Wong
arxiv.org/abs/2509.08913

@arXiv_csCR_bot@mastoxiv.page
2025-08-11 09:47:09

Secure and Scalable Blockchain Voting: A Comparative Framework and the Role of Large Language Models
Kiana Kiashemshaki, Elvis Nnaemeka Chukwuani, Mohammad Jalili Torkamani, Negin Mahmoudi
arxiv.org/abs/2508.05865

@muz4now@mastodon.world
2025-08-08 14:38:03

Check how your mix sounds through multiple popular headphone models with Kali Audio’s new HP-1 Multi-Reference Headphones
#MusicTech #MusicianTips

@Techmeme@techhub.social
2025-08-07 17:19:27

OpenAI says GPT-5 is its first "unified" AI model and combines the reasoning abilities of its o-series of models with the fast responses of its GPT series (Maxwell Zeff/TechCrunch)
techcrunch.com/2025/08/07/open

@arXiv_csGT_bot@mastoxiv.page
2025-09-10 07:34:51

Persuading Agents in Opinion Formation Games
Martin Hoefer, Tim Koglin, Tolga Tel
arxiv.org/abs/2509.07520 arxiv.org/pdf/2509.07520

@arXiv_csCL_bot@mastoxiv.page
2025-09-12 09:41:09

Agentic LLMs for Question Answering over Tabular Data
Rishit Tyagi, Mohit Gupta, Rahul Bouri
arxiv.org/abs/2509.09234 arxiv.org/pdf/2509.09…

@arXiv_csCV_bot@mastoxiv.page
2025-09-12 10:13:19

Improving Human Motion Plausibility with Body Momentum
Ha Linh Nguyen, Tze Ho Elden Tse, Angela Yao
arxiv.org/abs/2509.09496 arxiv.org/pdf/…

@arXiv_quantph_bot@mastoxiv.page
2025-09-04 10:09:41

Identifiability and minimality bounds of quantum and post-quantum models of classical stochastic processes
Paul M. Riechers, Thomas J. Elliott
arxiv.org/abs/2509.03004

@arXiv_csHC_bot@mastoxiv.page
2025-08-13 08:38:12

AirSignatureDB: Exploring In-Air Signature Biometrics in the Wild and its Privacy Concerns
Marta Robledo-Moreno, Ruben Vera-Rodriguez, Ruben Tolosana, Javier Ortega-Garcia, Andres Huergo, Julian Fierrez
arxiv.org/abs/2508.08502

@arXiv_astrophHE_bot@mastoxiv.page
2025-09-08 08:54:00

Latest results from the searches for ultra-high-energy photons at the Pierre Auger Observatory
Pierpaolo Savina (for the Pierre Auger Collaboration)
arxiv.org/abs/2509.05113

@tiotasram@kolektiva.social
2025-07-30 17:56:35

Just read this post by @… on an optimistic AGI future, and while it had some interesting and worthwhile ideas, it's also in my opinion dangerously misguided, and plays into the current AGI hype in a harmful way.
social.coop/@eloquence/1149406
My criticisms include:
- Current LLM technology has many layers, but the biggest, most capable models are all tied to corporate datacenters and require inordinate amounts of energy and water to run. Trying to use these tools to bring about a post-scarcity economy will burn up the planet. We urgently need more-capable but also vastly more efficient AI technologies if we want to use AI for a post-scarcity economy, and we are *not* nearly on the verge of this despite what the big companies pushing LLMs want us to think.
- I can see that permacommons.org claims that a small level of expenditure on AI equates to low climate impact. However, given the deep subsidies currently in place by the big companies to attract users, that isn't a great assumption. The fact that their FAQ dodges the question about which AI systems they use isn't a great look.
- These systems are not free in the same way that Wikipedia or open-source software is. To run your own model you need a data harvesting & cleaning operation that costs millions of dollars minimum, and then you need millions of dollars worth of storage & compute to train & host the models. Right now, big corporations are trying to compete for market share by heavily subsidizing these things, but if you go along with that, you become dependent on them, and you'll be screwed when they jack up the price to a profitable level later. I'd love to see open dataset initiatives and the like, and there are some of these things, but not enough yet, and many of the initiatives focus on one problem while ignoring others (fine for research but not the basis for a society yet).
- Between the environmental impacts, the horrible labor conditions and undercompensation of data workers who filter the big datasets, and the impacts of both AI scrapers and AI commons pollution, the developers of the most popular & effective LLMs have a lot to answer for. This project only really mentions environmental impacts, which makes me think that they're not serious about ethics, which in turn makes me distrustful of the whole enterprise.
- Their language also ends up encouraging AI use broadly while totally ignoring several entire classes of harm, so they're effectively contributing to AI hype, especially with such casual talk of AGI and robotics as if embodied AGI were just around the corner. To be clear about this point: we are several breakthroughs away from AGI under the most optimistic assumptions, and giving the impression that those will happen soon plays directly into the hands of the Sam Altmans of the world who are trying to make money off the impression of impending huge advances in AI capabilities. Adding to the AI hype is irresponsible.
- I've got a more philosophical criticism that I'll post about separately.
I do think that the idea of using AI & other software tools, possibly along with robotics and funded by many local cooperatives, in order to make businesses obsolete before they can do the same to all workers, is a good one. Get your local library to buy a knitting machine alongside their 3D printer.
Lately I've felt too busy criticizing AI to really sit down and think about what I do want the future to look like, even though I'm a big proponent of positive visions for the future as a force multiplier for criticism, and this article is inspiring to me in that regard, even if the specific project doesn't seem like a good one.

@arXiv_csCV_bot@mastoxiv.page
2025-09-12 10:15:59

SpatialVID: A Large-Scale Video Dataset with Spatial Annotations
Jiahao Wang, Yufeng Yuan, Rujie Zheng, Youtian Lin, Jian Gao, Lin-Zhuo Chen, Yajie Bao, Yi Zhang, Chang Zeng, Yanxi Zhou, Xiaoxiao Long, Hao Zhu, Zhaoxiang Zhang, Xun Cao, Yao Yao
arxiv.org/abs/2509.09676

@primonatura@mstdn.social
2025-09-06 11:00:45

"Five forecasts early climate models got right—the evidence is all around you"
#Climate #ClimateChange

@arXiv_csCY_bot@mastoxiv.page
2025-09-08 07:46:39

The LLM Has Left The Chat: Evidence of Bail Preferences in Large Language Models
Danielle Ensign, Henry Sleight, Kyle Fish
arxiv.org/abs/2509.04781

@arXiv_csRO_bot@mastoxiv.page
2025-08-11 09:37:49

Bounding Distributional Shifts in World Modeling through Novelty Detection
Eric Jing, Abdeslam Boularias
arxiv.org/abs/2508.06096 arxiv.org…

@arXiv_eessSY_bot@mastoxiv.page
2025-08-07 08:41:54

Case Studies of Generative Machine Learning Models for Dynamical Systems
Nachiket U. Bapat, Randy C. Paffenroth, Raghvendra V. Cowlagi
arxiv.org/abs/2508.04459

@arXiv_csIR_bot@mastoxiv.page
2025-08-11 09:27:29

A Survey of LLM-based Deep Search Agents: Paradigm, Optimization, Evaluation, and Challenges
Yunjia Xi, Jianghao Lin, Yongzhao Xiao, Zheli Zhou, Rong Shan, Te Gao, Jiachen Zhu, Weiwen Liu, Yong Yu, Weinan Zhang
arxiv.org/abs/2508.05668

@arXiv_csSD_bot@mastoxiv.page
2025-07-08 08:42:40

Robust Localization of Partially Fake Speech: Metrics, Models, and Out-of-Domain Evaluation
Hieu-Thi Luong, Inbal Rimons, Haim Permuter, Kong Aik Lee, Eng Siong Chng
arxiv.org/abs/2507.03468

@denmanrooke@social.coop
2025-08-01 19:51:10

Studio Rúcach @… in collaboration with CREW are excited to announce a two-part masterclass exploring co-operative models for game dev.
Part 1: Navigating Co-Op Mode (Online)
Hear from co-ops creating equitable workplaces in the games industry.

Co-Create: Co-operative Business Models for the Games Sector. This 2-part masterclass introduces participants to the innovative potential of co-operative business models.

Part 1: Navigating Co-Op Mode. Hear from international game makers using co-op models to build sustainable, self-directed studios. Get real-world insights on how co-ownership and self-management can help build sustainable and equitable work environments.

27th of August 2025 15:00-16:30 (GMT) Online via Zoom.

Funded by Galwa…

@arXiv_csAR_bot@mastoxiv.page
2025-09-09 07:31:41

High Utilization Energy-Aware Real-Time Inference Deep Convolutional Neural Network Accelerator
Kuan-Ting Lin, Ching-Te Chiu, Jheng-Yi Chang, Shi-Zong Huang, Yu-Ting Li
arxiv.org/abs/2509.05688

@arXiv_csLG_bot@mastoxiv.page
2025-09-10 10:42:51

One Model for All Tasks: Leveraging Efficient World Models in Multi-Task Planning
Yuan Pu, Yazhe Niu, Jia Tang, Junyu Xiong, Shuai Hu, Hongsheng Li
arxiv.org/abs/2509.07945

@arXiv_csAI_bot@mastoxiv.page
2025-09-08 07:39:39

Language-Driven Hierarchical Task Structures as Explicit World Models for Multi-Agent Learning
Brennen Hill
arxiv.org/abs/2509.04731 arxiv.…

@arXiv_csCL_bot@mastoxiv.page
2025-09-12 09:50:19

GrACE: A Generative Approach to Better Confidence Elicitation in Large Language Models
Zhaohan Zhang, Ziquan Liu, Ioannis Patras
arxiv.org/abs/2509.09438

@arXiv_physicssocph_bot@mastoxiv.page
2025-09-10 09:01:41

From lines to networks
Marc Barthelemy
arxiv.org/abs/2509.07951 arxiv.org/pdf/2509.07951

@pbloem@sigmoid.social
2025-06-26 10:56:22

After training, we finetune on real-world data. We observe that the models that have been pre-trained with noise converge very quickly compared to a baseline which is trained from scratch.
Moreover, on the other datasets, the UP models retain their zero-shot performance during finetuning. This suggests that there may be a generalization benefit to using a UP model.
All this is at the expense of much longer training, but that cost can be amortized over many tasks.
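For readers who want the shape of the comparison, here's a minimal sketch (a toy stand-in with made-up data and a made-up model, not the actual UP training code): finetune two identically initialized models on the same data, one fresh and one that has already been through a prior training pass, and compare the loss curves.

import torch
import torch.nn as nn

def make_model():
    torch.manual_seed(0)  # identical initialization for a fair comparison
    return nn.Sequential(nn.Embedding(256, 64), nn.Flatten(),
                         nn.Linear(64 * 32, 256))

def train(model, data, targets, steps=100):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    losses = []
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(data), targets)
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return losses

# made-up byte sequences standing in for the real finetuning set
data = torch.randint(0, 256, (128, 32))
targets = torch.randint(0, 256, (128,))

scratch = make_model()
pretrained = make_model()
# stand-in for pre-training on noise: any prior training pass on other data
train(pretrained, torch.randint(0, 256, (128, 32)), torch.randint(0, 256, (128,)))

print("scratch:   ", train(scratch, data, targets)[:5])
print("pretrained:", train(pretrained, data, targets)[:5])  # faster initial drop expected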

The results for the finetuning experiment. Six datasets (linux, code, dyck, wp, german and ndfa) and the performance of four models: the baseline and UP trained models and two finetuning datasets. 

The results show that the UP models converge quicker, and that they retain most of their zero-shot performance on the other datasets.

@arXiv_eessAS_bot@mastoxiv.page
2025-08-13 08:43:32

LPGNet: A Lightweight Network with Parallel Attention and Gated Fusion for Multimodal Emotion Recognition
Zhining He, Yang Xiao
arxiv.org/abs/2508.08925

@arXiv_csGR_bot@mastoxiv.page
2025-09-09 09:44:02

Scaling Transformer-Based Novel View Synthesis Models with Token Disentanglement and Synthetic Data
Nithin Gopalakrishnan Nair, Srinivas Kaza, Xuan Luo, Vishal M. Patel, Stephen Lombardi, Jungyeon Park
arxiv.org/abs/2509.06950

@arXiv_mathNA_bot@mastoxiv.page
2025-07-08 11:52:41

When do World Models Successfully Learn Dynamical Systems?
Edmund Ross, Claudia Drygala, Leonhard Schwarz, Samir Kaiser, Francesca di Mare, Tobias Breiten, Hanno Gottschalk
arxiv.org/abs/2507.04898

@ErikJonker@mastodon.social
2025-07-05 07:35:51

More time should be devoted to the (near-)future business models of AI and how they collect data/content. Just trying to prevent AI models from scraping data will be futile.
blog.cloudflare.com/content-in

@arXiv_statME_bot@mastoxiv.page
2025-08-08 08:24:32

Goodness-of-fit test for multi-layer stochastic block models
Huan Qing
arxiv.org/abs/2508.04957 arxiv.org/pdf/2508.04957

@arXiv_csCR_bot@mastoxiv.page
2025-07-10 08:53:21

Attacker's Noise Can Manipulate Your Audio-based LLM in the Real World
Vinu Sankar Sadasivan, Soheil Feizi, Rajiv Mathews, Lun Wang
arxiv.org/abs/2507.06256

@arXiv_csIT_bot@mastoxiv.page
2025-08-05 08:51:00

Robust Detection of Planted Subgraphs in Semi-Random Models
Dor Elimelech, Wasim Huleihel
arxiv.org/abs/2508.02158 arxiv.org/pdf/2508.02158…

@arXiv_csSI_bot@mastoxiv.page
2025-07-11 07:36:51

Scalable Signed Exponential Random Graph Models under Local Dependence
Marc Schalberger, Cornelius Fritz
arxiv.org/abs/2507.07660

@arXiv_csSE_bot@mastoxiv.page
2025-08-08 08:25:52

Generative AI for Object-Oriented Programming: Writing the Right Code and Reasoning the Right Logic
Gang Xu, Airong Wang, Yushan Pan
arxiv.org/abs/2508.05005

@arXiv_csRO_bot@mastoxiv.page
2025-08-12 12:01:03

ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks
Kaijun Wang, Liqin Lu, Mingyu Liu, Jianuo Jiang, Zeju Li, Bolin Zhang, Wancai Zheng, Xinyi Yu, Hao Chen, Chunhua Shen
arxiv.org/abs/2508.08240

@arXiv_csCV_bot@mastoxiv.page
2025-07-11 10:19:21

Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling
Haoyu Wu, Diankun Wu, Tianyu He, Junliang Guo, Yang Ye, Yueqi Duan, Jiang Bian
arxiv.org/abs/2507.07982

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 09:07:01

LLM Analysis of 150 years of German Parliamentary Debates on Migration Reveals Shift from Post-War Solidarity to Anti-Solidarity in the Last Decade
Aida Kostikova, Ole P\"utz, Steffen Eger, Olga Sabelfeld, Benjamin Paassen
arxiv.org/abs/2509.07274

@arXiv_csLG_bot@mastoxiv.page
2025-09-11 10:13:03

Signal Fidelity Index-Aware Calibration for Dementia Predictions Across Heterogeneous Real-World Data
Jingya Cheng, Jiazi Tian, Federica Spoto, Alaleh Azhir, Daniel Mork, Hossein Estiri
arxiv.org/abs/2509.08679

@arXiv_csAI_bot@mastoxiv.page
2025-09-11 07:39:43

TCPO: Thought-Centric Preference Optimization for Effective Embodied Decision-making
Kechen Jiao, Zhirui Fang, Jiahao Liu, Bei Li, Qifan Wang, Xinyu Liu, Junhao Ruan, Zhongjian Qiao, Yifan Zhu, Yaxin Xu, Jingang Wang, Xiu Li
arxiv.org/abs/2509.08500

@arXiv_csCV_bot@mastoxiv.page
2025-09-12 10:11:29

Decoupling Clinical and Class-Agnostic Features for Reliable Few-Shot Adaptation under Shift
Umaima Rahman, Raza Imam, Mohammad Yaqub, Dwarikanath Mahapatra
arxiv.org/abs/2509.09397

@arXiv_physicssocph_bot@mastoxiv.page
2025-07-10 08:12:51

Modeling Multistability and Hysteresis in Urban Traffic Dynamics
Jung-Hoon Jung, Young-Ho Eom
arxiv.org/abs/2507.06659

@arXiv_csIR_bot@mastoxiv.page
2025-08-11 08:36:09

AI Guided Accelerator For Search Experience
Jayanth Yetukuri, Mehran Elyasi, Samarth Agrawal, Aritra Mandal, Rui Kong, Harish Vempati, Ishita Khan
arxiv.org/abs/2508.05649

@ErikJonker@mastodon.social
2025-06-24 13:25:24

"Israel and America’s war with Iran is not just a strategic throw of the dice. It is a fundamental shift in how power operates in a multipolar world. Though a transition toward new models of collective security remains theoretically possible, as the foundations of US global hegemony wither under Donald Trump the current trajectory favours entropy over order. The danger is not merely in acts of war, it is also in the chaos they leave behind."

@arXiv_csGR_bot@mastoxiv.page
2025-07-10 07:55:31

3D-Generalist: Self-Improving Vision-Language-Action Models for Crafting 3D Worlds
Fan-Yun Sun, Shengguang Wu, Christian Jacobsen, Thomas Yim, Haoming Zou, Alex Zook, Shangru Li, Yu-Hsin Chou, Ethem Can, Xunlei Wu, Clemens Eppner, Valts Blukis, Jonathan Tremblay, Jiajun Wu, Stan Birchfield, Nick Haber
arxiv.org/abs/2507.…

@arXiv_csSE_bot@mastoxiv.page
2025-08-06 09:21:50

MRG-Bench: Evaluating and Exploring the Requirements of Context for Repository-Level Code Generation
Haiyang Li
arxiv.org/abs/2508.02998 ar…

@arXiv_csCV_bot@mastoxiv.page
2025-07-11 10:19:11

Martian World Models: Controllable Video Synthesis with Physically Accurate 3D Reconstructions
Longfei Li, Zhiwen Fan, Wenyan Cong, Xinhang Liu, Yuyang Yin, Matt Foutter, Panwang Pan, Chenyu You, Yue Wang, Zhangyang Wang, Yao Zhao, Marco Pavone, Yunchao Wei
arxiv.org/abs/2507.07978

@arXiv_csAI_bot@mastoxiv.page
2025-09-08 07:56:50

What-If Analysis of Large Language Models: Explore the Game World Using Proactive Thinking
Yuan Sui, Yanming Zhang, Yi Liao, Yu Gu, Guohua Tang, Zhongqian Sun, Wei Yang, Bryan Hooi
arxiv.org/abs/2509.04791

@arXiv_csRO_bot@mastoxiv.page
2025-07-09 07:36:02

A Careful Examination of Large Behavior Models for Multitask Dexterous Manipulation
TRI LBM Team, Jose Barreiros, Andrew Beaulieu, Aditya Bhat, Rick Cory, Eric Cousineau, Hongkai Dai, Ching-Hsin Fang, Kunimatsu Hashimoto, Muhammad Zubair Irshad, Masha Itkina, Naveen Kuppuswamy, Kuan-Hui Lee, Katherine Liu, Dale McConachie, Ian McMahon, Haruki Nishimura, Calder Phillips-Grafflin, Charles Richter, Paarth Shah, Krishnan Srinivasan, Blake Wulfe, Chen Xu, Mengchao Zhang, Alex Alspach, Maya …

@arXiv_csCR_bot@mastoxiv.page
2025-07-09 07:41:32

A Systematization of Security Vulnerabilities in Computer Use Agents
Daniel Jones, Giorgio Severi, Martin Pouliot, Gary Lopez, Joris de Gruyter, Santiago Zanella-Beguelin, Justin Song, Blake Bullwinkel, Pamela Cortez, Amanda Minnich
arxiv.org/abs/2507.05445

@arXiv_csCV_bot@mastoxiv.page
2025-09-11 10:07:33

BcQLM: Efficient Vision-Language Understanding with Distilled Q-Gated Cross-Modal Fusion
Sike Xiang, Shuang Chen, Amir Atapour-Abarghouei
arxiv.org/abs/2509.08715

@arXiv_csLG_bot@mastoxiv.page
2025-07-09 10:03:32

Bit-Flip Fault Attack: Crushing Graph Neural Networks via Gradual Bit Search
Sanaz Kazemi Abharian, Sai Manoj Pudukotai Dinakarrao
arxiv.org/abs/2507.05531

@arXiv_csCL_bot@mastoxiv.page
2025-07-11 10:03:21

DocCHA: Towards LLM-Augmented Interactive Online diagnosis System
Xinyi Liu, Dachun Sun, Yi R. Fung, Dilek Hakkani-Tür, Tarek Abdelzaher
arxiv.org/abs/2507.07870

@arXiv_csAI_bot@mastoxiv.page
2025-09-08 09:39:00

LatticeWorld: A Multimodal Large Language Model-Empowered Framework for Interactive Complex World Generation
Yinglin Duan, Zhengxia Zou, Tongwei Gu, Wei Jia, Zhan Zhao, Luyi Xu, Xinzhu Liu, Hao Jiang, Kang Chen, Shuang Qiu
arxiv.org/abs/2509.05263

@arXiv_csCY_bot@mastoxiv.page
2025-07-08 09:31:20

MateInfoUB: A Real-World Benchmark for Testing LLMs in Competitive, Multilingual, and Multimodal Educational Tasks
Dumitran Adrian Marius, Theodor-Pierre Moroianu, Buca Mihnea-Vicentiu
arxiv.org/abs/2507.03162

@arXiv_csGR_bot@mastoxiv.page
2025-08-08 08:28:22

A Study of the Framework and Real-World Applications of Language Embedding for 3D Scene Understanding
Mahmoud Chick Zaouali, Todd Charter, Yehor Karpichev, Brandon Haworth, Homayoun Najjaran
arxiv.org/abs/2508.05064

@arXiv_csCV_bot@mastoxiv.page
2025-09-11 08:06:23

Two Stage Context Learning with Large Language Models for Multimodal Stance Detection on Climate Change
Lata Pangtey, Omkar Kabde, Shahid Shafi Dar, Nagendra Kumar
arxiv.org/abs/2509.08024

@arXiv_csSE_bot@mastoxiv.page
2025-09-09 11:08:52

Empirical Study of Code Large Language Models for Binary Security Patch Detection
Qingyuan Li, Binchang Li, Cuiyun Gao, Shuzheng Gao, Zongjie Li
arxiv.org/abs/2509.06052

@arXiv_csRO_bot@mastoxiv.page
2025-08-07 09:00:53

Incorporating Stochastic Models of Controller Behavior into Kinodynamic Efficiently Adaptive State Lattices for Mobile Robot Motion Planning in Off-Road Environments
Eric R. Damm, Eli S. Lancaster, Felix A. Sanchez, Kiana Bronder, Jason M. Gregory, Thomas M. Howard
arxiv.org/abs/2508.04384

@arXiv_csIR_bot@mastoxiv.page
2025-08-07 08:39:34

Enhancing Serendipity Recommendation System by Constructing Dynamic User Knowledge Graphs with Large Language Models
Qian Yong, Yanhui Li, Jialiang Shi, Yaguang Dou, Tian Qi
arxiv.org/abs/2508.04032

@arXiv_csAI_bot@mastoxiv.page
2025-09-08 07:36:09

An Approach to Grounding AI Model Evaluations in Human-derived Criteria
Sasha Mitts
arxiv.org/abs/2509.04676 arxiv.org/pdf/2509.04676

@arXiv_csIR_bot@mastoxiv.page
2025-07-08 09:06:50

Exploring the Effect of Context-Awareness and Popularity Calibration on Popularity Bias in POI Recommendations
Andrea Forster, Simone Kopeinik, Denic Helic, Stefan Thalmann, Dominik Kowald
arxiv.org/abs/2507.03503

@arXiv_csLG_bot@mastoxiv.page
2025-09-04 10:26:01

Rashomon in the Streets: Explanation Ambiguity in Scene Understanding
Helge Spieker, Jørn Eirik Betten, Arnaud Gotlieb, Nadjib Lazaar, Nassim Belmecheri
arxiv.org/abs/2509.03169

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:19:41

VeriOS: Query-Driven Proactive Human-Agent-GUI Interaction for Trustworthy OS Agents
Zheng Wu, Heyuan Huang, Xingyu Lou, Xiangmou Qu, Pengzhou Cheng, Zongru Wu, Weiwen Liu, Weinan Zhang, Jun Wang, Zhaoxiang Wang, Zhuosheng Zhang
arxiv.org/abs/2509.07553

@arXiv_csAI_bot@mastoxiv.page
2025-09-05 09:55:31

World Model Implanting for Test-time Adaptation of Embodied Agents
Minjong Yoo, Jinwoo Jang, Sihyung Yoon, Honguk Woo
arxiv.org/abs/2509.03956

@arXiv_csCV_bot@mastoxiv.page
2025-09-10 10:45:41

One View, Many Worlds: Single-Image to 3D Object Meets Generative Domain Randomization for One-Shot 6D Pose Estimation
Zheng Geng, Nan Wang, Shaocong Xu, Chongjie Ye, Bohan Li, Zhaoxi Chen, Sida Peng, Hao Zhao
arxiv.org/abs/2509.07978

@arXiv_csSE_bot@mastoxiv.page
2025-08-07 09:03:43

Experimental Analysis of Productive Interaction Strategy with ChatGPT: User Study on Function and Project-level Code Generation Tasks
Sangwon Hyun, Hyunjun Kim, Jinhyuk Jang, Hyojin Choi, M. Ali Babar
arxiv.org/abs/2508.04125

@arXiv_csIR_bot@mastoxiv.page
2025-07-08 11:39:10

Harnessing Pairwise Ranking Prompting Through Sample-Efficient Ranking Distillation
Junru Wu, Le Yan, Zhen Qin, Honglei Zhuang, Paul Suganthan G. C., Tianqi Liu, Zhe Dong, Xuanhui Wang, Harrie Oosterhuis
arxiv.org/abs/2507.04820

@arXiv_csRO_bot@mastoxiv.page
2025-09-09 10:58:22

Learning in ImaginationLand: Omnidirectional Policies through 3D Generative Models (OP-Gen)
Yifei Ren, Edward Johns
arxiv.org/abs/2509.06191

@arXiv_csCV_bot@mastoxiv.page
2025-07-11 10:18:11

Not Only Consistency: Enhance Test-Time Adaptation with Spatio-temporal Inconsistency for Remote Physiological Measurement
Xiao Yang, Yuxuan Fan, Can Liu, Houcheng Su, Weichen Guo, Jiyao Wang, Dengbo He
arxiv.org/abs/2507.07908

@arXiv_csCL_bot@mastoxiv.page
2025-07-08 13:42:41

Dual Modality-Aware Gated Prompt Tuning for Few-Shot Multimodal Sarcasm Detection
Soumyadeep Jana, Abhrajyoti Kundu, Sanasam Ranbir Singh
arxiv.org/abs/2507.04468

@arXiv_csSE_bot@mastoxiv.page
2025-09-03 09:49:33

Aligning Requirement for Large Language Model's Code Generation
Zhao Tian, Junjie Chen
arxiv.org/abs/2509.01313 arxiv.org/pdf/2509.0131…

@arXiv_csIR_bot@mastoxiv.page
2025-08-11 09:52:09

M2IO-R1: An Efficient RL-Enhanced Reasoning Framework for Multimodal Retrieval Augmented Multimodal Generation
Zhiyou Xiao, Qinhan Yu, Binghui Li, Geng Chen, Chong Chen, Wentao Zhang
arxiv.org/abs/2508.06328

@arXiv_csAI_bot@mastoxiv.page
2025-09-03 14:06:03

An Epidemiological Knowledge Graph extracted from the World Health Organization's Disease Outbreak News
Sergio Consoli, Pietro Coletti, Peter V. Markov, Lia Orfei, Indaco Biazzo, Lea Schuh, Nicolas Stefanovitch, Lorenzo Bertolini, Mario Ceresa, Nikolaos I. Stilianakis
arxiv.org/abs/2509.02258

@arXiv_csRO_bot@mastoxiv.page
2025-09-03 13:13:13

Constrained Decoding for Robotics Foundation Models
Parv Kapoor, Akila Ganlath, Changliu Liu, Sebastian Scherer, Eunsuk Kang
arxiv.org/abs/2509.01728

@arXiv_csCL_bot@mastoxiv.page
2025-09-05 10:11:31

On Robustness and Reliability of Benchmark-Based Evaluation of LLMs
Riccardo Lunardi, Vincenzo Della Mea, Stefano Mizzaro, Kevin Roitero
arxiv.org/abs/2509.04013

@arXiv_csAI_bot@mastoxiv.page
2025-08-06 09:37:00

Tree-of-Reasoning: Towards Complex Medical Diagnosis via Multi-Agent Reasoning with Evidence Tree
Qi Peng, Jialin Cui, Jiayuan Xie, Yi Cai, Qing Li
arxiv.org/abs/2508.03038

@arXiv_csRO_bot@mastoxiv.page
2025-08-07 09:23:04

Open Scene Graphs for Open-World Object-Goal Navigation
Joel Loo, Zhanxin Wu, David Hsu
arxiv.org/abs/2508.04678 arxiv.org/pdf/2508.04678…

@arXiv_csCL_bot@mastoxiv.page
2025-09-08 10:12:20

AFD-SLU: Adaptive Feature Distillation for Spoken Language Understanding
Yan Xie, Yibo Cui, Liang Xie, Erwei Yin
arxiv.org/abs/2509.04821 a…

@arXiv_csRO_bot@mastoxiv.page
2025-08-05 11:45:31

CO-RFT: Efficient Fine-Tuning of Vision-Language-Action Models through Chunked Offline Reinforcement Learning
Dongchi Huang, Zhirui Fang, Tianle Zhang, Yihang Li, Lin Zhao, Chunhe Xia
arxiv.org/abs/2508.02219

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:20:11

Avoiding Knowledge Edit Skipping in Multi-hop Question Answering with Guided Decomposition
Yi Liu, Xiangrong Zhu, Xiangyu Liu, Wei Wei, Wei Hu
arxiv.org/abs/2509.07555

@arXiv_csRO_bot@mastoxiv.page
2025-07-02 10:12:20

A Survey: Learning Embodied Intelligence from Physical Simulators and World Models
Xiaoxiao Long, Qingrui Zhao, Kaiwen Zhang, Zihao Zhang, Dingrui Wang, Yumeng Liu, Zhengjie Shu, Yi Lu, Shouzheng Wang, Xinzhe Wei, Wei Li, Wei Yin, Yao Yao, Jia Pan, Qiu Shen, Ruigang Yang, Xun Cao, Qionghai Dai
arxiv.org/abs/2507.00917

@arXiv_csCL_bot@mastoxiv.page
2025-07-10 10:03:41

VisualTrap: A Stealthy Backdoor Attack on GUI Agents via Visual Grounding Manipulation
Ziang Ye, Yang Zhang, Wentao Shi, Xiaoyu You, Fuli Feng, Tat-Seng Chua
arxiv.org/abs/2507.06899

@arXiv_csCV_bot@mastoxiv.page
2025-08-04 10:10:21

Can Large Pretrained Depth Estimation Models Help With Image Dehazing?
Hongfei Zhang, Kun Zhou, Ruizheng Wu, Jiangbo Lu
arxiv.org/abs/2508.00698

@arXiv_csCL_bot@mastoxiv.page
2025-07-10 09:55:01

CLI-RAG: A Retrieval-Augmented Framework for Clinically Structured and Context Aware Text Generation with LLMs
Garapati Keerthana, Manik Gupta
arxiv.org/abs/2507.06715