Tootfinder

Opt-in global Mastodon full-text search.

@arXiv_csCL_bot@mastoxiv.page
2025-06-03 08:21:07

WebChoreArena: Evaluating Web Browsing Agents on Realistic Tedious Web Tasks
Atsuyuki Miyai, Zaiying Zhao, Kazuki Egashira, Atsuki Sato, Tatsumi Sunada, Shota Onohara, Hiromasa Yamanishi, Mashiro Toyooka, Kunato Nishina, Ryoma Maeda, Kiyoharu Aizawa, Toshihiko Yamasaki
arxiv.org/abs/2506.01952

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 07:48:55

LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks
Yi Yang, Jiaxuan Sun, Siqi Kou, Yihan Wang, Zhijie Deng
arxiv.org/abs/2506.00411

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:21:03

Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs
Chenjun Xu, Bingbing Wen, Bin Han, Robert Wolfe, Lucy Lu Wang, Bill Howe
arxiv.org/abs/2506.00582

@arXiv_csCV_bot@mastoxiv.page
2025-07-03 10:32:10

How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks
Rahul Ramachandran, Ali Garjani, Roman Bachmann, Andrei Atanov, Oğuzhan Fatih Kar, Amir Zamir
arxiv.org/abs/2507.01955

@jonippolito@digipres.club
2025-06-02 13:58:09

LLMs often nail short-term tasks but AI agents can spiral into insane meltdowns over time. This AI-powered vending machine tried to email the FBI to say it was a victim of cyberfraud and then declared its quantum state "collapsed":
linked…

Image: A cartoonish vending machine explodes with soda cans and lights

@arXiv_csHC_bot@mastoxiv.page
2025-07-02 10:04:20

Designing Visualization Widgets for Tangible Data Exploration: A Systematic Review
Haonan Yao, Lingyun Yu, Lijie Yao
arxiv.org/abs/2507.00775

@azonenberg@ioc.exchange
2025-08-01 04:52:10

Google calendar notifications are not cutting it... Anybody have suggestions on a better "organize all the stuff you have to / want to do" tool?
I'm not even quite sure what I want, other than "tasks that sat around for a year uncompleted should not auto-delete" and "tasks should be able to block other tasks".
I guess the "easy" option is a private github repo that is empty and only used as an issue tracker, but then I'd have to sig…

@arXiv_csIR_bot@mastoxiv.page
2025-07-02 09:36:09

WebArXiv: Evaluating Multimodal Agents on Time-Invariant arXiv Tasks
Zihao Sun, Meng Fang, Ling Chen
arxiv.org/abs/2507.00938

@Techmeme@techhub.social
2025-06-03 03:40:43

A software developer explains how LLMs, especially AI agents, help automate tedious coding tasks and addresses concerns like hallucinations, job loss, and more (Thomas Ptacek/Fly)
fly.io/blog/youre-all-nuts/

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 07:45:39

Haptic Rapidly-Exploring Random Trees: A Sampling-based Planner for Quasi-static Manipulation Tasks
Lin Yang, Huu-Thiet Nguyen, Donghan Yu, Chen Lv, Domenico Campolo
arxiv.org/abs/2506.00351

@arXiv_econEM_bot@mastoxiv.page
2025-06-03 07:25:33

Can AI Master Econometrics? Evidence from Econometrics AI Agent on Expert-Level Tasks
Qiang Chen, Tianyang Han, Jin Li, Ye Luo, Yuxiao Wu, Xiaowei Zhang, Tuo Zhou
arxiv.org/abs/2506.00856

@arXiv_csSE_bot@mastoxiv.page
2025-06-02 10:03:45

This arxiv.org/abs/2503.19217 has been replaced.
initial toot: mastoxiv.page/@arXiv_csSE_…

@arXiv_csMM_bot@mastoxiv.page
2025-06-02 09:58:19

This arxiv.org/abs/2505.06685 has been replaced.
initial toot: mastoxiv.page/@arXiv_csMM_…

@arXiv_csCL_bot@mastoxiv.page
2025-07-02 10:13:00

Transferable Modeling Strategies for Low-Resource LLM Tasks: A Prompt and Alignment-Based
Shuangquan Lyu, Yingnan Deng, Guiran Liu, Zhen Qi, Ruotong Wang
arxiv.org/abs/2507.00601

@servelan@newsie.social
2025-07-31 19:04:51

"About half of all AI-generated code contains security flaws.
Veracode tasked over 100 different large language models with completing 80 separate coding tasks, from using different coding languages to building different types of applications. The results were not exactly inspiring if security is your top priority, with just 55% of tasks completed ultimately generating “secure” code."
Read This Before You Trust Any AI-Written Code
gizmodo.com/read-this-before-y

@arXiv_econGN_bot@mastoxiv.page
2025-06-02 10:00:38

This arxiv.org/abs/2405.20912 has been replaced.
initial toot: mastoxiv.page/@arXiv_eco…

@tante@tldr.nettime.org
2025-06-26 17:35:54

I am planning on moving my shit away from Notion.
The whole "2nd Brain" thing does not work for me (it just gives me busywork instead of doing something useful) so what I am looking for
- self-hostable
- should have a web interface I can use from any machine without installation
- I need "tasks", "projects" that contain tasks and "notes" (ideally optionally connected to projects and/or tasks)
- boards for tasks
- mobile …

@arXiv_csLG_bot@mastoxiv.page
2025-07-31 09:20:21

SourceSplice: Source Selection for Machine Learning Tasks
Ambarish Singh, Romila Pradhan
arxiv.org/abs/2507.22186 arxiv.org/pdf/2507.22186

@arXiv_qbioNC_bot@mastoxiv.page
2025-07-03 08:31:50

Reduced Efficiency in the Right Fronto-Parietal Attentional Network During Distractor Suppression in Mild Cognitive Impairment
Jatupong Oboun, Piyanon Charoenpoonpanich, Anna Raksapatcharawong, Chaipat Chunharas, Itthi Chatnuntawech, Chainarong Amornbunchornvej, Sirawaj Itthipuripat
arxiv.org/abs/2507.01433

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 16:10:09

This arxiv.org/abs/2406.13945 has been replaced.
initial toot: mastoxiv.page/@arXiv_csAI_…

@Techmeme@techhub.social
2025-08-02 04:10:51

Manus unveils Wide Research, an experimental feature that lets users on its Pro plan enlist dozens of parallelized AI agents for large-scale, high-volume tasks (Carl Franzen/VentureBeat)
venturebeat.com/ai/youv…

@arXiv_qbioQM_bot@mastoxiv.page
2025-08-01 08:40:01

Lightweight Language Models are Prone to Reasoning Errors for Complex Computational Phenotyping Tasks
Sarah Pungitore, Shashank Yadav, David Maughan, Vignesh Subbian
arxiv.org/abs/2507.23146

@arXiv_csCR_bot@mastoxiv.page
2025-06-02 07:17:42

LLM Agents Should Employ Security Principles
Kaiyuan Zhang, Zian Su, Pin-Yu Chen, Elisa Bertino, Xiangyu Zhang, Ninghui Li
arxiv.org/abs/2505.24019

@arXiv_quantph_bot@mastoxiv.page
2025-06-03 08:11:33

State Similarity in Modular Superconducting Quantum Processors with Classical Communications
Bujiao Wu, Changrong Xie, Peng Mi, Zhiyi Wu, Zechen Guo, Peisheng Huang, Wenhui Huang, Xuandong Sun, Jiawei Zhang, Libo Zhang, Jiawei Qiu, Xiayu Linpeng, Ziyu Tao, Ji Chu, Ji Jiang, Song Liu, Jingjing Niu, Yuxuan Zhou, Yuxuan Du, Wenhui Ren, Youpeng Zhong, Tongliang Liu, Dapeng Yu

@arXiv_statML_bot@mastoxiv.page
2025-06-02 10:23:08

This arxiv.org/abs/2502.03503 has been replaced.
initial toot: mastoxiv.page/@arXiv_sta…

@arXiv_csSE_bot@mastoxiv.page
2025-06-03 07:34:40

CodeSense: a Real-World Benchmark and Dataset for Code Semantic Reasoning
Monoshi Kumar Roy, Simin Chen, Benjamin Steenhoek, Jinjun Peng, Gail Kaiser, Baishakhi Ray, Wei Le
arxiv.org/abs/2506.00750

@arXiv_csCY_bot@mastoxiv.page
2025-06-03 07:20:20

SafeCOMM: What about Safety Alignment in Fine-Tuned Telecom Large Language Models?
Aladin Djuhera, Swanand Ravindra Kadhe, Farhan Ahmed, Syed Zawad, Holger Boche, Walid Saad
arxiv.org/abs/2506.00062

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 10:15:30

NaturalThoughts: Selecting and Distilling Reasoning Traces for General Reasoning Tasks
Yang Li, Youssef Emad, Karthik Padthe, Jack Lanchantin, Weizhe Yuan, Thao Nguyen, Jason Weston, Shang-Wen Li, Dong Wang, Ilia Kulikov, Xian Li
arxiv.org/abs/2507.01921

@arXiv_csHC_bot@mastoxiv.page
2025-07-01 10:14:03

Deep Learning in Mild Cognitive Impairment Diagnosis using Eye Movements and Image Content in Visual Memory Tasks
Tomás Silva Santos Rocha, Anastasiia Mikhailova, Moreno I. Coco, José Santos-Victor
arxiv.org/abs/2506.23016

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:16:49

Tournament of Prompts: Evolving LLM Instructions Through Structured Debates and Elo Ratings
Anirudh Nair, Adi Banerjee, Laurent Mombaerts, Matthew Hagen, Tarik Borogovac
arxiv.org/abs/2506.00178

@Techmeme@techhub.social
2025-08-01 18:25:51

Source: GPT-5 improvements won't be comparable to the leaps in performance of earlier models, such as between GPT-3 in 2020 and GPT-4 in 2023 (The Information)
theinformation.com/articles/in

@arXiv_csRO_bot@mastoxiv.page
2025-08-01 08:28:21

Benchmarking Massively Parallelized Multi-Task Reinforcement Learning for Robotics Tasks
Vira Joshi, Zifan Xu, Bo Liu, Peter Stone, Amy Zhang
arxiv.org/abs/2507.23172

@arXiv_csCV_bot@mastoxiv.page
2025-07-03 10:30:00

evMLP: An Efficient Event-Driven MLP Architecture for Vision
Zhentan Zheng
arxiv.org/abs/2507.01927 arxiv.org/pdf/250…

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 08:21:58

Transformers as Multi-task Learners: Decoupling Features in Hidden Markov Models
Yifan Hao, Chenlu Ye, Chi Han, Tong Zhang
arxiv.org/abs/2506.01919

@arXiv_csIR_bot@mastoxiv.page
2025-06-30 09:42:30

IRanker: Towards Ranking Foundation Model
Tao Feng, Zhigang Hua, Zijie Lei, Yan Xie, Shuang Yang, Bo Long, Jiaxuan You
arxiv.org/abs/2506.21638

@arXiv_csCR_bot@mastoxiv.page
2025-06-02 10:00:25

This arxiv.org/abs/2411.16746 has been replaced.
initial toot: mastoxiv.page/@arXiv_csCR_…

@arXiv_statML_bot@mastoxiv.page
2025-07-03 08:34:00

When Less Is More: Binary Feedback Can Outperform Ordinal Comparisons in Ranking Recovery
Shirong Xu, Jingnan Zhang, Junhui Wang
arxiv.org/abs/2507.01613

@arXiv_csAI_bot@mastoxiv.page
2025-07-01 09:59:13

AURA: Agent for Understanding, Reasoning, and Automated Tool Use in Voice-Driven Tasks
Leander Melroy Maben, Gayathri Ganesh Lakshmy, Srijith Radhakrishnan, Siddhant Arora, Shinji Watanabe
arxiv.org/abs/2506.23049

@arXiv_csHC_bot@mastoxiv.page
2025-06-30 09:07:10

Building Trustworthy Cognitive Monitoring for Safety-Critical Human Tasks: A Phased Methodological Approach
Maciej Grzeszczuk, Grzegorz Pochwatko, Barbara Karpowicz, Stanisław Knapiński, Wiesław Kopeć
arxiv.org/abs/2506.22066

@arXiv_csCR_bot@mastoxiv.page
2025-06-03 16:18:49

This arxiv.org/abs/2402.02160 has been replaced.
link: scholar.google.com/scholar?q=a

@arXiv_csSE_bot@mastoxiv.page
2025-06-03 07:37:38

CODEMENV: Benchmarking Large Language Models on Code Migration
Keyuan Cheng, Xudong Shen, Yihao Yang, Tengyue Wang, Yang Cao, Muhammad Asif Ali, Hanbin Wang, Lijie Hu, Di Wang
arxiv.org/abs/2506.00894

@arXiv_statML_bot@mastoxiv.page
2025-07-02 09:12:50

An in depth look at the Procrustes-Wasserstein distance: properties and barycenters
Davide Adamo, Marco Corneli, Manon Vuillien, Emmanuelle Vila
arxiv.org/abs/2507.00894

@arXiv_csCL_bot@mastoxiv.page
2025-06-03 08:19:46

Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning
Fangyu Lei, Jinxiang Meng, Yiming Huang, Tinghong Chen, Yun Zhang, Shizhu He, Jun Zhao, Kang Liu
arxiv.org/abs/2506.01710

@Techmeme@techhub.social
2025-07-01 06:25:47

An interview with Claude AI product lead Scott White on Claude Code writing 90% of its own code, MCP, coding being accessible to non-technical workers, and more (Michael Nuñez/VentureBeat)
venturebeat.com/ai/from-chatbo

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 08:04:16

Humanoid World Models: Open World Foundation Models for Humanoid Robotics
Muhammad Qasim Ali, Aditya Sridhar, Shahbuland Matiana, Alex Wong, Mohammad Al-Sharman
arxiv.org/abs/2506.01182

@arXiv_csCR_bot@mastoxiv.page
2025-06-02 07:17:07

Spa-VLM: Stealthy Poisoning Attacks on RAG-based VLM
Lei Yu, Yechao Zhang, Ziqi Zhou, Yang Wu, Wei Wan, Minghui Li, Shengshan Hu, Pei Xiaobing, Jing Wang
arxiv.org/abs/2505.23828

@arXiv_csSE_bot@mastoxiv.page
2025-06-02 10:00:22

This arxiv.org/abs/2411.10877 has been replaced.
initial toot: mastoxiv.page/@arXiv_csSE_…

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:18:25

Evaluation of LLMs for mathematical problem solving
Ruonan Wang, Runxi Wang, Yunwen Shen, Chengfeng Wu, Qinglin Zhou, Rohitash Chandra
arxiv.org/abs/2506.00309

@arXiv_csHC_bot@mastoxiv.page
2025-08-01 09:32:31

ChatVis: Large Language Model Agent for Generating Scientific Visualizations
Tom Peterka, Tanwi Mallick, Orcun Yildiz, David Lenz, Cory Quammen, Berk Geveci
arxiv.org/abs/2507.23096

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 10:12:00

MiCoTA: Bridging the Learnability Gap with Intermediate CoT and Teacher Assistants
Dongyi Ding, Tiannan Wang, Chenghao Zhu, Meiling Tao, Yuchen Eleanor Jiang, Wangchunshu Zhou
arxiv.org/abs/2507.01887

@arXiv_csRO_bot@mastoxiv.page
2025-06-02 07:23:15

PB&J: Peanut Butter and Joints for Damped Articulation
Avery S. Williamson, Michael J. Bennington, Ravesh Sukhnandan, Mrinali Nakhre, Yuemin Mao, Victoria A. Webster-Wood
arxiv.org/abs/2505.24860

@Techmeme@techhub.social
2025-07-25 14:20:50

Sources: GPT-5 shows improved performance in coding, particularly in practical software engineering tasks, outperforming prior OpenAI models and Claude Sonnet 4 (Stephanie Palazzolo/The Information)
theinformation.com/articles/op

@arXiv_csCR_bot@mastoxiv.page
2025-06-02 09:58:12

This arxiv.org/abs/2410.02644 has been replaced.
initial toot: mastoxiv.page/@arXiv_csCR_…

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:16:36

The Automated but Risky Game: Modeling Agent-to-Agent Negotiations and Transactions in Consumer Markets
Shenzhe Zhu, Jiao Sun, Yi Nian, Tobin South, Alex Pentland, Jiaxin Pei
arxiv.org/abs/2506.00073

@arXiv_csCL_bot@mastoxiv.page
2025-07-02 10:14:30

Mixture of Reasonings: Teach Large Language Models to Reason with Adaptive Strategies
Tao Xiong, Xavier Hu, Wenyan Fan, Shengyu Zhang
arxiv.org/abs/2507.00606

@arXiv_csSE_bot@mastoxiv.page
2025-06-03 07:52:09

Exploring Prompt Patterns in AI-Assisted Code Generation: Towards Faster and More Effective Developer-AI Collaboration
Sophia DiCuffa, Amanda Zambrana, Priyanshi Yadav, Sashidhar Madiraju, Khushi Suman, Eman Abdullah AlOmar
arxiv.org/abs/2506.01604

@arXiv_csHC_bot@mastoxiv.page
2025-06-02 07:19:30

Locating Risk: Task Designers and the Challenge of Risk Disclosure in RAI Content Work
Alice Qian Zhang, Ryland Shaw, Laura Dabbish, Jina Suh, Hong Shen
arxiv.org/abs/2505.24246

@arXiv_csRO_bot@mastoxiv.page
2025-06-02 10:20:26

This arxiv.org/abs/2503.22030 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 08:08:56

Understanding Overadaptation in Supervised Fine-Tuning: The Role of Ensemble Methods
Yifan Hao, Xingyuan Pan, Hanning Zhang, Chenlu Ye, Rui Pan, Tong Zhang
arxiv.org/abs/2506.01901

@arXiv_csCL_bot@mastoxiv.page
2025-06-03 08:19:57

Benford's Curse: Tracing Digit Bias to Numerical Hallucination in LLMs
Jiandong Shao, Yao Lu, Jianfei Yang
arxiv.org/abs/2506.01734

@arXiv_csCR_bot@mastoxiv.page
2025-06-02 07:17:20

Seven Security Challenges That Must be Solved in Cross-domain Multi-agent LLM Systems
Ronny Ko, Jiseong Jeong, Shuyuan Zheng, Chuan Xiao, Taewan Kim, Makoto Onizuka, Wonyong Shin
arxiv.org/abs/2505.23847

@arXiv_csHC_bot@mastoxiv.page
2025-07-03 09:15:40

Human-Machine Collaboration-Guided Space Design: Combination of Machine Learning Models and Humanistic Design Concepts
Yuxuan Yang
arxiv.org/abs/2507.01776

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 08:05:57

LAMARL: LLM-Aided Multi-Agent Reinforcement Learning for Cooperative Policy Generation
Guobin Zhu, Rui Zhou, Wenkang Ji, Shiyu Zhao
arxiv.org/abs/2506.01538

@arXiv_csCL_bot@mastoxiv.page
2025-06-03 08:20:10

iQUEST: An Iterative Question-Guided Framework for Knowledge Base Question Answering
Shuai Wang, Yinan Yu
arxiv.org/abs/2506.01784

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 08:08:31

Fodor and Pylyshyn's Legacy - Still No Human-like Systematic Compositionality in Neural Networks
Tim Woydt, Moritz Willig, Antonia Wüst, Lukas Helff, Wolfgang Stammer, Constantin A. Rothkopf, Kristian Kersting
arxiv.org/abs/2506.01820

@arXiv_csCR_bot@mastoxiv.page
2025-06-02 10:03:39

This arxiv.org/abs/2502.11844 has been replaced.
initial toot: mastoxiv.page/@arXiv_csCR_…

@arXiv_csHC_bot@mastoxiv.page
2025-06-02 07:19:05

Enhancing Critical Thinking in Generative AI Search with Metacognitive Prompts
Anjali Singh, Zhitong Guan, Soo Young Rieh
arxiv.org/abs/2505.24014

@arXiv_csRO_bot@mastoxiv.page
2025-06-02 10:00:22

This arxiv.org/abs/2306.07569 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:21:37

RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents
Jingyi Yang, Shuai Shao, Dongrui Liu, Jing Shao
arxiv.org/abs/2506.00618

@arXiv_csCL_bot@mastoxiv.page
2025-06-03 08:19:48

SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning
Zhongwei Wan, Zhihao Dou, Che Liu, Yu Zhang, Dongfei Cui, Qinjian Zhao, Hui Shen, Jing Xiong, Yi Xin, Yifan Jiang, Yangfan He, Mi Zhang, Shen Yan
arxiv.org/abs/2506.01713

@arXiv_csHC_bot@mastoxiv.page
2025-07-01 11:05:23

Email as the Interface to Generative AI Models: Seamless Administrative Automation
Andres Navarro, Carlos de Quinto, José Alberto Hernández
arxiv.org/abs/2506.23850

@arXiv_csCR_bot@mastoxiv.page
2025-07-02 09:16:40

On the Surprising Efficacy of LLMs for Penetration-Testing
Andreas Happe, Jürgen Cito
arxiv.org/abs/2507.00829

@arXiv_csRO_bot@mastoxiv.page
2025-06-02 07:21:51

Black-box Adversarial Attacks on CNN-based SLAM Algorithms
Maria Rafaela Gkeka, Bowen Sun, Evgenia Smirni, Christos D. Antonopoulos, Spyros Lalis, Nikolaos Bellas
arxiv.org/abs/2505.24654

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 16:37:06

This arxiv.org/abs/2407.15325 has been replaced.
initial toot: mastoxiv.page/@arXiv_csAI_…

@arXiv_csCL_bot@mastoxiv.page
2025-06-03 08:20:30

Not All Jokes Land: Evaluating Large Language Models Understanding of Workplace Humor
Moahmmadamin Shafiei, Hamidreza Saffari
arxiv.org/abs/2506.01819

@arXiv_csRO_bot@mastoxiv.page
2025-07-03 10:00:40

AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation
Sixiang Chen, Jiaming Liu, Siyuan Qian, Han Jiang, Lily Li, Renrui Zhang, Zhuoyang Liu, Chenyang Gu, Chengkai Hou, Pengwei Wang, Zhongyuan Wang, Shanghang Zhang
arxiv.org/abs/2507.01961

@arXiv_csCL_bot@mastoxiv.page
2025-06-03 08:21:10

DRAG: Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillation
Jennifer Chen, Aidar Myrzakhan, Yaxin Luo, Hassaan Muhammad Khan, Sondos Mahmoud Bsharat, Zhiqiang Shen
arxiv.org/abs/2506.01954

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 16:19:26

This arxiv.org/abs/2406.13948 has been replaced.
initial toot: mastoxiv.page/@arXiv_csAI_…

@arXiv_csRO_bot@mastoxiv.page
2025-07-02 10:15:50

Robotic Manipulation by Imitating Generated Videos Without Physical Demonstrations
Shivansh Patel, Shraddhaa Mohan, Hanlin Mai, Unnat Jain, Svetlana Lazebnik, Yunzhu Li
arxiv.org/abs/2507.00990

@arXiv_csCL_bot@mastoxiv.page
2025-07-02 09:56:30

Failure by Interference: Language Models Make Balanced Parentheses Errors When Faulty Mechanisms Overshadow Sound Ones
Daking Rai, Samuel Miller, Kevin Moran, Ziyu Yao
arxiv.org/abs/2507.00322

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:23:19

DrKGC: Dynamic Subgraph Retrieval-Augmented LLMs for Knowledge Graph Completion across General and Biomedical Domains
Yongkang Xiao, Sinian Zhang, Yi Dai, Huixue Zhou, Jue Hou, Jie Ding, Rui Zhang
arxiv.org/abs/2506.00708

@arXiv_csRO_bot@mastoxiv.page
2025-07-02 10:17:00

DexWrist: A Robotic Wrist for Constrained and Dynamic Manipulation
Martin Peticco, Gabriella Ulloa, John Marangola, Pulkit Agrawal
arxiv.org/abs/2507.01008

@arXiv_csCL_bot@mastoxiv.page
2025-07-02 10:00:20

Question Decomposition for Retrieval-Augmented Generation
Paul J. L. Ammann, Jonas Golde, Alan Akbik
arxiv.org/abs/2507.00355

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:17:04

What do professional software developers need to know to succeed in an age of Artificial Intelligence?
Matthew Kam, Cody Miller, Miaoxin Wang, Abey Tidwell, Irene A. Lee, Joyce Malyn-Smith, Beatriz Perez, Vikram Tiwari, Joshua Kenitzer, Andrew Macvean, Erin Barrar
arxiv.org/abs/2506.00202

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 10:02:30

Stereotype Detection as a Catalyst for Enhanced Bias Detection: A Multi-Task Learning Approach
Aditya Tomar, Rudra Murthy, Pushpak Bhattacharyya
arxiv.org/abs/2507.01715

@arXiv_csAI_bot@mastoxiv.page
2025-07-01 10:22:03

Are Large Language Models Capable of Deep Relational Reasoning? Insights from DeepSeek-R1 and Benchmark Comparisons
Chi Chiu So, Yueyue Sun, Jun-Min Wang, Siu Pang Yung, Anthony Wai Keung Loh, Chun Pong Chau
arxiv.org/abs/2506.23128

@arXiv_csRO_bot@mastoxiv.page
2025-06-02 07:23:02

DexMachina: Functional Retargeting for Bimanual Dexterous Manipulation
Zhao Mandi, Yifan Hou, Dieter Fox, Yashraj Narang, Ajay Mandlekar, Shuran Song
arxiv.org/abs/2505.24853

@arXiv_csCL_bot@mastoxiv.page
2025-06-03 08:20:21

NAVER LABS Europe Submission to the Instruction-following Track
Beomseok Lee, Marcely Zanon Boito, Laurent Besacier, Ioan Calapodescu
arxiv.org/abs/2506.01808

@arXiv_csAI_bot@mastoxiv.page
2025-07-01 11:25:33

PokéAI: A Goal-Generating, Battle-Optimizing Multi-agent System for Pokemon Red
Zihao Liu, Xinhang Sui, Yueran Song, Siwen Wang
arxiv.org/abs/2506.23689

@arXiv_csRO_bot@mastoxiv.page
2025-06-02 10:04:33

This arxiv.org/abs/2410.00425 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:18:33

Dyna-Think: Synergizing Reasoning, Acting, and World Model Simulation in AI Agents
Xiao Yu, Baolin Peng, Ruize Xu, Michel Galley, Hao Cheng, Suman Nath, Jianfeng Gao, Zhou Yu
arxiv.org/abs/2506.00320

@arXiv_csRO_bot@mastoxiv.page
2025-07-02 10:07:50

HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning
Zhi Jing, Siyuan Yang, Jicong Ao, Ting Xiao, Yugang Jiang, Chenjia Bai
arxiv.org/abs/2507.00833

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:16:56

Control-R: Towards controllable test-time scaling
Di Zhang, Weida Wang, Junxian Li, Xunzhi Wang, Jiatong Li, Jianbo Wu, Jingdi Lei, Haonan He, Peng Ye, Shufei Zhang, Wanli Ouyang, Yuqiang Li, Dongzhan Zhou
arxiv.org/abs/2506.00189

@arXiv_csRO_bot@mastoxiv.page
2025-06-02 10:24:33

This arxiv.org/abs/2505.12583 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csRO_bot@mastoxiv.page
2025-06-02 10:17:34

This arxiv.org/abs/2502.20817 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 08:04:34

HoMeR: Learning In-the-Wild Mobile Manipulation via Hybrid Imitation and Whole-Body Control
Priya Sundaresan, Rhea Malhotra, Phillip Miao, Jingyun Yang, Jimmy Wu, Hengyuan Hu, Rika Antonova, Francis Engelmann, Dorsa Sadigh, Jeannette Bohg
arxiv.org/abs/2506.01185

@arXiv_csRO_bot@mastoxiv.page
2025-06-02 10:28:24

This arxiv.org/abs/2505.21906 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csRO_bot@mastoxiv.page
2025-07-02 09:57:40

Parallel Transmission Aware Co-Design: Enhancing Manipulator Performance Through Actuation-Space Optimization
Rohit Kumar, Melya Boukheddimi, Dennis Mronga, Shivesh Kumar, Frank Kirchner
arxiv.org/abs/2507.00644

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 08:04:56

OG-VLA: 3D-Aware Vision Language Action Model via Orthographic Image Generation
Ishika Singh, Ankit Goyal, Stan Birchfield, Dieter Fox, Animesh Garg, Valts Blukis
arxiv.org/abs/2506.01196

@arXiv_csRO_bot@mastoxiv.page
2025-07-02 09:18:00

RoboEval: Where Robotic Manipulation Meets Structured and Scalable Evaluation
Yi Ru Wang, Carter Ung, Grant Tannert, Jiafei Duan, Josephine Li, Amy Le, Rishabh Oswal, Markus Grotz, Wilbert Pumacay, Yuquan Deng, Ranjay Krishna, Dieter Fox, Siddhartha Srinivasa
arxiv.org/abs/2507.00435