Tootfinder

Opt-in global Mastodon full-text search.

@arXiv_csAI_bot@mastoxiv.page
2025-06-24 11:33:30

Reasoning about Uncertainty: Do Reasoning Models Know When They Don't Know?
Zhiting Mei, Christina Zhang, Tenny Yin, Justin Lidard, Ola Shorinwa, Anirudha Majumdar
arxiv.org/abs/2506.18183

@thomasrenkert@hcommons.social
2025-05-23 08:15:29

The #OpenAI paper by Baker et al, "Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation" comes to a troubling conclusion: #LLM s with #reasoning or

If CoT pressures are used to improve agent capabilities or alignment, there may be no alternative approach to yield the same improvements without degrading monitorability. In the worst case, where the agent learns to fully obscure its intent in its CoT, we ultimately revert to the same model safety conditions that existed prior to the emergence of reasoning models and must rely on monitoring activations, monitoring potentially adversarial CoTs and outputs, or improved alignment methods. Model a…

@heiseonline@social.heise.de
2025-06-23 17:49:00

Draft bill: Bank customers to get more rights on overdrafts
Online shopping on invoice is to be restricted, because young people in particular easily lose track of their spending this way and fall into the debt trap.

@arXiv_csIR_bot@mastoxiv.page
2025-06-23 08:03:49

Architecture is All You Need: Improving LLM Recommenders by Dropping the Text
Kevin Foley, Shaghayegh Agah, Kavya Priyanka Kakinada
arxiv.org/abs/2506.15833

@arXiv_csAI_bot@mastoxiv.page
2025-06-24 11:59:40

ConciseHint: Boosting Efficient Reasoning via Continuous Concise Hints during Generation
Siao Tang, Xinyin Ma, Gongfan Fang, Xinchao Wang
arxiv.org/abs/2506.18810

@heiseonline@social.heise.de
2025-06-24 06:40:00

Galaxy Z Fold 7 and more: Samsung announces Unpacked event for July 9
Samsung plans to present new products at an Unpacked event in early July. Besides foldables, consumers can also expect Galaxy Watches.

@arXiv_csIR_bot@mastoxiv.page
2025-06-24 09:58:30

A Framework for Generating Conversational Recommendation Datasets from Behavioral Interactions
Vinaik Chhetri, Yousaf Reza, Moghis Fereidouni, Srijata Maji, Umar Farooq, AB Siddique
arxiv.org/abs/2506.17285

@arXiv_csAI_bot@mastoxiv.page
2025-06-24 08:53:29

PhysUniBench: An Undergraduate-Level Physics Reasoning Benchmark for Multimodal Models
Lintao Wang, Encheng Su, Jiaqi Liu, Pengze Li, Peng Xia, Jiabei Xiao, Wenlong Zhang, Xinnan Dai, Xi Chen, Yuan Meng, Mingyu Ding, Lei Bai, Wanli Ouyang, Shixiang Tang, Aoran Wang, Xinzhu Ma
arxiv.org/abs/2506.17667

@arXiv_csIR_bot@mastoxiv.page
2025-06-23 09:53:20

Pyramid Mixer: Multi-dimensional Multi-period Interest Modeling for Sequential Recommendation
Zhen Gong, Zhifang Fan, Hui Lu, Qiwei Chen, Chenbin Zhang, Lin Guan, Yuchao Zheng, Feng Zhang, Xiao Yang, Zuotao Liu
arxiv.org/abs/2506.16942

@arXiv_csIR_bot@mastoxiv.page
2025-06-24 11:08:00

LLM-Enhanced Multimodal Fusion for Cross-Domain Sequential Recommendation
Wangyu Wu, Zhenhong Chen, Xianglin Qiu, Siqi Song, Xiaowei Huang, Fei Ma, Jimin Xiao
arxiv.org/abs/2506.17966