Tootfinder

Opt-in global Mastodon full text search. Join the index!

No exact results; showing similar results instead.
@arXiv_csRO_bot@mastoxiv.page
2025-06-16 08:40:39

mimic-one: a Scalable Model Recipe for General Purpose Robot Dexterity
Elvis Nava, Victoriano Montesinos, Erik Bauer, Benedek Forrai, Jonas Pai, Stefan Weirich, Stephan-Daniel Gravert, Philipp Wand, Stephan Polinski, Benjamin F. Grewe, Robert K. Katzschmann
arxiv.org/abs/2506.11916

@arXiv_csCV_bot@mastoxiv.page
2025-07-10 08:05:31

Centralized Copy-Paste: Enhanced Data Augmentation Strategy for Wildland Fire Semantic Segmentation
Joon Tai Kim, Tianle Chen, Ziyu Dong, Nishanth Kunchala, Alexander Guller, Daniel Ospina Acero, Roger Williams, Mrinal Kumar
arxiv.org/abs/2507.06321

@arXiv_eessIV_bot@mastoxiv.page
2025-06-16 08:47:59

crossMoDA Challenge: Evolution of Cross-Modality Domain Adaptation Techniques for Vestibular Schwannoma and Cochlea Segmentation from 2021 to 2023
Navodini Wijethilake, Reuben Dorent, Marina Ivory, Aaron Kujawa, Stefan Cornelissen, Patrick Langenhuizen, Mohamed Okasha, Anna Oviedova, Hexin Dong, Bogyeong Kang, Guillaume Sallé, Luyi Han, Ziyuan Zhao, Han Liu, Tao Yang, Shahad Hardan, Hussain Alasmawi, Santosh Sanjeev, Yuzhou Zhuang, Satoshi Kondo, Maria Baldeon Calisto, Shaikh Muh…

@georgiamuseum@glammr.us
2025-06-11 14:35:37

Last year, we bought a collection of 17 Georgia paintings by 19th- and 20th-century artists, many of whom are lesser known. Even the ones who are better known, like #NellChoateJones, aren't exactly _well_ known. We're excited to start studying these works and learning about the Georgia scenes many of them show.

Nell Choate Jones' painting "Square at St. Mary's," a color scene that shows a Black family in a southern square, next to a big tree. They could be in front of a church. Most of them are wearing white.
Augusta Oelschig's painting of a young Black boy in a gold frame. She shows him at bust length, looking slightly to his left. He wears a pale blue shirt with an open collar.
A watercolor painting of Savannah by Hattie Saussy. Seen from a park it shows several four- or five-story brick buildings across the way, partially blocked by greenery. The image is soft and pastel, more abstract in the foreground and more precise in the background.

@arXiv_csSI_bot@mastoxiv.page
2025-07-14 08:08:12

Machine Learning for Evolutionary Graph Theory
Guoli Yang, Matteo Cavaliere, Mingtao Zhang, Giovanni Masala, Adam Miles, Mengzhu Wang
arxiv.org/abs/2507.08363

@arXiv_csIR_bot@mastoxiv.page
2025-06-10 08:09:42

Correcting for Position Bias in Learning to Rank: A Control Function Approach
Md Aminul Islam, Kathryn Vasilaky, Elena Zheleva
arxiv.org/abs/2506.06989

@arXiv_csLG_bot@mastoxiv.page
2025-07-14 08:15:52

Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions
Simon Matrenok, Skander Moalla, Caglar Gulcehre
arxiv.org/abs/2507.08068 arxiv.org/pdf/2507.08068 arxiv.org/html/2507.08068
arXiv:2507.08068v1 Announce Type: new
Abstract: Aligning large language models with pointwise absolute rewards has so far required online, on-policy algorithms such as PPO and GRPO. In contrast, simpler methods that can leverage offline or off-policy data, such as DPO and REBEL, are limited to learning from preference pairs or relative signals. To bridge this gap, we introduce Quantile Reward Policy Optimization (QRPO), which learns from pointwise absolute rewards while preserving the simplicity and offline applicability of DPO-like methods. QRPO uses quantile rewards to enable regression to the closed-form solution of the KL-regularized RL objective. This reward yields an analytically tractable partition function, removing the need for relative signals to cancel this term. Moreover, QRPO scales with increased compute to estimate quantile rewards, opening a new dimension for pre-computation scaling. Empirically, QRPO consistently achieves top performance on chat and coding evaluations -- reward model scores, AlpacaEval 2, and LeetCode -- compared to DPO, REBEL, and SimPO across diverse datasets and 8B-scale models. Finally, we find that training with robust rewards instead of converting them to preferences induces less length bias.
toXiv_bot_toot
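The key mechanism in the abstract above is that mapping raw rewards to their quantiles under the reference policy makes the partition function of the KL-regularized objective analytically tractable, enabling pointwise regression. A minimal Python sketch of that idea, under my own assumptions (the function names are hypothetical, and the closed form below assumes the quantile reward is uniform on [0, 1] under the reference policy; the paper's actual loss may differ):

```python
import math

def quantile_reward(r, ref_rewards):
    """Map a raw reward r to its quantile among rewards of completions
    sampled from the reference policy (pre-computed offline).

    Under the reference policy, this transformed reward is (approximately)
    uniform on [0, 1].
    """
    return sum(1 for x in ref_rewards if x <= r) / len(ref_rewards)

def regression_target(q, beta):
    """Pointwise regression target for log(pi_theta / pi_ref).

    The KL-regularized optimum satisfies
        log pi*(y|x) - log pi_ref(y|x) = q / beta - log Z,
    and with q ~ Uniform[0, 1] the partition function is closed-form:
        Z = E[exp(q / beta)] = beta * (exp(1 / beta) - 1).
    """
    log_z = math.log(beta * (math.exp(1.0 / beta) - 1.0))
    return q / beta - log_z
```

A squared error between the model's log-probability ratio and `regression_target(q, beta)` would then give an offline, DPO-style loss that needs no preference pair, since `log_z` no longer has to cancel between two responses.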

@arXiv_csRO_bot@mastoxiv.page
2025-07-09 10:02:12

Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model
Koki Yamane, Yunhan Li, Masashi Konosu, Koki Inami, Junji Oaki, Sho Sakaino, Toshiaki Tsuji
arxiv.org/abs/2507.06174

@arXiv_csCV_bot@mastoxiv.page
2025-07-08 14:33:51

Physics-Guided Dual Implicit Neural Representations for Source Separation
Yuan Ni, Zhantao Chen, Alexander N. Petsch, Edmund Xu, Cheng Peng, Alexander I. Kolesnikov, Sugata Chowdhury, Arun Bansil, Jana B. Thayer, Joshua J. Turner
arxiv.org/abs/2507.05249

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 08:21:44

Learning to Explore: An In-Context Learning Approach for Pure Exploration
Alessio Russo, Ryan Welch, Aldo Pacchiano
arxiv.org/abs/2506.01876