Can We Reliably Rank Model Performance across Domains without Labeled Data?
Veronica Rammouz, Aaron Gonzalez, Carlos Cruzportillo, Adrian Tan, Nicole Beebe, Anthony Rios
https://arxiv.org/abs/2510.09519
Evaluating the Use of Large Language Models as Synthetic Social Agents in Social Science Research
Emma Rose Madden
https://arxiv.org/abs/2509.26080
Towards Understanding the Shape of Representations in Protein Language Models
Kosio Beshkov, Anders Malthe-Sørenssen
https://arxiv.org/abs/2509.24895
ModernBERT ColBERT: Enhancing biomedical RAG through an advanced re-ranking retriever
Eduardo Martínez Rivera, Filippo Menolascina
https://arxiv.org/abs/2510.04757
Seeing Space and Motion: Enhancing Latent Actions with Spatial and Dynamic Awareness for VLA
Zhejia Cai, Yandan Yang, Xinyuan Chang, Shiyi Liang, Ronghan Chen, Feng Xiong, Mu Xu, Ruqi Huang
https://arxiv.org/abs/2509.26251
Measuring Gender Bias in Job Title Matching for Grammatical Gender Languages
Laura García-Sardiña, Hermenegildo Fabregat, Daniel Deniz, Rabih Zbib
https://arxiv.org/abs/2509.13803
F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data
Ziyin Zhang, Zihan Liao, Hang Yu, Peng Di, Rui Wang
https://arxiv.org/abs/2510.02294
TMD-TTS: A Unified Tibetan Multi-Dialect Text-to-Speech Synthesis for Ü-Tsang, Amdo and Kham Speech Dataset Generation
Yutong Liu, Ziyue Zhang, Ban Ma-bao, Renzeng Duojie, Yuqing Cai, Yongbin Yu, Xiangxiang Wang, Fan Gao, Cheng Huang, Nyima Tashi
https://arxiv.org/abs/2509.18060