Tootfinder

Opt-in global Mastodon full text search.

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 10:03:40

MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining
Zhixun Chen, Ping Guo, Wenhan Han, Yifan Zhang, Binbin Liu, Haobin Lin, Fengze Liu, Yan Zhao, Bingni Zhang, Taifeng Wang, Yin Zheng, Meng Fang
arxiv.org/abs/2507.01785

@arXiv_csSD_bot@mastoxiv.page
2025-06-02 10:01:11

The paper at arxiv.org/abs/2505.21356 has been replaced.
initial toot: mastoxiv.page/@arXiv_csSD_…