
2025-08-15 10:15:52
Continuous Bangla Sign Language Translation: Mitigating the Expense of Gloss Annotation with the Assistance of Graph
Safaeid Hossain Arib, Rabeya Akter, Sejuti Rahman
https://arxiv.org/abs/2508.10687
GraphRAG-Causal: A novel graph-augmented framework for causal reasoning and annotation in news
Abdul Haque, Umm e Hani, Ahmad Din, Muhammad Babar, Ali Abbas, Insaf Ullah
https://arxiv.org/abs/2506.11600
Visual Prompting for Robotic Manipulation with Annotation-Guided Pick-and-Place Using ACT
Muhammad A. Muttaqien, Tomohiro Motoda, Ryo Hanai, Yukiyasu Domae
https://arxiv.org/abs/2508.08748
Motive-level Analysis of Form-functions Association in Korean Folk song
Danbinaerin Han, Dasaem Jeong, Juhan Nam
https://arxiv.org/abs/2508.10472 https://a…
When Deepfakes Look Real: Detecting AI-Generated Faces with Unlabeled Data due to Annotation Challenges
Zhiqiang Yang, Renshuai Tao, Xiaolong Zheng, Guodong Yang, Chunjie Zhang
https://arxiv.org/abs/2508.09022
Wisdom of the Crowd, Without the Crowd: A Socratic LLM for Asynchronous Deliberation on Perspectivist Data
Malik Khadar, Daniel Runningen, Julia Tang, Stevie Chancellor, Harmanpreet Kaur
https://arxiv.org/abs/2508.09911
Exploring the Potential of Large Language Models in Fine-Grained Review Comment Classification
Linh Nguyen, Chunhua Liu, Hong Yi Lin, Patanamon Thongtanunam
https://arxiv.org/abs/2508.09832
Advancing Data Equity: Practitioner Responsibility and Accountability in NLP Data Practices
Jay L. Cunningham, Kevin Zhongyang Shao, Rock Yuren Pang, Nathaniel Mengist
https://arxiv.org/abs/2508.10071 …
LLMLog: Advanced Log Template Generation via LLM-driven Multi-Round Annotation
Fei Teng, Haoyang Li, Lei Chen
https://arxiv.org/abs/2508.09594 https://arxi…
Data-Efficient Learning for Generalizable Surgical Video Understanding
Sahar Nasirihaghighi
https://arxiv.org/abs/2508.10215 https://arxiv.org/pdf/2508.102…
DatasetAgent: A Novel Multi-Agent System for Auto-Constructing Datasets from Real-World Images
Haoran Sun, Haoyu Bian, Shaoning Zeng, Yunbo Rao, Xu Xu, Lin Mei, Jianping Gou
https://arxiv.org/abs/2507.08648
This https://arxiv.org/abs/2411.14464 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_qbi…
Can LLM-Generated Textual Explanations Enhance Model Classification Performance? An Empirical Study
Mahdi Dhaini, Juraj Vladika, Ege Erdogan, Zineb Attaoui, Gjergji Kasneci
https://arxiv.org/abs/2508.09776
Probably Approximately Correct Labels
Emmanuel J. Candès, Andrew Ilyas, Tijana Zrnic
https://arxiv.org/abs/2506.10908 https://arxiv…
AnnoDPO: Protein Functional Annotation Learning with Direct Preference Optimization
Zixuan Jiang, Renjing Xu
https://arxiv.org/abs/2506.07035 https://
This https://arxiv.org/abs/2502.07404 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csHC_…
STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation
Wenxiang Guo, Yu Zhang, Changhao Pan, Zhiyuan Zhu, Ruiqi Li, Zhetao Chen, Wenhao Xu, Fei Wu, Zhou Zhao
https://arxiv.org/abs/2507.06670
Towards Scalable Training for Handwritten Mathematical Expression Recognition
Haoyang Li, Jiaqing Li, Jialun Cao, Zongyuan Yang, Yongping Xiong
https://arxiv.org/abs/2508.09220 …
ProteoKnight: Convolution-based phage virion protein classification and uncertainty analysis
Samiha Afaf Neha, Abir Ahammed Bhuiyan, Md. Ishrak Khan
https://arxiv.org/abs/2508.07345
Jailbreaking Commercial Black-Box LLMs with Explicitly Harmful Prompts
Chiyu Zhang, Lu Zhou, Xiaogang Xu, Jiafei Wu, Liming Fang, Zhe Liu
https://arxiv.org/abs/2508.10390 https:…
This https://arxiv.org/abs/2502.18744 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
Automated Type Annotation in Python Using Large Language Models
Varun Bharti, Shashwat Jha, Dhruv Kumar, Pankaj Jalote
https://arxiv.org/abs/2508.00422 https://
EQ-TAA: Equivariant Traffic Accident Anticipation via Diffusion-Based Accident Video Synthesis
Jianwu Fang, Lei-Lei Li, Zhedong Zheng, Hongkai Yu, Jianru Xue, Zhengguo Li, Tat-Seng Chua
https://arxiv.org/abs/2506.10002
miRKatAI: An Integrated Database and Multi-agent AI system for microRNA Research
Karen Guerrero-Vazquez, Jacopo Umberto Verga, Pilib O Broin, Katarzyna Goljanek-Whysall
https://arxiv.org/abs/2508.08331
Event-Aware Sentiment Factors from LLM-Augmented Financial Tweets: A Transparent Framework for Interpretable Quant Trading
Yueyi Wang, Qiyao Wei
https://arxiv.org/abs/2508.07408
scSSL-Bench: Benchmarking Self-Supervised Learning for Single-Cell Data
Olga Ovcharenko, Florian Barkmann, Philip Toma, Imant Daunhawer, Julia Vogt, Sebastian Schelter, Valentina Boeva
https://arxiv.org/abs/2506.10031
The Cell Ontology in the age of single-cell omics
Shawn Zheng Kai Tan, Aleix Puig-Barbe, Damien Goutte-Gattat, Caroline Eastwood, Brian Aevermann, Alida Avola, James P Balhoff, Ismail Ugur Bayindir, Jasmine Belfiore, Anita Reane Caron, David S Fischer, Nancy George, Benjamin M Gyori, Melissa A Haendel, Charles Tapley Hoyt, Huseyin Kir, Tiago Lubiana, Nicolas Matentzoglu, James A Overton, Beverly Peng, Bjoern Peters, Ellen M Quardokus, Patrick L Ray, Paola Roncaglia, Andrea D Rivera, Ra…
From Coarse to Fine-Grained Emotion Annotation: An Immediate Recall Paradigm with Validation through Physiological Evidence and Recognition Performance
Hao Tang, Songyun Xie, Xinzhou Xie, Can Liao, Xin Zhang, Bohan Li, Zhongyu Tian, Dalu Zheng
https://arxiv.org/abs/2507.02350
An Enhanced Privacy-preserving Federated Few-shot Learning Framework for Respiratory Disease Diagnosis
Ming Wang, Zhaoyang Duan, Dong Xue, Fangzhou Liu, Zhongheng Zhang
https://arxiv.org/abs/2507.08050 https://arxiv.org/pdf/2507.08050 https://arxiv.org/html/2507.08050
arXiv:2507.08050v1 Announce Type: new
Abstract: The labor-intensive nature of medical data annotation presents a significant challenge for respiratory disease diagnosis, resulting in a scarcity of high-quality labeled datasets in resource-constrained settings. Moreover, patient privacy concerns complicate the direct sharing of local medical data across institutions, and existing centralized data-driven approaches, which rely on large amounts of available data, often compromise data privacy. This study proposes a federated few-shot learning framework with privacy-preserving mechanisms to address the issues of limited labeled data and privacy protection in diagnosing respiratory diseases. In particular, a meta-stochastic gradient descent algorithm is proposed to mitigate the overfitting problem that arises from insufficient data when employing traditional gradient descent methods for neural network training. Furthermore, to ensure data privacy against gradient leakage, differential privacy noise from a standard Gaussian distribution is integrated into the gradients during the training of private models with local data, thereby preventing the reconstruction of medical images. Given the impracticality of centralizing respiratory disease data dispersed across various medical institutions, a weighted average algorithm is employed to aggregate local diagnostic models from different clients, enhancing the adaptability of a model across diverse scenarios. Experimental results show that the proposed method yields compelling results with the implementation of differential privacy, while effectively diagnosing respiratory diseases using data from different structures, categories, and distributions.
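The two mechanisms named in the abstract, Gaussian noise added to local gradients and a weighted average over client models, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the learning rate, the use of local sample counts as aggregation weights, and the toy numbers are all assumptions made for illustration. The meta-SGD component is omitted, and so is the gradient clipping that a formal differential-privacy guarantee usually requires.

```python
import random


def add_gaussian_noise(gradient, sigma=1.0, rng=None):
    """Return a copy of `gradient` with independent N(0, sigma^2) noise per entry.

    sigma=1.0 mirrors the "standard Gaussian" wording of the abstract (assumption).
    """
    rng = rng or random.Random()
    return [g + rng.gauss(0.0, sigma) for g in gradient]


def local_step(weights, gradient, lr=0.1, sigma=1.0, rng=None):
    """One local gradient-descent step that uses the noised gradient."""
    noisy = add_gaussian_noise(gradient, sigma, rng)
    return [w - lr * g for w, g in zip(weights, noisy)]


def weighted_average(client_weights, client_sizes):
    """Aggregate client parameter vectors, weighting each by its sample count.

    Weighting by local dataset size is an assumption; the abstract only
    says "a weighted average algorithm".
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Toy round with two hypothetical clients.
rng = random.Random(0)
global_weights = [0.0, 0.0]
client_grads = [[1.0, -2.0], [3.0, 0.5]]  # made-up local gradients
client_sizes = [10, 30]                   # made-up local sample counts

local_models = [local_step(global_weights, g, rng=rng) for g in client_grads]
new_global = weighted_average(local_models, client_sizes)
```

In this sketch only the locally updated weights leave a client, and they are already perturbed by noise, which is the intuition behind the abstract's claim about resisting gradient leakage.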
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[1/5]:
- Efficient Annotation of Medieval Charters
Anguelos Nicolaou, Daniel Luger, Franziska Decker, Nicolas Renet, Vincent Christlein, Georg Vogeler
I feel like a while ago I saw a website here where you could enter an #openstreetmap building ID and it would make a 3D render based on the 3D annotation. For the life of me I can't find the URL. Does anyone have an idea?
@… I think you might have shared it at some point with the bandstand of Greenwich Park 😄
[EDIT: Answered thanks to @… , see options here https://floss.social/@hbond/114965004674098103 ]
Latent Motion Profiling for Annotation-free Cardiac Phase Detection in Adult and Fetal Echocardiography Videos
Yingyu Yang, Qianye Yang, Kangning Cui, Can Peng, Elena D'Alberti, Netzahualcoyotl Hernandez-Cruz, Olga Patey, Aris T. Papageorghiou, J. Alison Noble
https://arxiv.org/abs/2507.05154…
Scalable Controllable Accented TTS
Henry Li Xinyuan, Zexin Cai, Ashi Garg, Kevin Duh, Leibny Paola García-Perera, Sanjeev Khudanpur, Nicholas Andrews, Matthew Wiesner
https://arxiv.org/abs/2508.07426
Function-based Labels for Complementary Recommendation: Definition, Annotation, and LLM-as-a-Judge
Chihiro Yamasaki, Kai Sugahara, Yuma Nagi, Kazushi Okamoto
https://arxiv.org/abs/2507.03945
A Unified Empirical Risk Minimization Framework for Flexible N-Tuples Weak Supervision
Shuying Huang, Junpeng Li, Changchun Hua, Yana Yang
https://arxiv.org/abs/2507.07771
I've just noticed that my railway pass no longer carries an explicit "valid (…) in any class" annotation. It still has an "X" in the "class" field, but that's not that cool.
(Explanation: many years ago, there used to be a few "RegioExpress" trains with first class seats. Of course, barely anybody cared about that, and I doubt people actually bought first class tickets. Alas, these are long gone.)
#rail
FineBadminton: A Multi-Level Dataset for Fine-Grained Badminton Video Understanding
Xusheng He, Wei Liu, Shanshan Ma, Qian Liu, Chenghao Ma, Jianlong Wu
https://arxiv.org/abs/2508.07554
{annotater}: Annotate package load calls, so we can have an idea of the overall purpose of the libraries we’re loading: #rstats
From Block to Byte: Transforming PCIe SSDs with CXL Memory Protocol and Instruction Annotation
Miryeong Kwon, Donghyun Gouk, Junhyeok Jang, Jinwoo Baek, Hyunwoo You, Sangyoon Ji, Hongjoo Jung, Junseok Moon, Seungkwan Kang, Seungjun Lee, Myoungsoo Jung
https://arxiv.org/abs/2506.15613
Just Put a Human in the Loop? Investigating LLM-Assisted Annotation for Subjective Tasks
Hope Schroeder, Deb Roy, Jad Kabbara
https://arxiv.org/abs/2507.15821
Co-DETECT: Collaborative Discovery of Edge Cases in Text Classification
Chenfei Xiong, Jingwei Ni, Yu Fan, Vilém Zouhar, Donya Rooein, Lorena Calvo-Bartolomé, Alexander Hoyle, Zhijing Jin, Mrinmaya Sachan, Markus Leippold, Dirk Hovy, Mennatallah El-Assady, Elliott Ash
https://arxiv.org/abs/2507.05010
MedCAL-Bench: A Comprehensive Benchmark on Cold-Start Active Learning with Foundation Models for Medical Image Analysis
Ning Zhu, Xiaochuan Ma, Shaoting Zhang, Guotai Wang
https://arxiv.org/abs/2508.03441
📚 Developed by #AnthropicFellows with #DecodeResearch collaboration
🛠️ Supports circuit tracing, visualization, annotation and sharing capabilities
Author Once, Publish Everywhere: Portable Metadata Authoring with the CEDAR Embeddable Editor
Martin J. O'Connor, Marcos Martinez-Romero, Attila L. Egyedi, Mete U. Akdogan, Michael V. Dorf, Mark A. Musen
https://arxiv.org/abs/2508.00859
REFS: Robust EEG feature selection with missing multi-dimensional annotation for emotion recognition
Xueyuan Xu, Wenjia Dong, Fulin Wei, Li Zhuo
https://arxiv.org/abs/2508.05933
ProCaliper: functional and structural analysis, visualization, and annotation of proteins
Jordan C. Rozum, Hunter Ufford, Alexandria K. Im, Tong Zhang, David D. Pollock, Doo Nam Kim, Song Feng
https://arxiv.org/abs/2506.19961
This https://arxiv.org/abs/2502.16517 has been replaced.
link: https://scholar.google.com/scholar?q=a
Airway Segmentation Network for Enhanced Tubular Feature Extraction
Qibiao Wu, Yagang Wang, Qian Zhang
https://arxiv.org/abs/2507.06581 https://
Reliable Annotations with Less Effort: Evaluating LLM-Human Collaboration in Search Clarifications
Leila Tavakoli, Hamed Zamani
https://arxiv.org/abs/2507.00543
@… Some language servers do this. The Haskell Language Server, for example, will let you add an explicit type annotation for an implicitly typed variable or function. I think I’ve seen this for TypeScript too.
I’m not sure if this answers your question, but I hope it does.
Leveraging Caliper and Benchpark to Analyze MPI Communication Patterns: Insights from AMG2023, Kripke, and Laghos
Grace Nansamba, Evelyn Namugwanya, David Boehme, Dewi Yokelson, Riley Shipley, Derek Schafer, Michael McKinsey, Olga Pearce, Anthony Skjellum
https://arxiv.org/abs/2507.22372
Grapheme-Coherent Phonemic and Prosodic Annotation of Speech by Implicit and Explicit Grapheme Conditioning
Hien Ohnaka, Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto
https://arxiv.org/abs/2506.04527
ChartMark: A Structured Grammar for Chart Annotation
Yiyu Chen, Yifan Wu, Shuyu Shen, Yupeng Xie, Leixian Shen, Hui Xiong, Yuyu Luo
https://arxiv.org/abs/2507.21810 https://
RightTyper: Effective and Efficient Type Annotation for Python
Juan Altmayer Pizzorno, Emery D. Berger
https://arxiv.org/abs/2507.16051 https://
Frequency Prior Guided Matching: A Data Augmentation Approach for Generalizable Semi-Supervised Polyp Segmentation
Haoran Xi, Chen Liu, Xiaolin Li
https://arxiv.org/abs/2508.06517
FORGE: An LLM-driven Framework for Large-Scale Smart Contract Vulnerability Dataset Construction
Jiachi Chen, Yiming Shen, Jiashuo Zhang, Zihao Li, John Grundy, Zhenzhe Shao, Yanlin Wang, Jiashui Wang, Ting Chen, Zibin Zheng
https://arxiv.org/abs/2506.18795
This https://arxiv.org/abs/2505.13556 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
This https://arxiv.org/abs/2308.03734 has been replaced.
link: https://scholar.google.com/scholar?q=a
ChemActor: Enhancing Automated Extraction of Chemical Synthesis Actions with LLM-Generated Data
Yu Zhang, Ruijie Yu, Jidong Tian, Feng Zhu, Jiapeng Liu, Xiaokang Yang, Yaohui Jin, Yanyan Xu
https://arxiv.org/abs/2506.23520
SENSOR: An ML-Enhanced Online Annotation Tool to Uncover Privacy Concerns from User Reviews in Social-Media Applications
Labiba Farah, Mohammad Ridwan Kabir, Shohel Ahmed, MD Mohaymen Ul Anam, Md. Sakibul Islam
https://arxiv.org/abs/2507.10640
Hybrid Annotation for Propaganda Detection: Integrating LLM Pre-Annotations with Human Intelligence
Ariana Sahitaj, Premtim Sahitaj, Veronika Solopova, Jiaao Li, Sebastian Möller, Vera Schmitt
https://arxiv.org/abs/2507.18343
Autoadaptive Medical Segment Anything Model
Tyler Ward, Meredith K. Owen, O'Kira Coleman, Brian Noehren, Abdullah-Al-Zubaer Imran
https://arxiv.org/abs/2507.01828
GRIT: Graph-Regularized Logit Refinement for Zero-shot Cell Type Annotation
Tianxiang Hu, Chenyi Zhou, Jiaxiang Liu, Jiongxin Wang, Ruizhe Chen, Haoxiang Xia, Gaoang Wang, Jian Wu, Zuozhu Liu
https://arxiv.org/abs/2508.04747
Learning Deliberately, Acting Intuitively: Unlocking Test-Time Reasoning in Multimodal LLMs
Yahan Yu, Yuyang Dong, Masafumi Oyamada
https://arxiv.org/abs/2507.06999
Human-AI Alignment of Multimodal Large Language Models with Speech-Language Pathologists in Parent-Child Interactions
Weiyan Shi, Kenny Tsu Wei Choo
https://arxiv.org/abs/2506.05879
Spegion: Implicit and Non-Lexical Regions with Sized Allocations
Jack Hughes, Michael Vollmer, Mark Batty
https://arxiv.org/abs/2506.02182 https://
ADPv2: A Hierarchical Histological Tissue Type-Annotated Dataset for Potential Biomarker Discovery of Colorectal Disease
Zhiyuan Yang, Kai Li, Sophia Ghamoshi Ramandi, Patricia Brassard, Hakim Khellaf, Vincent Quoc-Huy Trinh, Jennifer Zhang, Lina Chen, Corwyn Rowsell, Sonal Varma, Kostas Plataniotis, Mahdi S. Hosseini
https://ar…
This https://arxiv.org/abs/2506.06155 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…
Melodic and Metrical Elements of Expressiveness in Hindustani Vocal Music
Yash Bhake, Ankit Anand, Preeti Rao
https://arxiv.org/abs/2508.04430 https://arxi…
This https://arxiv.org/abs/2405.17492 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSE_…
Discrepancy-Aware Contrastive Adaptation in Medical Time Series Analysis
Yifan Wang, Hongfeng Ai, Ruiqi Li, Maowei Jiang, Ruiyuan Kang, Jiahua Dong, Cheng Jiang, Chenzhong Li
https://arxiv.org/abs/2508.05572
A Highly Clean Recipe Dataset with Ingredient States Annotation for State Probing Task
Mashiro Toyooka, Kiyoharu Aizawa, Yoko Yamakata
https://arxiv.org/abs/2507.17232 https://
R1-RE: Cross-Domain Relationship Extraction with RLVR
Runpeng Dai, Tong Zheng, Run Yang, Hongtu Zhu
https://arxiv.org/abs/2507.04642 https://
Sequential Attention-based Sampling for Histopathological Analysis
Tarun G, Naman Malpani, Gugan Thoppe, Sridharan Devarajan
https://arxiv.org/abs/2507.05077
AnnoGram: An Annotative Grammar of Graphics Extension
Md Dilshadur Rahman, Md Rahat-uz-Zaman, Andrew McNutt, Paul Rosen
https://arxiv.org/abs/2507.04236 h…
StepAL: Step-aware Active Learning for Cataract Surgical Videos
Nisarg A. Shah, Bardia Safaei, Shameema Sikder, S. Swaroop Vedula, Vishal M. Patel
https://arxiv.org/abs/2507.22059
This https://arxiv.org/abs/2407.17490 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csHC_…
LLM-Based Repair of Static Nullability Errors
Nima Karimipour, Michael Pradel, Martin Kellogg, Manu Sridharan
https://arxiv.org/abs/2507.20674 https://arxi…
Can External Validation Tools Improve Annotation Quality for LLM-as-a-Judge?
Arduin Findeis, Floris Weers, Guoli Yin, Ke Ye, Ruoming Pang, Tom Gunter
https://arxiv.org/abs/2507.17015
The iNaturalist Sounds Dataset
Mustafa Chasmai, Alexander Shepard, Subhransu Maji, Grant Van Horn
https://arxiv.org/abs/2506.00343 https://
Effective Multi-Task Learning for Biomedical Named Entity Recognition
João Ruano, Gonçalo M. Correia, Leonor Barreiros, Afonso Mendes
https://arxiv.org/abs/2507.18542 http…
Partial Weakly-Supervised Oriented Object Detection
Mingxin Liu, Peiyuan Zhang, Yuan Liu, Wei Zhang, Yue Zhou, Ning Liao, Ziyang Gong, Junwei Luo, Zhirui Wang, Yi Yu, Xue Yang
https://arxiv.org/abs/2507.02751
Beyond the Desktop: XR-Driven Segmentation with Meta Quest 3 and MX Ink
Lisle Faray de Paiva, Gijs Luijten, Ana Sofia Ferreira Santos, Moon Kim, Behrus Puladi, Jens Kleesiek, Jan Egger
https://arxiv.org/abs/2506.04858
Surgical Foundation Model Leveraging Compression and Entropy Maximization for Image-Guided Surgical Assistance
Lianhao Yin, Ozanan Meireles, Guy Rosman, Daniela Rus
https://arxiv.org/abs/2506.01980
Revisiting Active Learning under (Human) Label Variation
Cornelia Gruber, Helen Alber, Bernd Bischl, Göran Kauermann, Barbara Plank, Matthias Aßenmacher
https://arxiv.org/abs/2507.02593
"Just a strange pic": Evaluating 'safety' in GenAI Image safety annotation tasks from diverse annotators' perspectives
Ding Wang, Mark Díaz, Charvi Rastogi, Aida Davani, Vinodkumar Prabhakaran, Pushkar Mishra, Roma Patel, Alicia Parrish, Zoe Ashwood, Michela Paganini, Tian Huey Teh, Verena Rieser, Lora Aroyo
https://
Self-Reinforcing Prototype Evolution with Dual-Knowledge Cooperation for Semi-Supervised Lifelong Person Re-Identification
Kunlun Xu, Fan Zhuo, Jiangmeng Li, Xu Zou, Jiahuan Zhou
https://arxiv.org/abs/2507.01884
Pixel Embedding Method for Tubular Neurite Segmentation
Huayu Fu, Jiamin Li, Haozhi Qu, Xiaolin Hu, Zengcai Guo
https://arxiv.org/abs/2507.23359 https://ar…
Pull Requests From The Classroom: Co-Developing Curriculum And Code
Dennis Zyska, Ilia Kuznetsov, Florian Müller, Iryna Gurevych
https://arxiv.org/abs/2508.00646 https://…
Transferable Modeling Strategies for Low-Resource LLM Tasks: A Prompt and Alignment-Based
Shuangquan Lyu, Yingnan Deng, Guiran Liu, Zhen Qi, Ruotong Wang
https://arxiv.org/abs/2507.00601
LesionGen: A Concept-Guided Diffusion Model for Dermatology Image Synthesis
Jamil Fayyad, Nourhan Bayasi, Ziyang Yu, Homayoun Najjaran
https://arxiv.org/abs/2507.23001 https://
Diffusion Model-based Data Augmentation Method for Fetal Head Ultrasound Segmentation
Fangyijie Wang, Kevin Whelan, Félix Balado, Guénolé Silvestre, Kathleen M. Curran
https://arxiv.org/abs/2506.23664
Automated Label Placement on Maps via Large Language Models
Harry Shomer, Jiejun Xu
https://arxiv.org/abs/2507.22952 https://arxiv.org/pdf/2507.22952
MDC-R: The Minecraft Dialogue Corpus with Reference
Chris Madge, Maris Camilleri, Paloma Carretero Garcia, Mladen Karan, Juexi Shao, Prashant Jayannavar, Julian Hough, Benjamin Roth, Massimo Poesio
https://arxiv.org/abs/2506.22062
Efficient Learning for Product Attributes with Compact Multimodal Models
Mandar Kulkarni
https://arxiv.org/abs/2507.19679 https://arxiv.org/pdf/2507.19679
Towards Blind Bitstream-corrupted Video Recovery via a Visual Foundation Model-driven Framework
Tianyi Liu, Kejun Wu, Chen Cai, Yi Wang, Kim-Hui Yap, Lap-Pui Chau
https://arxiv.org/abs/2507.22481
Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content
Rian Touchent, Nathan Godey, Eric de la Clergerie
https://arxiv.org/abs/2506.20331
AdvMIM: Adversarial Masked Image Modeling for Semi-Supervised Medical Image Segmentation
Lei Zhu, Jun Zhou, Rick Siow Mong Goh, Yong Liu
https://arxiv.org/abs/2506.20563
SynC: Synthetic Image Caption Dataset Refinement with One-to-many Mapping for Zero-shot Image Captioning
Si-Woo Kim, MinJu Jeon, Ye-Chan Kim, Soeun Lee, Taewhan Kim, Dong-Jin Kim
https://arxiv.org/abs/2507.18616