An Empirical Study on How Video-LLMs Answer Video Questions
Chenhui Gou, Ziyu Ma, Zicheng Duan, Haoyu He, Feng Chen, Akide Liu, Bohan Zhuang, Jianfei Cai, Hamid Rezatofighi
https://arxiv.org/abs/2508.15360
StreamMem: Query-Agnostic KV Cache Memory for Streaming Video Understanding
Yanlai Yang, Zhuokai Zhao, Satya Narayan Shukla, Aashu Singh, Shlok Kumar Mishra, Lizhu Zhang, Mengye Ren
https://arxiv.org/abs/2508.15717
Waver: Wave Your Way to Lifelike Video Generation
Yifu Zhang, Hao Yang, Yuqi Zhang, Yifei Hu, Fengda Zhu, Chuang Lin, Xiaofeng Mei, Yi Jiang, Zehuan Yuan, Bingyue Peng
https://arxiv.org/abs/2508.15761 …
At the #Windpark #Elster wind farm, older wind turbines (#Windräder) are being dismantled and replaced with more powerful ones.
Through the #Repowering
Kickstarted a credit-card-sized dice replacement.
For IRL D&D it was fine, but the dials spin forever. That made my dice tray pointless, and rolling is less satisfying. Even though I chose yellow text on black, it's hard for me to see on the table in a mood-lit basement.
For Shadowrun, 6D6 is embarrassingly insufficient, given I regularly roll 14D6. The VTT roller is far better (since it does the math on pools and successes).
But they’re sturdy and hefty! Which means my mini metal dice are lighter an…
Repeating Words for Video-Language Retrieval with Coarse-to-Fine Objectives
Haoyu Zhao, Jiaxi Gu, Shicong Wang, Xing Zhang, Hang Xu, Zuxuan Wu, Yu-Gang Jiang
https://arxiv.org/abs/2508.14812
DashCam Video: A complementary low-cost data stream for on-demand forest-infrastructure system monitoring
Durga Joshi (Department of Natural Resources and the Environment, Eversource Energy Center, University of Connecticut, Storrs, CT, USA), Chandi Witharana (Department of Natural Resources and the Environment, Eversource Energy Center, University of Connecticut, Storrs, CT, USA), Robert Fahey (Department of Natural Resources and the Environment, Eversource Energy Center, University o…
SMTrack: End-to-End Trained Spiking Neural Networks for Multi-Object Tracking in RGB Videos
Pengzhi Zhong, Xinzhe Wang, Dan Zeng, Qihua Zhou, Feixiang He, Shuiwang Li
https://arxiv.org/abs/2508.14607
Taming Diffusion Transformer for Real-Time Mobile Video Generation
Yushu Wu, Yanyu Li, Anil Kag, Ivan Skorokhodov, Willi Menapace, Ke Ma, Arpit Sahni, Ju Hu, Aliaksandr Siarohin, Dhritiman Sagar, Yanzhi Wang, Sergey Tulyakov
https://arxiv.org/abs/2507.13343
Open-ended Hierarchical Streaming Video Understanding with Vision Language Models
Hyolim Kang, Yunsu Park, Youngbeom Yoo, Yeeun Choi, Seon Joo Kim
https://arxiv.org/abs/2509.12145