Tootfinder

Opt-in global Mastodon full text search.

@arXiv_csCV_bot@mastoxiv.page
2025-08-22 10:05:31

An Empirical Study on How Video-LLMs Answer Video Questions
Chenhui Gou, Ziyu Ma, Zicheng Duan, Haoyu He, Feng Chen, Akide Liu, Bohan Zhuang, Jianfei Cai, Hamid Rezatofighi
arxiv.org/abs/2508.15360

@arXiv_csCV_bot@mastoxiv.page
2025-08-22 10:18:41

StreamMem: Query-Agnostic KV Cache Memory for Streaming Video Understanding
Yanlai Yang, Zhuokai Zhao, Satya Narayan Shukla, Aashu Singh, Shlok Kumar Mishra, Lizhu Zhang, Mengye Ren
arxiv.org/abs/2508.15717

@arXiv_csCV_bot@mastoxiv.page
2025-08-22 10:19:21

Waver: Wave Your Way to Lifelike Video Generation
Yifu Zhang, Hao Yang, Yuqi Zhang, Yifei Hu, Fengda Zhu, Chuang Lin, Xiaofeng Mei, Yi Jiang, Zehuan Yuan, Bingyue Peng
arxiv.org/abs/2508.15761

@tinoeberl@mastodon.online
2025-08-20 10:12:32

At the #Elster #Windpark (wind farm), older #Windräder (wind turbines) are being dismantled and replaced with more powerful ones.
Through the #Repowering

@aardrian@toot.cafe
2025-08-14 03:53:21

Kickstarted a credit-card-sized dice replacement.
For IRL D&D it was fine, but dials spin forever. Made dice tray pointless, while less satisfying. Even though I chose yellow text on black, hard for me to see in mood-lit basement on table.
For Shadowrun, 6D6 is embarrassingly insufficient, given I regularly roll 14D6. The VTT roller is far better (since it does the math on pools and successes).
But they’re sturdy and hefty! Which means my mini metal dice are lighter an…

A thick black metal card with 6 individual dials, each representing a D4, D6, D8, D10, D12, and D20, as represented in tiny yellow text. The card sits in a small dice tray on a white table next to a hand-drawn map and a dice box holding a full set of glittery pink fantasy dice.
A thick metal card in sparkly purple or green, depending on the angle, with 6 dials, each representing a D6 with tiny white text. It’s in my hand held up in front of a computer screen showing cyberpunk character art, diagnostics, and game notes.
@arXiv_csCV_bot@mastoxiv.page
2025-08-21 10:12:50

Repeating Words for Video-Language Retrieval with Coarse-to-Fine Objectives
Haoyu Zhao, Jiaxi Gu, Shicong Wang, Xing Zhang, Hang Xu, Zuxuan Wu, Yu-Gang Jiang
arxiv.org/abs/2508.14812

@arXiv_csCV_bot@mastoxiv.page
2025-08-18 09:55:30

DashCam Video: A complementary low-cost data stream for on-demand forest-infrastructure system monitoring
Durga Joshi (Department of Natural Resources and the Environment, Eversource Energy Center, University of Connecticut, Storrs, CT, USA), Chandi Witharana (Department of Natural Resources and the Environment, Eversource Energy Center, University of Connecticut, Storrs, CT, USA), Robert Fahey (Department of Natural Resources and the Environment, Eversource Energy Center, University o…

@arXiv_csCV_bot@mastoxiv.page
2025-08-21 10:08:00

SMTrack: End-to-End Trained Spiking Neural Networks for Multi-Object Tracking in RGB Videos
Pengzhi Zhong, Xinzhe Wang, Dan Zeng, Qihua Zhou, Feixiang He, Shuiwang Li
arxiv.org/abs/2508.14607

@arXiv_csCV_bot@mastoxiv.page
2025-07-18 10:21:22

Taming Diffusion Transformer for Real-Time Mobile Video Generation
Yushu Wu, Yanyu Li, Anil Kag, Ivan Skorokhodov, Willi Menapace, Ke Ma, Arpit Sahni, Ju Hu, Aliaksandr Siarohin, Dhritiman Sagar, Yanzhi Wang, Sergey Tulyakov
arxiv.org/abs/2507.13343

@arXiv_csCV_bot@mastoxiv.page
2025-09-16 12:44:17

Open-ended Hierarchical Streaming Video Understanding with Vision Language Models
Hyolim Kang, Yunsu Park, Youngbeom Yoo, Yeeun Choi, Seon Joo Kim
arxiv.org/abs/2509.12145