Tootfinder

Opt-in global Mastodon full text search. Join the index!

No exact results. Similar results found.

It wouldn't hurt to contact your representatives in Congress,
ask them to assert themselves as is their responsibility,
and bring a hasty end to this painful conflict.
Otherwise, the entire world is now sitting on a bus being driven recklessly by American leaders with uncertain and varying goals and dubious judgment.
Fasten your seatbelts.

@arXiv_csGR_bot@mastoxiv.page
2026-01-30 08:28:26

JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion
Anthony Chen, Naomi Ken Korem, Tavi Halperin, Matan Ben Yosef, Urska Jelercic, Ofir Bibi, Or Patashnik, Daniel Cohen-Or
arxiv.org/abs/2601.22143 arxiv.org/pdf/2601.22143 arxiv.org/html/2601.22143
arXiv:2601.22143v1 Announce Type: new
Abstract: Audio-Visual Foundation Models, which are pretrained to jointly generate sound and visual content, have recently shown an unprecedented ability to model multi-modal generation and editing, opening new opportunities for downstream tasks. Among these tasks, video dubbing could greatly benefit from such priors, yet most existing solutions still rely on complex, task-specific pipelines that struggle in real-world settings. In this work, we introduce a single-model approach that adapts a foundational audio-video diffusion model for video-to-video dubbing via a lightweight LoRA. The LoRA enables the model to condition on an input audio-video while jointly generating translated audio and synchronized facial motion. To train this LoRA, we leverage the generative model itself to synthesize paired multilingual videos of the same speaker. Specifically, we generate multilingual videos with language switches within a single clip, and then inpaint the face and audio in each half to match the language of the other half. By leveraging the rich generative prior of the audio-visual model, our approach preserves speaker identity and lip synchronization while remaining robust to complex motion and real-world dynamics. We demonstrate that our approach produces high-quality dubbed videos with improved visual fidelity, lip synchronization, and robustness compared to existing dubbing pipelines.
toXiv_bot_toot

@brian_gettler@mas.to
2026-01-29 01:03:52

Never forget, #Toronto, you're not stuck in traffic because there are too many damn cars in this city. And this week you're not stuck because of record snowfall clogging roads. (As an aside, if you're complaining about snow on the roads, have you had a look at our sidewalks recently?) No, you're sitting there like a fool because of bike lanes.

@grumpybozo@toad.social
2026-02-27 21:22:17

I am regretting my choice to leave chemistry…
I don’t think anyone had battery chemistry on their list of critical technologies of the future 40 years ago. Now it seems like every week there’s a new idea.
hachyderm.io/@BenjaminHCCarr/1

@mgorny@social.treehouse.systems
2026-03-28 09:40:42

Our society has learned to promote #complacency into a virtue. You call it a "moderate stance", and it's suddenly a good thing. Opposing evil is bad; it's an extremist position, almost as bad as the evil itself. Complacency sounds bad too. But say "hey, I don't support evil, I just keep an open mind, a moderate position here", and you're suddenly a praiseworthy person. Maybe "just a little, necessary amount of evil" is good, after all.
"I don't support slavery. I just want cheap goods, and I don't want to know how come they're cheap."
"I don't support animal cruelty, I just want cheap meat, and I don't want anyone to point out to me why it's cheap."
"I am tolerant of LGBTQ people, I just don't wanna see them."
"I don't want disabled people to die, I just expect them to find a job."

@thomasfuchs@hachyderm.io
2026-02-27 15:31:14

RE: hachyderm.io/@thomasfuchs/1161
Software development these days really reminds me of what large political parties do—no shred of thinking about the whole, not even a little bit of conviction, just churning out ever more mid, milquetoast "features" designed to eke out a little more compliance.
Moral stances and ethics be damned.

@mnalis@mastodon.online
2026-03-27 22:53:59

#overpass servers for #open are overloaded lately, likely due to all the #AI bots and whatnot.
The only workaround seems to be setting up your own local instance. Luckily, there exists a Docker container which ma…

@arXiv_csGR_bot@mastoxiv.page
2026-01-30 08:16:47

Mesh Splatting for End-to-end Multiview Surface Reconstruction
Ruiqi Zhang, Jiacheng Wu, Jie Chen
arxiv.org/abs/2601.21400 arxiv.org/pdf/2601.21400 arxiv.org/html/2601.21400
arXiv:2601.21400v1 Announce Type: new
Abstract: Surfaces are typically represented as meshes, which can be extracted from volumetric fields via meshing or optimized directly as surface parameterizations. Volumetric representations occupy 3D space and have a large effective receptive field along rays, enabling stable and efficient optimization via volumetric rendering; however, subsequent meshing often produces overly dense meshes and introduces accumulated errors. In contrast, pure surface methods avoid meshing but capture only boundary geometry with a single-layer receptive field, making it difficult to learn intricate geometric details and increasing reliance on priors (e.g., shading or normals). We bridge this gap by differentiably turning a surface representation into a volumetric one, enabling end-to-end surface reconstruction via volumetric rendering to model complex geometries. Specifically, we soften a mesh into multiple semi-transparent layers that remain differentiable with respect to the base mesh, endowing it with a controllable 3D receptive field. Combined with a splatting-based renderer and a topology-control strategy, our method can be optimized in about 20 minutes to achieve accurate surface reconstruction while substantially improving mesh quality.
toXiv_bot_toot
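The core idea in the abstract above — expanding a surface mesh into several semi-transparent layers that stay differentiable with respect to the base mesh, then rendering them volumetrically — can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the layer offsets, Gaussian opacity profile, and compositing below are illustrative assumptions chosen only to show how shell construction and front-to-back alpha compositing fit together.

```python
import numpy as np

def soften_mesh(vertices, normals, n_layers=4, thickness=0.02):
    """Expand a base mesh into semi-transparent shells along vertex normals.

    Each shell is the base vertices offset by a scalar t along the
    per-vertex normals, so every shell is a linear (hence differentiable)
    function of the base mesh. Opacities follow an illustrative Gaussian
    profile centered on the base surface.
    """
    offsets = np.linspace(-thickness / 2, thickness / 2, n_layers)
    alphas = np.exp(-((offsets / (thickness / 2 + 1e-8)) ** 2))
    alphas /= alphas.sum()  # normalize so the shells sum to unit opacity
    return [(vertices + t * normals, a) for t, a in zip(offsets, alphas)]

def composite(alphas_along_ray):
    """Front-to-back alpha compositing of the shell opacities hit by one ray."""
    transmittance, accum = 1.0, 0.0
    for a in alphas_along_ray:
        accum += transmittance * a       # light contributed by this layer
        transmittance *= 1.0 - a         # light passing through to deeper layers
    return accum
```

Because each shell is an affine function of the base vertices, gradients from the volumetric rendering loss flow straight back to the mesh — which is the property the abstract highlights as the bridge between surface and volumetric representations.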