2026-03-30 13:40:13
Fun playing ARC-AGI-3, puzzles on which even the most advanced AI models score only about 1% 😀
Illustrates how AI models look extremely smart but are at the same time quite dumb.
#AI
Democratizing Federated Learning with Blockchain and Multi-Task Peer Prediction
Leon Witt, Kentaroh Toyoda, Wojciech Samek, Dan Li
https://arxiv.org/abs/2603.28434
Structural-Ambiguity-Aware Translation from Natural Language to Signal Temporal Logic
Kosei Fushimi, Kazunobu Serizawa, Junya Ikemoto, Kazumune Hashimoto
https://arxiv.org/abs/2603.28426 https://arxiv.org/pdf/2603.28426 https://arxiv.org/html/2603.28426
arXiv:2603.28426v1 Announce Type: new
Abstract: Signal Temporal Logic (STL) is widely used to specify timed and safety-critical tasks for cyber-physical systems, but writing STL formulas directly is difficult for non-expert users. Natural language (NL) provides a convenient interface, yet its inherent structural ambiguity makes one-to-one translation into STL unreliable. In this paper, we propose an \textit{ambiguity-preserving} method for translating NL task descriptions into STL candidate formulas. The key idea is to retain multiple plausible syntactic analyses instead of forcing a single interpretation at the parsing stage. To this end, we develop a three-stage pipeline based on Combinatory Categorial Grammar (CCG): ambiguity-preserving $n$-best parsing, STL-oriented template-based semantic composition, and canonicalization with score aggregation. The proposed method outputs a deduplicated set of STL candidates with plausibility scores, thereby explicitly representing multiple possible formal interpretations of an ambiguous instruction. In contrast to existing one-best NL-to-logic translation methods, the proposed approach is designed to preserve attachment and scope ambiguity. Case studies on representative task descriptions demonstrate that the method generates multiple STL candidates for genuinely ambiguous inputs while collapsing unambiguous or canonically equivalent derivations to a single STL formula.
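The third pipeline stage (canonicalization with score aggregation) can be illustrated with a toy sketch. Everything here is invented for illustration: a real canonicalizer would operate on the STL syntax tree rather than on strings, and the candidate formulas and scores are made up.

```python
def canonicalize(formula: str) -> str:
    # Stand-in canonical form: sort top-level conjuncts, so "A & B" and
    # "B & A" collapse to one key. A real canonicalizer would work on the
    # parsed STL syntax tree, not on strings.
    return " & ".join(sorted(p.strip() for p in formula.split("&")))

def aggregate(candidates):
    # candidates: (formula, plausibility score) pairs from n-best parsing.
    best = {}
    for formula, score in candidates:
        key = canonicalize(formula)
        rep, total = best.get(key, (formula, 0.0))
        best[key] = (rep, total + score)
    # Deduplicated candidate set, most plausible first.
    return sorted(best.values(), key=lambda fs: -fs[1])

nbest = [("G[0,5] a & F[0,2] b", 0.5),
         ("F[0,2] b & G[0,5] a", 0.25),   # same reading, different derivation
         ("G[0,5] (a & F[0,2] b)", 0.25)] # genuinely different scope
print(aggregate(nbest))
```

Canonically equivalent derivations collapse to one formula with a summed score, while the scope-ambiguous reading survives as a separate candidate.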
toXiv_bot_toot
JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion
Anthony Chen, Naomi Ken Korem, Tavi Halperin, Matan Ben Yosef, Urska Jelercic, Ofir Bibi, Or Patashnik, Daniel Cohen-Or
https://arxiv.org/abs/2601.22143 https://arxiv.org/pdf/2601.22143 https://arxiv.org/html/2601.22143
arXiv:2601.22143v1 Announce Type: new
Abstract: Audio-Visual Foundation Models, which are pretrained to jointly generate sound and visual content, have recently shown an unprecedented ability to model multi-modal generation and editing, opening new opportunities for downstream tasks. Among these tasks, video dubbing could greatly benefit from such priors, yet most existing solutions still rely on complex, task-specific pipelines that struggle in real-world settings. In this work, we introduce a single-model approach that adapts a foundational audio-video diffusion model for video-to-video dubbing via a lightweight LoRA. The LoRA enables the model to condition on an input audio-video while jointly generating translated audio and synchronized facial motion. To train this LoRA, we leverage the generative model itself to synthesize paired multilingual videos of the same speaker. Specifically, we generate multilingual videos with language switches within a single clip, and then inpaint the face and audio in each half to match the language of the other half. By leveraging the rich generative prior of the audio-visual model, our approach preserves speaker identity and lip synchronization while remaining robust to complex motion and real-world dynamics. We demonstrate that our approach produces high-quality dubbed videos with improved visual fidelity, lip synchronization, and robustness compared to existing dubbing pipelines.
It's been 40 years. I still remember it well. I was in school, a few days till my 10th bday. Our (very small, maybe 14 kids) 4th grade class stopped our studies and tuned in to the launch. The surreal moment of watching the explosion grow while the announcer calmly continued reporting the stats before realizing there was a problem. The teacher having to explain what we just watched. The day we learned that going to space is still a very difficult task.
🎧 Earbuds can be used to monitor brain health
#sensors
New Bingo Boys album just came out today, and of COURSE it's a ripper.
#punk
Your task for today:
Opt out of #Copilot, because #Microslop forces you into it soon otherwise.
https://github.com/settings…
Don't miss my latest CSO feature that examines how boards don't need more cyber metrics; they need risk signals so they can better understand the exposure, trajectory, and consequences of the threats their organizations face.
Thanks to Richard Bejtlich, Mike Hamilton, Wendy Nather, George Tsantes, and Bernard Brantley for their insights.
Replaced article(s) found for cs.CL. https://arxiv.org/list/cs.CL/new
[4/5]:
- Retrieving Climate Change Disinformation by Narrative
Upravitelev, Solopova, Jakob, Sahitaj, Möller, Schmitt
https://arxiv.org/abs/2603.22015 https://mastoxiv.page/@arXiv_csCL_bot/116283633674519408
- PaperVoyager : Building Interactive Web with Visual Language Models
Dasen Dai, Biao Wu, Meng Fang, Wenhao Wang
https://arxiv.org/abs/2603.22999 https://mastoxiv.page/@arXiv_csCL_bot/116289015432093128
- Continual Robot Skill and Task Learning via Dialogue
Weiwei Gu, Suresh Kondepudi, Anmol Gupta, Lixiao Huang, Nakul Gopalan
https://arxiv.org/abs/2409.03166 https://mastoxiv.page/@arXiv_csRO_bot/113089412115632702
- Shifting Perspectives: Steering Vectors for Robust Bias Mitigation in LLMs
Zara Siddique, Irtaza Khalid, Liam D. Turner, Luis Espinosa-Anke
https://arxiv.org/abs/2503.05371 https://mastoxiv.page/@arXiv_csLG_bot/114136994263573386
- SkillFlow: Scalable and Efficient Agent Skill Retrieval System
Fangzhou Li, Pagkratios Tagkopoulos, Ilias Tagkopoulos
https://arxiv.org/abs/2504.06188 https://mastoxiv.page/@arXiv_csAI_bot/114306773220502860
- Large Language Models for Computer-Aided Design: A Survey
Licheng Zhang, Bach Le, Naveed Akhtar, Siew-Kei Lam, Tuan Ngo
https://arxiv.org/abs/2505.08137 https://mastoxiv.page/@arXiv_csLG_bot/114504972217393639
- Structured Agent Distillation for Large Language Model
Liu, Kong, Dong, Yang, Li, Tang, Yuan, Niu, Zhang, Zhao, Lin, Huang, Wang
https://arxiv.org/abs/2505.13820 https://mastoxiv.page/@arXiv_csLG_bot/114544636506163783
- VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
Fan, Zhang, Li, Zhang, Chen, Hu, Wang, Qu, Zhou, Wang, Yan, Xu, Theiss, Chen, Li, Tu, Wang, Ranjan
https://arxiv.org/abs/2505.20279 https://mastoxiv.page/@arXiv_csCV_bot/114578817567171199
- Learning to Diagnose Privately: DP-Powered LLMs for Radiology Report Classification
Bhattacharjee, Tian, Rubin, Lo, Merchant, Hanson, Gounley, Tandon
https://arxiv.org/abs/2506.04450 https://mastoxiv.page/@arXiv_csCR_bot/114635189706505648
- L-MARS: Legal Multi-Agent Workflow with Orchestrated Reasoning and Agentic Search
Ziqi Wang, Boqin Yuan
https://arxiv.org/abs/2509.00761 https://mastoxiv.page/@arXiv_csAI_bot/115140304787881576
- Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking
Han, Huang, Liao, Jiang, Lu, Zhao, Wang, Zhou, Jiang, Liang, Zhou, Sun, Yu, Xiao
https://arxiv.org/abs/2509.23392 https://mastoxiv.page/@arXiv_csAI_bot/115293169353788311
- Person-Centric Annotations of LAION-400M: Auditing Bias and Its Transfer to Models
Leander Girrbach, Stephan Alaniz, Genevieve Smith, Trevor Darrell, Zeynep Akata
https://arxiv.org/abs/2510.03721 https://mastoxiv.page/@arXiv_csCV_bot/115332690912652473
- Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
Zhang, Hu, Upasani, Ma, Hong, Kamanuru, Rainton, Wu, Ji, Li, Thakker, Zou, Olukotun
https://arxiv.org/abs/2510.04618 https://mastoxiv.page/@arXiv_csLG_bot/115332999596603375
- Mitigating Premature Exploitation in Particle-based Monte Carlo for Inference-Time Scaling
Giannone, Xu, Nayak, Awhad, Sudalairaj, Xu, Srivastava
https://arxiv.org/abs/2510.05825 https://mastoxiv.page/@arXiv_csLG_bot/115338159696513898
- Complete asymptotic type-token relationship for growing complex systems with inverse power-law co...
Pablo Rosillo-Rodes, Laurent Hébert-Dufresne, Peter Sheridan Dodds
https://arxiv.org/abs/2511.02069 https://mastoxiv.page/@arXiv_physicssocph_bot/115496283627867809
- ViPRA: Video Prediction for Robot Actions
Sandeep Routray, Hengkai Pan, Unnat Jain, Shikhar Bahl, Deepak Pathak
https://arxiv.org/abs/2511.07732 https://mastoxiv.page/@arXiv_csRO_bot/115535941444003568
- AISAC: An Integrated multi-agent System for Transparent, Retrieval-Grounded Scientific Assistance
Chandrachur Bhattacharya, Sibendu Som
https://arxiv.org/abs/2511.14043
- VideoARM: Agentic Reasoning over Hierarchical Memory for Long-Form Video Understanding
Yufei Yin, Qianke Meng, Minghao Chen, Jiajun Ding, Zhenwei Shao, Zhou Yu
https://arxiv.org/abs/2512.12360 https://mastoxiv.page/@arXiv_csCV_bot/115729238732682644
- RadImageNet-VQA: A Large-Scale CT and MRI Dataset for Radiologic Visual Question Answering
Léo Butsanets, Charles Corbière, Julien Khlaut, Pierre Manceron, Corentin Dancette
https://arxiv.org/abs/2512.17396 https://mastoxiv.page/@arXiv_csCV_bot/115762705911757243
- Measuring all the noises of LLM Evals
Sida Wang
https://arxiv.org/abs/2512.21326 https://mastoxiv.page/@arXiv_csLG_bot/115779597137785637
Forty years ago, 21 people gathered for the first meeting of what became the Internet Engineering Task Force, or #IETF. Every day, billions of people use the open standards and technologies developed in the IETF. And nearly 8000 volunteer IETF participants from around the world collaborate in more than 100 working groups, evolving those open standards and making the Internet work better!
Senior military cyber operator removed from Russia task force https://therecord.media/senior-military-cyber-op-removed-russia-task-force
Statistical Query Lower Bounds for Smoothed Agnostic Learning
Ilias Diakonikolas, Daniel M. Kane
https://arxiv.org/abs/2602.21191 https://arxiv.org/pdf/2602.21191 https://arxiv.org/html/2602.21191
arXiv:2602.21191v1 Announce Type: new
Abstract: We study the complexity of smoothed agnostic learning, recently introduced by~\cite{CKKMS24}, in which the learner competes with the best classifier in a target class under slight Gaussian perturbations of the inputs. Specifically, we focus on the prototypical task of agnostically learning halfspaces under subgaussian distributions in the smoothed model. The best known upper bound for this problem relies on $L_1$-polynomial regression and has complexity $d^{\tilde{O}(1/\sigma^2) \log(1/\epsilon)}$, where $\sigma$ is the smoothing parameter and $\epsilon$ is the excess error. Our main result is a Statistical Query (SQ) lower bound providing formal evidence that this upper bound is close to best possible. In more detail, we show that (even for Gaussian marginals) any SQ algorithm for smoothed agnostic learning of halfspaces requires complexity $d^{\Omega(1/\sigma^{2} \log(1/\epsilon))}$. This is the first non-trivial lower bound on the complexity of this task and nearly matches the known upper bound. Roughly speaking, we show that applying $L_1$-polynomial regression to a smoothed version of the function is essentially best possible. Our techniques involve finding a moment-matching hard distribution by way of linear programming duality. This dual program corresponds exactly to finding a low-degree approximating polynomial to the smoothed version of the target function (which turns out to be the same condition required for the $L_1$-polynomial regression to work). Our explicit SQ lower bound then comes from proving lower bounds on this approximation degree for the class of halfspaces.
What if our networks could do more than just carry data?
In December, the Fibre Sensing Task of the GÉANT (GN5-2) Project, together with SURF @… turned 57 km of live optical fibre into a sensor.
The result? Detecting everything from trams to a plane landing at Amsterdam Airport Schiphol.
🎥 Watch Chris Atherton walk us through the experiment.
Oh well, our upcoming client doesn't provide cc files for each shot. Instead, we need to extract the grading values from an EDL file. Fortunately, I already wrote a script to do that a few years ago for another show.
Writing a script might take more time than doing the task manually (here, copying values from a text file to an XML file). But it pays off if you have to repeat the task, even if that is 7 years later.
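For flavor, a minimal sketch of that kind of script: CMX3600-style EDLs often carry grading values as *ASC_SOP and *ASC_SAT comment lines, which a short parser can pull out. The sample EDL text and field layout below are assumptions for illustration, not the show's actual data.

```python
import re

# Invented two-line sample in the common CMX3600 + ASC CDL comment style.
EDL = """\
001  A001C002 V     C        01:00:00:00 01:00:04:00 01:00:00:00 01:00:04:00
*ASC_SOP (1.0120 0.9890 1.0003)(0.0021 -0.0010 0.0000)(0.9800 1.0100 1.0000)
*ASC_SAT 0.9500
"""

def parse_cdl(edl_text):
    events, current = [], None
    for line in edl_text.splitlines():
        m = re.match(r"(\d+)\s+(\S+)", line)
        if m:  # event header: event number, then reel/clip name
            current = {"event": m.group(1), "reel": m.group(2)}
            events.append(current)
        elif line.startswith("*ASC_SOP") and current is not None:
            # Nine decimals: slope, offset, power, three values each.
            nums = [float(x) for x in re.findall(r"-?\d+\.\d+", line)]
            current["slope"], current["offset"], current["power"] = (
                nums[0:3], nums[3:6], nums[6:9])
        elif line.startswith("*ASC_SAT") and current is not None:
            current["sat"] = float(line.split()[1])
    return events

print(parse_cdl(EDL))
```

From there, writing the values into the target XML is a loop over the parsed events.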
Oh happy task. #acadamicchatter
So I’ve got a new gig of sorts. I’ll be a volunteer photographer for the city parks system. The task will be to take pics of folks participating in various programs in the parks and natural areas run by other volunteers, to try and capture that attendees are having fun, especially the kids. The difficulty level is that I’ve never liked photographing people. Go out of my way to keep them out of shots. I’ve done a couple events now, and still just feel intrusive. Hope it gets easier.
Automating Computational Chemistry Workflows via OpenClaw and Domain-Specific Skills
Mingwei Ding, Chen Huang, Yibo Hu, Yifan Li, Zitian Lu, Xingtai Yu, Duo Zhang, Wenxi Zhai, Tong Zhu, Qiangqiang Gu, Jinzhe Zeng
https://arxiv.org/abs/2603.25522 https://arxiv.org/pdf/2603.25522 https://arxiv.org/html/2603.25522
arXiv:2603.25522v1 Announce Type: new
Abstract: Automating multistep computational chemistry tasks remains challenging because reasoning, workflow specification, software execution, and high-performance computing (HPC) execution are often tightly coupled. We demonstrate a decoupled agent-skill design for computational chemistry automation leveraging OpenClaw. Specifically, OpenClaw provides centralized control and supervision; schema-defined planning skills translate scientific goals into executable task specifications; domain skills encapsulate specific computational chemistry procedures; and DPDispatcher manages job execution across heterogeneous HPC environments. In a molecular dynamics (MD) case study of methane oxidation, the system completed cross-tool execution, bounded recovery from runtime failures, and reaction network extraction, illustrating a scalable and maintainable approach to multistep computational chemistry automation.
Quite tempted to occupy a section of Tony Blair’s property and build a house on it. You know, so he can really understand the task he has so gleefully taken on.
Obviously, I wouldn’t really do this. I’d be stuck with dreadful neighbours.
Exploring Performance-Productivity Trade-offs in AMT Runtimes: A Task Bench Study of Itoyori, ItoyoriFBC, HPX, and MPI
Torben R. Lahnor, Mia Reitz, Jonas Posner, Patrick Diehl
https://arxiv.org/abs/2601.14608
Today I made a 2-line change to a file on GitHub. Copilot suggested a spectacularly incorrect summary of my change for the commit message, so I deleted it, finished the commit, and asked Copilot "how do I disable Copilot commit message suggestions." THAT task was in its wheelhouse. #AIslop
Cowboys DC Christian Parker on new scheme: 'You build it around the players' https://www.nfl.com/news/cowboys-dc-christian-parker-new-scheme-you-build-it-around-the-players
OK... news from the "involuntary admin" front.
Info I had from my partner's mother:
"I can do nothing on my laptop any more. There's always a message 'no permission' when I try X or Y. And it said deleting cookies may help, so I tried. But something went wrong and I was afraid to continue"
What I saw when I checked the laptop:
- Firefox icon in task bar and desktop blank
- Chrome icon in task bar and desktop blank
- Virus scanner…
🎧 Here's a short excerpt from this week's #podcast, where I use my own example of addressing procrastination- finally taking action on a task that's been on my list for weeks!
https://youtu.be/KJSaXw4CUEs…
I've just finished a modestly onerous administrative task.
It led me to think about effort and the three legs of my job (research, teaching, and admin): while there are intrinsic reasons to do research and teaching to the best of your ability, for admin "good enough" is enough. Focus on efficiency, not excellence, where efficiency includes making things work and not causing extra hassle in the medium term.
Flipping my task-priority list: waiting on a colleague because he's the admin on a server I still can't access. So I wrote a bash script he can run to add me to sudoers. Hard to fix the CSS on a server when you can neither get in nor make changes in nano (I know, there's vim too, but that's not where my muscle memory lives).
Been another eventful week around the "new" house. Got the shed built. What a pain in the ass metal sheds are. This one was no different. Shout-out to whoever decided it was a great idea to plastic wrap every painted panel like they were PC case panels. I hope you stub a pinky toe every other day for the rest of your life.
With that done, next task was to put up the shelf frames I brought with me. Was able to get 3 shelves cut out of the former back porch ramp plywood. Got 3…
RE: #uspol
Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs
Yining Hong, Huang Huang, Manling Li, Li Fei-Fei, Jiajun Wu, Yejin Choi
https://arxiv.org/abs/2602.21198 https://arxiv.org/pdf/2602.21198 https://arxiv.org/html/2602.21198
arXiv:2602.21198v1 Announce Type: new
Abstract: Embodied LLMs endow robots with high-level task reasoning, but they cannot reflect on what went wrong or why, turning deployment into a sequence of independent trials where mistakes repeat rather than accumulate into experience. Drawing upon human reflective practitioners, we introduce Reflective Test-Time Planning, which integrates two modes of reflection: \textit{reflection-in-action}, where the agent uses test-time scaling to generate and score multiple candidate actions using internal reflections before execution; and \textit{reflection-on-action}, which uses test-time training to update both its internal reflection model and its action policy based on external reflections after execution. We also include retrospective reflection, allowing the agent to re-evaluate earlier decisions and perform model updates with hindsight for proper long-horizon credit assignment. Experiments on our newly-designed Long-Horizon Household benchmark and MuJoCo Cupboard Fitting benchmark show significant gains over baseline models, with ablative studies validating the complementary roles of reflection-in-action and reflection-on-action. Qualitative analyses, including real-robot trials, highlight behavioral correction through reflection.
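At its core, the "reflection-in-action" loop described above reduces to sampling candidate actions and scoring them with an internal critic before execution. A deliberately tiny, generic sketch; the candidate list and the critic are stand-ins, not the paper's models:

```python
def reflect_in_action(candidates, critic):
    # Score every candidate action with the internal critic, then
    # return the best-scoring one for execution.
    scored = [(critic(a), a) for a in candidates]
    return max(scored)[1]

critic = lambda plan: -len(plan)  # toy critic: prefer shorter plans
actions = ["open fridge then cupboard", "open cupboard"]
print(reflect_in_action(actions, critic))  # -> open cupboard
```

The paper's reflection-on-action mode would additionally update the critic and the policy from external feedback after execution.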
Raiders Get Compelling Words Over Defensive Coordinator Search https://heavy.com/sports/nfl/las-vegas-raiders/compelling-words-defensive-coordinator-search/
AOC: It’s our task to figure out how to claw back what has essentially supercharged this agency into becoming a relentless domestic paramilitary that is also a blank check to Palantir to create facial-recognition scans on US citizens.
Source:
https://reddit.com/comments/1qug0xd
Prime Minister of Canada Mark Carney speech at Davos. It was a good one. This is how he ended it, but it is worth watching in full, including the Q&A afterward.
“We know the old order is not coming back. We shouldn’t mourn it. Nostalgia is not a strategy, but we believe that from the fracture we can build something bigger, better, stronger, more just. This is the task of the middle powers, the countries that have the most to lose from a world of fortresses and the most to gain from genuine cooperation.
The powerful have their power. But we have something too: the capacity to stop pretending, to name realities, to build our strength at home, and to act together.
That is Canada’s path. We choose it openly and confidently, and it is a path wide open to any country willing to take it with us.”
#CanPoli #CdnPoli #Canada #USA
From Isolation to Integration: Building an Adaptive Expert Forest for Pre-Trained Model-based Class-Incremental Learning
Ruiqi Liu, Boyu Diao, Hangda Liu, Zhulin An, Fei Wang, Yongjun Xu
https://arxiv.org/abs/2602.20911 https://arxiv.org/pdf/2602.20911 https://arxiv.org/html/2602.20911
arXiv:2602.20911v1 Announce Type: new
Abstract: Class-Incremental Learning (CIL) requires models to learn new classes without forgetting old ones. A common method is to freeze a pre-trained model and train a new, lightweight adapter for each task. While this prevents forgetting, it treats the learned knowledge as a simple, unstructured collection and fails to use the relationships between tasks. To this end, we propose the Semantic-guided Adaptive Expert Forest (SAEF), a new method that organizes adapters into a structured hierarchy for better knowledge sharing. SAEF first groups tasks into conceptual clusters based on their semantic relationships. Then, within each cluster, it builds a balanced expert tree by creating new adapters from merging the adapters of similar tasks. At inference time, SAEF finds and activates a set of relevant experts from the forest for any given input. The final prediction is made by combining the outputs of these activated experts, weighted by how confident each expert is. Experiments on several benchmark datasets show that SAEF achieves SOTA performance.
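The inference rule sketched above (combine activated experts' outputs weighted by confidence) might look roughly like this; the confidence measure here, a softmax over each expert's max score, is an assumption, not necessarily the paper's choice.

```python
import math

def combine(expert_logits):
    # expert_logits: one per-class score list per activated expert.
    confidences = [max(scores) for scores in expert_logits]
    # Softmax-normalize confidences into mixture weights.
    z = sum(math.exp(c) for c in confidences)
    weights = [math.exp(c) / z for c in confidences]
    n_classes = len(expert_logits[0])
    # Confidence-weighted sum of the experts' class scores.
    return [sum(w * scores[i] for w, scores in zip(weights, expert_logits))
            for i in range(n_classes)]

combined = combine([[2.0, 0.1, 0.3],   # confident expert, favors class 0
                    [0.2, 0.4, 0.1]])  # low-confidence expert
print(combined.index(max(combined)))  # -> 0
```

The confident expert dominates the mixture, so its preferred class wins the final prediction.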
Genus-0 Surface Parameterization using Spherical Beltrami Differentials
Zhehao Xu, Lok Ming Lui
https://arxiv.org/abs/2602.01589 https://arxiv.org/pdf/2602.01589 https://arxiv.org/html/2602.01589
arXiv:2602.01589v1 Announce Type: new
Abstract: Spherical surface parameterization is a fundamental tool in geometry processing and imaging science. For a genus-0 closed surface, many efficient algorithms can map the surface to the sphere; consequently, a broad class of task-driven genus-0 mapping problems can be reduced to constructing a high-quality spherical self-map. However, existing approaches often face a trade-off between satisfying task objectives (e.g., landmark or feature alignment), maintaining bijectivity, and controlling geometric distortion. We introduce the Spherical Beltrami Differential (SBD), a two-chart representation of quasiconformal self-maps of the sphere, and establish its correspondence with spherical homeomorphisms up to conformal automorphisms. Building on the Spectral Beltrami Network (SBN), we propose a neural optimization framework BOOST that optimizes two Beltrami fields on hemispherical stereographic charts and enforces global consistency through explicit seam-aware constraints. Experiments on large-deformation landmark matching and intensity-based spherical registration demonstrate the effectiveness of our proposed framework. We further apply the method to brain cortical surface registration, aligning sulcal landmarks and jointly matching cortical sulci depth maps, showing improved task fidelity with controlled distortion and robust bijective behavior.
Things you're able to do in a VFX pipeline, but that would suck:
1. work with internal shot names that are different from what the client is using.
2. change version numbers of files sent out to the client to hide your internal number of revisions.
Things that make a good VFX pipeline:
1. have artists work with the same task names ("comp_v03") across shows and have scripts rename files you upload if a client demands it ("cmp_v0003")
The forme…
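Point 1 of the second list (keep one internal naming convention, remap on upload when the client wants another) can be sketched in a few lines; the internal-to-client vocabulary and the zero-padding rule are made-up examples.

```python
import re

CLIENT_TASK_NAMES = {"comp": "cmp"}  # internal -> client vocabulary (invented)

def to_client_name(filename):
    # Rewrite "<task>_v<NN>" tokens to the client's vocabulary and
    # zero-pad the version to four digits ("comp_v03" -> "cmp_v0003").
    def repl(m):
        task, version = m.group(1), int(m.group(2))
        return f"{CLIENT_TASK_NAMES.get(task, task)}_v{version:04d}"
    return re.sub(r"([a-z]+)_v(\d+)", repl, filename)

print(to_client_name("sh010_comp_v03.exr"))  # -> sh010_cmp_v0003.exr
```

Artists never see the client convention; the upload script applies it at the boundary.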
Exploring the Impact of Parameter Update Magnitude on Forgetting and Generalization of Continual Learning
JinLi He, Liang Bai, Xian Yang
https://arxiv.org/abs/2602.20796 https://arxiv.org/pdf/2602.20796 https://arxiv.org/html/2602.20796
arXiv:2602.20796v1 Announce Type: new
Abstract: The magnitude of parameter updates is considered a key factor in continual learning. However, most existing studies focus on designing diverse update strategies, while a theoretical understanding of the underlying mechanisms remains limited. Therefore, we characterize a model's forgetting from the perspective of parameter update magnitude and formalize it as knowledge degradation induced by task-specific drift in the parameter space, which has not been fully captured in previous studies due to their assumption of a unified parameter space. By deriving the optimal parameter update magnitude that minimizes forgetting, we unify two representative update paradigms, frozen training and initialized training, within an optimization framework for constrained parameter updates. Our theoretical results further reveal that sequence tasks with small parameter distances exhibit better generalization and less forgetting under frozen training rather than initialized training. These theoretical insights inspire a novel hybrid parameter update strategy that adaptively adjusts update magnitude based on gradient directions. Experiments on deep neural networks demonstrate that this hybrid approach outperforms standard training strategies, providing new theoretical perspectives and practical inspiration for designing efficient and scalable continual learning algorithms.
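As a loose illustration of the closing idea (adjusting update magnitude based on gradient directions), one could damp an update when its gradient conflicts with the previous task's gradient. This is a toy sketch, not the paper's algorithm:

```python
def cosine(u, v):
    # Cosine similarity between two gradient vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def hybrid_step(params, grad, prev_grad, lr=0.1):
    # Full step when the new gradient aligns with the previous task's
    # gradient; shrink the step as the directions conflict.
    sim = cosine(grad, prev_grad)
    scale = lr if sim >= 0 else lr * (1 + sim)
    return [p - scale * g for p, g in zip(params, grad)]

print(hybrid_step([1.0, 1.0], [1.0, 0.0], [1.0, 0.0]))   # aligned: full step
print(hybrid_step([1.0, 1.0], [1.0, 0.0], [-1.0, 0.0]))  # opposed: no step
```

Directly opposed gradients (cosine -1) zero out the update, protecting the previous task's parameters.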
Proc3D: Procedural 3D Generation and Parametric Editing of 3D Shapes with Large Language Models
Fadlullah Raji, Stefano Petrangeli, Matheus Gadelha, Yu Shen, Uttaran Bhattacharya, Gang Wu
https://arxiv.org/abs/2601.12234 https://arxiv.org/pdf/2601.12234 https://arxiv.org/html/2601.12234
arXiv:2601.12234v1 Announce Type: new
Abstract: Generating 3D models has traditionally been a complex task requiring specialized expertise. While recent advances in generative AI have sought to automate this process, existing methods produce non-editable representations, such as meshes or point clouds, limiting their adaptability for iterative design. In this paper, we introduce Proc3D, a system designed to generate editable 3D models while enabling real-time modifications. At its core, Proc3D introduces the procedural compact graph (PCG), a graph representation of 3D models, which encodes the algorithmic rules and structures necessary for generating the model. This representation exposes key parameters, allowing intuitive manual adjustments via sliders and checkboxes, as well as real-time, automated modifications through natural language prompts using Large Language Models (LLMs). We demonstrate Proc3D's capabilities using two generative approaches: GPT-4o with in-context learning (ICL) and a fine-tuned LLAMA-3 model. Experimental results show that Proc3D outperforms existing methods in editing efficiency, achieving more than 400x speedup over conventional approaches that require full regeneration for each modification. Additionally, Proc3D improves ULIP scores by 28%, a metric that evaluates the alignment between generated 3D models and text prompts. By enabling text-aligned 3D model generation along with precise, real-time parametric edits, Proc3D facilitates highly accurate text-based image editing applications.
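The core idea of a procedural representation, a shape as a small parametric program with exposed knobs, can be illustrated with a toy example; the "table" node set and parameters below are invented, not Proc3D's actual representation.

```python
def make_table(leg_height=0.7, top_width=1.2, n_legs=4):
    # A shape as a parametric program: the exposed parameters are the
    # "sliders" an editor (or an LLM) can adjust without regeneration.
    top = {"type": "box", "w": top_width, "h": 0.05}
    legs = [{"type": "cylinder", "h": leg_height} for _ in range(n_legs)]
    return {"top": top, "legs": legs}

model = make_table()
edited = make_table(leg_height=0.9)  # parametric edit: turn one knob
print(len(edited["legs"]), edited["legs"][0]["h"])  # -> 4 0.9
```

This is what makes parametric edits cheap: re-running a tiny program with one changed argument, rather than regenerating a mesh from scratch.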
Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[4/6]:
- Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints
Chuqin Geng, Li Zhang, Mark Zhang, Haolin Ye, Ziyu Zhao, Xujie Si
https://arxiv.org/abs/2602.16954 https://mastoxiv.page/@arXiv_csLG_bot/116102434757760085
- Multi-Probe Zero Collision Hash (MPZCH): Mitigating Embedding Collisions and Enhancing Model Fres...
Ziliang Zhao, et al.
https://arxiv.org/abs/2602.17050 https://mastoxiv.page/@arXiv_csLG_bot/116102517335590034
- MASPO: Unifying Gradient Utilization, Probability Mass, and Signal Reliability for Robust and Sam...
Fu, Lin, Fang, Zheng, Hu, Shao, Qin, Pan, Zeng, Cai
https://arxiv.org/abs/2602.17550 https://mastoxiv.page/@arXiv_csLG_bot/116102581561441103
- A Theoretical Framework for Modular Learning of Robust Generative Models
Corinna Cortes, Mehryar Mohri, Yutao Zhong
https://arxiv.org/abs/2602.17554 https://mastoxiv.page/@arXiv_csLG_bot/116102582216715527
- Multi-Round Human-AI Collaboration with User-Specified Requirements
Sima Noorani, Shayan Kiyani, Hamed Hassani, George Pappas
https://arxiv.org/abs/2602.17646 https://mastoxiv.page/@arXiv_csLG_bot/116102592047544971
- NEXUS: A compact neural architecture for high-resolution spatiotemporal air quality forecasting i...
Rampunit Kumar, Aditya Maheshwari
https://arxiv.org/abs/2602.19654 https://mastoxiv.page/@arXiv_csLG_bot/116125610403473755
- Augmenting Lateral Thinking in Language Models with Humor and Riddle Data for the BRAINTEASER Task
Mina Ghashami, Soumya Smruti Mishra
https://arxiv.org/abs/2405.10385 https://mastoxiv.page/@arXiv_csCL_bot/112472190479013167
- Watermarking Language Models with Error Correcting Codes
Patrick Chao, Yan Sun, Edgar Dobriban, Hamed Hassani
https://arxiv.org/abs/2406.10281 https://mastoxiv.page/@arXiv_csCR_bot/112636307340218522
- Learning to Control Unknown Strongly Monotone Games
Siddharth Chandak, Ilai Bistritz, Nicholas Bambos
https://arxiv.org/abs/2407.00575 https://mastoxiv.page/@arXiv_csMA_bot/112715733875586837
- Classification and reconstruction for single-pixel imaging with classical and quantum neural netw...
Sofya Manko, Dmitry Frolovtsev
https://arxiv.org/abs/2407.12506 https://mastoxiv.page/@arXiv_quantph_bot/112806295477530195
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation
Weichen Wu, Gen Li, Yuting Wei, Alessandro Rinaldo
https://arxiv.org/abs/2410.16106 https://mastoxiv.page/@arXiv_statML_bot/113350611306532443
- Big data approach to Kazhdan-Lusztig polynomials
Abel Lacabanne, Daniel Tubbenhauer, Pedro Vaz
https://arxiv.org/abs/2412.01283 https://mastoxiv.page/@arXiv_mathRT_bot/113587812663608119
- MoEMba: A Mamba-based Mixture of Experts for High-Density EMG-based Hand Gesture Recognition
Mehran Shabanpour, Kasra Rad, Sadaf Khademi, Arash Mohammadi
https://arxiv.org/abs/2502.17457 https://mastoxiv.page/@arXiv_eessSP_bot/114069047434302054
- Tightening Optimality gap with confidence through conformal prediction
Miao Li, Michael Klamkin, Russell Bent, Pascal Van Hentenryck
https://arxiv.org/abs/2503.04071 https://mastoxiv.page/@arXiv_statML_bot/114120074927291283
- SEED: Towards More Accurate Semantic Evaluation for Visual Brain Decoding
Juhyeon Park, Peter Yongho Kim, Jiook Cha, Shinjae Yoo, Taesup Moon
https://arxiv.org/abs/2503.06437 https://mastoxiv.page/@arXiv_csCV_bot/114142690988862508
- How much does context affect the accuracy of AI health advice?
Prashant Garg, Thiemo Fetzer
https://arxiv.org/abs/2504.18310 https://mastoxiv.page/@arXiv_econGN_bot/114414380916957986
- Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification
Daniel J. Strick, Carlos Garcia, Anthony Huang, Thomas Gardos
https://arxiv.org/abs/2505.06646 https://mastoxiv.page/@arXiv_eessIV_bot/114499319986528625
- Sharp Gaussian approximations for Decentralized Federated Learning
Soham Bonnerjee, Sayar Karmakar, Wei Biao Wu
https://arxiv.org/abs/2505.08125 https://mastoxiv.page/@arXiv_statML_bot/114505047719395949
- HoloLLM: Multisensory Foundation Model for Language-Grounded Human Sensing and Reasoning
Chuhao Zhou, Jianfei Yang
https://arxiv.org/abs/2505.17645 https://mastoxiv.page/@arXiv_csCV_bot/114572928659057348
- A Copula Based Supervised Filter for Feature Selection in Diabetes Risk Prediction Using Machine ...
Agnideep Aich, Md Monzur Murshed, Sameera Hewage, Amanda Mayeaux
https://arxiv.org/abs/2505.22554 https://mastoxiv.page/@arXiv_statML_bot/114589983451462525
- Synthesis of discrete-continuous quantum circuits with multimodal diffusion models
Florian Fürrutter, Zohim Chandani, Ikko Hamamura, Hans J. Briegel, Gorka Muñoz-Gil
https://arxiv.org/abs/2506.01666 https://mastoxiv.page/@arXiv_quantph_bot/114618420761346125