A friend of mine built this thing called Pysey:
"A desktop emulator and development environment for Pygame-based video synthesis. PYSEY is for visual artists, musicians, and developers who work with EYESY-style video synthesis."
https://pysey-synth.com/
Chatterbox Turbo is amazing! Long gone are the days when I had to use Google #TTS / Speech Synthesis to listen to my ebooks. First it was Kokoro TTS, then F5-TTS, but now Chatterbox Turbo is king. It sounds realistic enough, but the best thing is that it is fast. I did an entire ebook in about an hour and a half.
Perfect for the
Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design
Bin Zhu, Qianghuai Jia, Tian Lan, Junyang Ren, Feng Gu, Feihu Jiang, Longyue Wang, Zhao Xu, Weihua Luo
https://arxiv.org/abs/2603.28376 https://arxiv.org/pdf/2603.28376 https://arxiv.org/html/2603.28376
arXiv:2603.28376v1 Announce Type: new
Abstract: Deep research agents autonomously conduct open-ended investigations, integrating complex information retrieval with multi-step reasoning across diverse sources to solve real-world problems. To sustain this capability on long-horizon tasks, reliable verification is critical during both training and inference. A major bottleneck in existing paradigms stems from the lack of explicit verification mechanisms in QA data synthesis, trajectory construction, and test-time scaling. Errors introduced at each stage propagate downstream and degrade the overall agent performance. To address this, we present Marco DeepResearch, a deep research agent optimized with a verification-centric framework design at three levels: (1) QA Data Synthesis: We introduce verification mechanisms to graph-based and agent-based QA synthesis to control question difficulty while ensuring answers are unique and correct; (2) Trajectory Construction: We design a verification-driven trajectory synthesis method that injects explicit verification patterns into training trajectories; and (3) Test-time scaling: We use Marco DeepResearch itself as a verifier at inference time and effectively improve performance on challenging questions. Extensive experimental results demonstrate that our proposed Marco DeepResearch agent significantly outperforms 8B-scale deep research agents on most challenging benchmarks, such as BrowseComp and BrowseComp-ZH. Crucially, under a maximum budget of 600 tool calls, Marco DeepResearch even surpasses or approaches several 30B-scale agents, like Tongyi DeepResearch-30B.
toXiv_bot_toot
A domain in $\mathbb C^4$ and its connection with $\mu$-synthesis problem
Sourav Pal, Nitin Tomar
https://arxiv.org/abs/2603.01483 https://arxiv.org/pdf/26…
Learning to Build Shapes by Extrusion
Thor Vestergaard Christiansen, Karran Pandey, Alba Reinders, Karan Singh, Morten Rieger Hannemose, J. Andreas Bærentzen
https://arxiv.org/abs/2601.22858 https://arxiv.org/pdf/2601.22858 https://arxiv.org/html/2601.22858
arXiv:2601.22858v1 Announce Type: new
Abstract: We introduce Text Encoded Extrusion (TEE), a text-based representation that expresses mesh construction as sequences of face extrusions rather than polygon lists, and a method for generating 3D meshes from TEE using a large language model (LLM). By learning extrusion sequences that assemble a mesh, similar to the way artists create meshes, our approach naturally supports arbitrary output face counts and produces manifold meshes by design, in contrast to recent transformer-based models. The learnt extrusion sequences can also be applied to existing meshes - enabling editing in addition to generation. To train our model, we decompose a library of quadrilateral meshes with non-self-intersecting face loops into constituent loops, which can be viewed as their building blocks, and finetune an LLM on the steps for reassembling the meshes by performing a sequence of extrusions. We demonstrate that our representation enables reconstruction, novel shape synthesis, and the addition of new features to existing meshes.
Just got back from middle school STEAM night. I did "Analog Synthesis: The Science of Sound"
“Learn how electronic music was made before computers, using electrical signals instead of digital technology…”
This went so well! I set up an ARP, a Moog, and an oscilloscope. We did waveforms, R2-D2 squeals, Close Encounters, Stranger Things, etc. A teacher let me borrow a big DIY subwoofer and we rattled the windows of that 8th-grade math classroom! Haha!
So one of the next scopehal filters on deck for a revamp is the NCO. It takes in a frequency-vs-time waveform and outputs a frequency-modulated sine whose frequency tracks the input.
The refactoring was easy, but I'm struggling with how to parallelize it. It's trivial to parallelize synthesis of a fixed-frequency sine (since you can easily predict the phase at any future time), but if the frequency can change over time, I'm not sure how you could predict the future phase witho…
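One common way to frame the problem in the post above: the phase of a frequency-modulated sine is the running integral of the frequency, which in discrete form is a prefix sum, and prefix sums can be computed block-wise. The sketch below is illustrative only and is not scopehal code; the function names `nco` and `nco_blocked`, the rectangular integration, and the `block_size` parameter are all assumptions made for the example. It is written serially in plain Python, with comments marking the two passes that are independent per block.

```python
import math
from itertools import accumulate


def nco(freqs_hz, dt):
    """Serial reference: phase is the running sum of per-sample increments."""
    # Per-sample phase increments in radians (assumed rectangular integration).
    increments = [2.0 * math.pi * f * dt for f in freqs_hz]
    # Inclusive prefix sum: phases[n] = increments[0] + ... + increments[n].
    phases = accumulate(increments)
    return [math.sin(p) for p in phases]


def nco_blocked(freqs_hz, dt, block_size):
    """Same output, structured as a two-pass blocked prefix sum."""
    increments = [2.0 * math.pi * f * dt for f in freqs_hz]
    blocks = [increments[i:i + block_size]
              for i in range(0, len(increments), block_size)]
    # Pass 1: each block total depends only on its own samples,
    # so the blocks could be summed in parallel.
    totals = [sum(block) for block in blocks]
    # Short serial scan over the block totals gives each block's start phase.
    starts = [0.0] + list(accumulate(totals))[:-1]
    # Pass 2: each block scans locally from its start phase,
    # again independently of the other blocks.
    out = []
    for start, block in zip(starts, blocks):
        out.extend(math.sin(start + p) for p in accumulate(block))
    return out
```

Both functions should agree up to floating-point rounding, since only the order of additions differs. For long waveforms the accumulated phase grows large, so a real implementation would probably wrap it modulo 2π.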
"Experiences of Academics, Graduates, and Undergraduates in Using Generative AI in Research (Un)Ethically and (Ir)Responsibly: A Systematic Review of Qualitative Synthesis"
https://doi.org/10.1080/10572317.2026.2626122
“Smallwood’s signature synthesis of the Black gospel tradition and a host of musical influences, especially classical traditions, made his pen one of the most distinct in musical history.”
—Braxton Shelley, George Washington Williams Professor of Divinity and Music, commenting in Christianity Today on the death of gospel music giant Richard Smallwood
oooh look what i just found:-
Transform Images Into Sound in the browser
"SpectroTrace converts any image into audio using a fascinating technique called additive synthesis. When you play the generated sound through a spectrogram analyzer, the original image magically reappears." 🦟
https://www.spectrotrace.org/
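For readers curious how an image can become sound through additive synthesis, here is a minimal sketch of the general idea. It is not SpectroTrace's implementation: the function name `image_to_audio`, the frequency range, the sample rate, and the column duration are all hypothetical choices for illustration. Each image row drives one sine wave, each column is a slice of time, and pixel brightness sets the sine's loudness.

```python
import math


def image_to_audio(image, sample_rate=8000, column_seconds=0.05,
                   f_min=200.0, f_max=3000.0):
    """Render a grayscale image (rows of 0..1 brightness) as audio samples."""
    n_rows = len(image)
    n_cols = len(image[0])
    # One sine partial per row. The top row gets the highest frequency so
    # the picture appears upright when viewed in a spectrogram.
    if n_rows == 1:
        freqs = [f_min]
    else:
        step = (f_max - f_min) / (n_rows - 1)
        freqs = [f_max - step * r for r in range(n_rows)]
    samples_per_column = round(sample_rate * column_seconds)
    samples = []
    for c in range(n_cols):
        for i in range(samples_per_column):
            t = (c * samples_per_column + i) / sample_rate
            # Additive synthesis: sum the partials, each weighted by
            # the brightness of its pixel in the current column.
            s = sum(image[r][c] * math.sin(2.0 * math.pi * freqs[r] * t)
                    for r in range(n_rows))
            samples.append(s / n_rows)  # keep the mix within [-1, 1]
    return samples
```

Calling `image_to_audio([[1.0, 0.0], [0.0, 1.0]])` would yield a high tone followed by a low tone, which a spectrogram would draw as a diagonal from top left to bottom right.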
Replaced article(s) found for cs.GR. https://arxiv.org/list/cs.GR/new
[1/1]:
- BiPO: Bidirectional Partial Occlusion Network for Text-to-Motion Synthesis
Seong-Eun Hong, Soobin Lim, Juyeong Hwang, Minwook Chang, Hyeongyeop Kang
https://arxiv.org/abs/2412.00112 https://mastoxiv.page/@arXiv_csCV_bot/113587599346489745
GraphWalker: Agentic Knowledge Graph Question Answering via Synthetic Trajectory Curriculum
Shuwen Xu, Yao Xu, Jiaxiang Liu, Chenhao Yuan, Wenshuo Peng, Jun Zhao, Kang Liu
https://arxiv.org/abs/2603.28533 https://arxiv.org/pdf/2603.28533 https://arxiv.org/html/2603.28533
arXiv:2603.28533v1 Announce Type: new
Abstract: Agentic knowledge graph question answering (KGQA) requires an agent to iteratively interact with knowledge graphs (KGs), posing challenges in both training data scarcity and reasoning generalization. Specifically, existing approaches often restrict agent exploration: prompting-based methods lack autonomous navigation training, while current training pipelines usually confine reasoning to predefined trajectories. To this end, this paper proposes GraphWalker, a novel agentic KGQA framework that addresses these challenges through Automated Trajectory Synthesis and Stage-wise Fine-tuning. GraphWalker adopts a two-stage SFT training paradigm: First, the agent is trained on structurally diverse trajectories synthesized from constrained random-walk paths, establishing a broad exploration prior over the KG; Second, the agent is further fine-tuned on a small set of expert trajectories to develop reflection and error recovery capabilities. Extensive experiments demonstrate that our stage-wise SFT paradigm unlocks a higher performance ceiling for a lightweight reinforcement learning (RL) stage, enabling GraphWalker to achieve state-of-the-art performance on CWQ and WebQSP. Additional results on GrailQA and our constructed GraphWalkerBench confirm that GraphWalker enhances generalization to out-of-distribution reasoning paths. The code is publicly available at https://github.com/XuShuwenn/GraphWalker
Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[4/6]:
- Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints
Chuqin Geng, Li Zhang, Mark Zhang, Haolin Ye, Ziyu Zhao, Xujie Si
https://arxiv.org/abs/2602.16954 https://mastoxiv.page/@arXiv_csLG_bot/116102434757760085
- Multi-Probe Zero Collision Hash (MPZCH): Mitigating Embedding Collisions and Enhancing Model Fres...
Ziliang Zhao, et al.
https://arxiv.org/abs/2602.17050 https://mastoxiv.page/@arXiv_csLG_bot/116102517335590034
- MASPO: Unifying Gradient Utilization, Probability Mass, and Signal Reliability for Robust and Sam...
Fu, Lin, Fang, Zheng, Hu, Shao, Qin, Pan, Zeng, Cai
https://arxiv.org/abs/2602.17550 https://mastoxiv.page/@arXiv_csLG_bot/116102581561441103
- A Theoretical Framework for Modular Learning of Robust Generative Models
Corinna Cortes, Mehryar Mohri, Yutao Zhong
https://arxiv.org/abs/2602.17554 https://mastoxiv.page/@arXiv_csLG_bot/116102582216715527
- Multi-Round Human-AI Collaboration with User-Specified Requirements
Sima Noorani, Shayan Kiyani, Hamed Hassani, George Pappas
https://arxiv.org/abs/2602.17646 https://mastoxiv.page/@arXiv_csLG_bot/116102592047544971
- NEXUS: A compact neural architecture for high-resolution spatiotemporal air quality forecasting i...
Rampunit Kumar, Aditya Maheshwari
https://arxiv.org/abs/2602.19654 https://mastoxiv.page/@arXiv_csLG_bot/116125610403473755
- Augmenting Lateral Thinking in Language Models with Humor and Riddle Data for the BRAINTEASER Task
Mina Ghashami, Soumya Smruti Mishra
https://arxiv.org/abs/2405.10385 https://mastoxiv.page/@arXiv_csCL_bot/112472190479013167
- Watermarking Language Models with Error Correcting Codes
Patrick Chao, Yan Sun, Edgar Dobriban, Hamed Hassani
https://arxiv.org/abs/2406.10281 https://mastoxiv.page/@arXiv_csCR_bot/112636307340218522
- Learning to Control Unknown Strongly Monotone Games
Siddharth Chandak, Ilai Bistritz, Nicholas Bambos
https://arxiv.org/abs/2407.00575 https://mastoxiv.page/@arXiv_csMA_bot/112715733875586837
- Classification and reconstruction for single-pixel imaging with classical and quantum neural netw...
Sofya Manko, Dmitry Frolovtsev
https://arxiv.org/abs/2407.12506 https://mastoxiv.page/@arXiv_quantph_bot/112806295477530195
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation
Weichen Wu, Gen Li, Yuting Wei, Alessandro Rinaldo
https://arxiv.org/abs/2410.16106 https://mastoxiv.page/@arXiv_statML_bot/113350611306532443
- Big data approach to Kazhdan-Lusztig polynomials
Abel Lacabanne, Daniel Tubbenhauer, Pedro Vaz
https://arxiv.org/abs/2412.01283 https://mastoxiv.page/@arXiv_mathRT_bot/113587812663608119
- MoEMba: A Mamba-based Mixture of Experts for High-Density EMG-based Hand Gesture Recognition
Mehran Shabanpour, Kasra Rad, Sadaf Khademi, Arash Mohammadi
https://arxiv.org/abs/2502.17457 https://mastoxiv.page/@arXiv_eessSP_bot/114069047434302054
- Tightening Optimality gap with confidence through conformal prediction
Miao Li, Michael Klamkin, Russell Bent, Pascal Van Hentenryck
https://arxiv.org/abs/2503.04071 https://mastoxiv.page/@arXiv_statML_bot/114120074927291283
- SEED: Towards More Accurate Semantic Evaluation for Visual Brain Decoding
Juhyeon Park, Peter Yongho Kim, Jiook Cha, Shinjae Yoo, Taesup Moon
https://arxiv.org/abs/2503.06437 https://mastoxiv.page/@arXiv_csCV_bot/114142690988862508
- How much does context affect the accuracy of AI health advice?
Prashant Garg, Thiemo Fetzer
https://arxiv.org/abs/2504.18310 https://mastoxiv.page/@arXiv_econGN_bot/114414380916957986
- Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification
Daniel J. Strick, Carlos Garcia, Anthony Huang, Thomas Gardos
https://arxiv.org/abs/2505.06646 https://mastoxiv.page/@arXiv_eessIV_bot/114499319986528625
- Sharp Gaussian approximations for Decentralized Federated Learning
Soham Bonnerjee, Sayar Karmakar, Wei Biao Wu
https://arxiv.org/abs/2505.08125 https://mastoxiv.page/@arXiv_statML_bot/114505047719395949
- HoloLLM: Multisensory Foundation Model for Language-Grounded Human Sensing and Reasoning
Chuhao Zhou, Jianfei Yang
https://arxiv.org/abs/2505.17645 https://mastoxiv.page/@arXiv_csCV_bot/114572928659057348
- A Copula Based Supervised Filter for Feature Selection in Diabetes Risk Prediction Using Machine ...
Agnideep Aich, Md Monzur Murshed, Sameera Hewage, Amanda Mayeaux
https://arxiv.org/abs/2505.22554 https://mastoxiv.page/@arXiv_statML_bot/114589983451462525
- Synthesis of discrete-continuous quantum circuits with multimodal diffusion models
Florian Fürrutter, Zohim Chandani, Ikko Hamamura, Hans J. Briegel, Gorka Muñoz-Gil
https://arxiv.org/abs/2506.01666 https://mastoxiv.page/@arXiv_quantph_bot/114618420761346125
One-clock synthesis problems
Sławomir Lasota, Mathieu Lehaut, Julie Parreaux, Radosław Piórkowski
https://arxiv.org/abs/2601.04902 https://arxiv.org/pdf/2601.04902 https://arxiv.org/html/2601.04902
arXiv:2601.04902v1 Announce Type: new
Abstract: We study a generalisation of Büchi-Landweber games to the timed setting. The winning condition is specified by a non-deterministic timed automaton, and one of the players can elapse time. We perform a systematic study of synthesis problems in all variants of timed games, depending on which player's winning condition is specified, and which player's strategy (or controller, a finite-memory strategy) is sought. As our main result we prove ubiquitous undecidability in all the variants, both for strategy and controller synthesis, already for winning conditions specified by one-clock automata. This strengthens and generalises previously known undecidability results. We also fully characterise those cases where finite memory is sufficient to win, namely existence of a strategy implies existence of a controller. All our results are stated in the timed setting, while analogous results hold in the data setting where one-clock automata are replaced by one-register ones.
Crosslisted article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[3/3]:
- Functional Continuous Decomposition
Teymur Aghayev
https://arxiv.org/abs/2602.20857 https://mastoxiv.page/@arXiv_eessSP_bot/116130499236089653
- SpatiaLQA: A Benchmark for Evaluating Spatial Logical Reasoning in Vision-Language Models
Xie, Zhang, Shan, Zhu, Tang, Wei, Song, Wan, Song
https://arxiv.org/abs/2602.20901 https://mastoxiv.page/@arXiv_csCV_bot/116130845273808954
- Some Simple Economics of AGI
Christian Catalini, Xiang Hui, Jane Wu
https://arxiv.org/abs/2602.20946 https://mastoxiv.page/@arXiv_econGN_bot/116130470423837005
- Multimodal MRI Report Findings Supervised Brain Lesion Segmentation with Substructures
Yubin Ge, Yongsong Huang, Xiaofeng Liu
https://arxiv.org/abs/2602.20994 https://mastoxiv.page/@arXiv_eessIV_bot/116130212832138624
- MIP Candy: A Modular PyTorch Framework for Medical Image Processing
Tianhao Fu, Yucheng Chen
https://arxiv.org/abs/2602.21033 https://mastoxiv.page/@arXiv_csCV_bot/116130864279556063
- Empirically Calibrated Conditional Independence Tests
Milleno Pan, Antoine de Mathelin, Wesley Tansey
https://arxiv.org/abs/2602.21036 https://mastoxiv.page/@arXiv_statME_bot/116130690605113562
- Is Multi-Distribution Learning as Easy as PAC Learning: Sharp Rates with Bounded Label Noise
Rafael Hanashiro, Abhishek Shetty, Patrick Jaillet
https://arxiv.org/abs/2602.21039 https://mastoxiv.page/@arXiv_statML_bot/116130572661848449
- Position-Aware Sequential Attention for Accurate Next Item Recommendations
Timur Nabiev, Evgeny Frolov
https://arxiv.org/abs/2602.21052 https://mastoxiv.page/@arXiv_csIR_bot/116130263323086316
- Motivation is Something You Need
Mehdi Acheli, Walid Gaaloul
https://arxiv.org/abs/2602.21064 https://mastoxiv.page/@arXiv_csAI_bot/116130680774678580
- An Enhanced Projection Pursuit Tree Classifier with Visual Methods for Assessing Algorithmic Impr...
Natalia da Silva, Dianne Cook, Eun-Kyung Lee
https://arxiv.org/abs/2602.21130 https://mastoxiv.page/@arXiv_statML_bot/116130610674573081
- Complexity of Classical Acceleration for $\ell_1$-Regularized PageRank
Kimon Fountoulakis, David Martínez-Rubio
https://arxiv.org/abs/2602.21138 https://mastoxiv.page/@arXiv_mathOC_bot/116130547076073836
- LUMEN: Longitudinal Multi-Modal Radiology Model for Prognosis and Diagnosis
Jiang, Yang, Nath, Parida, Kulkarni, Xu, Xu, Anwar, Roth, Linguraru
https://arxiv.org/abs/2602.21142 https://mastoxiv.page/@arXiv_csCV_bot/116130871488694585
- A Benchmark for Deep Information Synthesis
Debjit Paul, et al.
https://arxiv.org/abs/2602.21143 https://mastoxiv.page/@arXiv_csAI_bot/116130692571594706
- Scaling State-Space Models on Multiple GPUs with Tensor Parallelism
Anurag Dutt, Nimit Shah, Hazem Masarani, Anshul Gandhi
https://arxiv.org/abs/2602.21144 https://mastoxiv.page/@arXiv_csDC_bot/116130520888343997
- Not Just How Much, But Where: Decomposing Epistemic Uncertainty into Per-Class Contributions
Mame Diarra Toure, David A. Stephens
https://arxiv.org/abs/2602.21160 https://mastoxiv.page/@arXiv_statML_bot/116130618512594211
- Aletheia tackles FirstProof autonomously
Tony Feng, et al.
https://arxiv.org/abs/2602.21201 https://mastoxiv.page/@arXiv_csAI_bot/116130705679345625
- Squint: Fast Visual Reinforcement Learning for Sim-to-Real Robotics
Abdulaziz Almuzairee, Henrik I. Christensen
https://arxiv.org/abs/2602.21203 https://mastoxiv.page/@arXiv_csRO_bot/116130765974498223
Crosslisted article(s) found for cs.FL. https://arxiv.org/list/cs.FL/new
[1/1]:
- Learning specifications for reactive synthesis with safety constraints
Kandai Watanabe, Nicholas Renninger, Sriram Sankaranarayanan, Morteza Lahijanian
https://arxiv.org/abs/2601.05533 https://mastoxiv.page/@arXiv_csRO_bot/115881279625696255