
2025-10-14 09:51:03
On Website Technicals (2025-06) - Tech updates: Junited - Rigby to Buttersafe - GPTBot badness, captions, diversion delay, under-volt, X11 fossil. #Junited2025 - https://www.earth.org.uk/note-on-site-tech
Reading about Baldur von Schirach. Sounds familiar.
“In February 1928 he became a university group leader of the National Socialist German Students' League.”
“He worked to broaden the Nazi Party's appeal to the bourgeoisie. Schirach was supported by Hitler in internal elections, who also wanted the Nazi Party to have a broad social base.”
“Schirach was skilled at bureaucratic power struggles. He founded the School Children's Leagues (Schülerbünde) to create competition to the Hitler Youth. He made an ally of Joseph Goebbels.”
“Schirach was named national youth leader of the party in 1931.”
“With Heinrich Hoffmann, Schirach produced several propaganda books of Hoffmann's photographs, including "Hitler As No One Knows Him", "Youth Around Hitler", and "Hitler in His Mountains". Schirach wrote the captions. The books sold hundreds of thousands of copies, earning Schirach and Hoffmann substantial royalties.”
“On 16 June 1932, he was made Reichsführer of the Party's Hitler Youth organization, and resigned from the Student League. Under Schirach, the Hitler Youth stewarded NSDAP events, and 21 members died in 1932. Schirach described these deaths as "blood sacrifice" for propaganda purposes. One example was Herbert Norkus, a fifteen-year-old boy who was stabbed to death by Communists. In a 31 May 1932 speech, Schirach recounted Norkus's death and called for a "National Socialist dictatorship". Schirach gave a memorial speech on the third anniversary of Norkus's death in January 1935.”
#hitleryouth #fascism #theAmericanFascist
MiDashengLM: Efficient Audio Understanding with General Audio Captions
Heinrich Dinkel, Gang Li, Jizhong Liu, Jian Luan, Yadong Niu, Xingwei Sun, Tianzi Wang, Qiyang Xiao, Junbo Zhang, Jiahao Zhou
https://arxiv.org/abs/2508.03983
CapTune: Adapting Non-Speech Captions With Anchored Generative Models
Jeremy Zhengqi Huang, Caluã de Lacerda Pataca, Liang-Yuan Wu, Dhruv Jain
https://arxiv.org/abs/2508.19971
Addressing the ID-Matching Challenge in Long Video Captioning
Zhantao Yang, Huangji Wang, Ruili Feng, Han Zhang, Yuting Hu, Shangwen Zhu, Junyan Li, Yu Liu, Fan Cheng
https://arxiv.org/abs/2510.06973
It is really galling to watch a guy with a heavy German accent on a YouTube video and see that the automatically generated captions are basically perfect, while I can't dictate a single sentence in perfect English into my $1400 flagship device without making a correction.
#iOS26
Meta meta meta...
WTF is with every video having word-flash captions? The one in this toot is an example of one of multiple constant-flux caption styles. THAT'S NOT HOW PEOPLE READ!
I can barely watch such videos. https://journa.host/@lolgop/115119135266797749
On Website Technicals (2020-02) - Tech updates: GSC Review annoyance, CSS dark mode, video captions, lazy loading, srcset issues. - https://www.earth.org.uk/note-on-site-technicals-33.html
Reason #2608 I do not trust “AI” to generate captions or transcripts:
“Complete silence is always hallucinated as 'ترجمة نانسي قنقر' in Arabic which translates as 'Translation by Nancy Qunqar'”
More examples in replies.
#a11y #accessibility
MATRIX: Mask Track Alignment for Interaction-aware Video Generation
Siyoon Jin, Seongchan Kim, Dahyun Chung, Jaeho Lee, Hyunwook Choi, Jisu Nam, Jiyoung Kim, Seungryong Kim
https://arxiv.org/abs/2510.07310
Clarification as Supervision: Reinforcement Learning for Vision-Language Interfaces
John Gkountouras, Ivan Titov
https://arxiv.org/abs/2509.26594
BLUEX Revisited: Enhancing Benchmark Coverage with Automatic Captioning
João Guilherme Alves Santos, Giovana Kerche Bonás, Thales Sales Almeida
https://arxiv.org/abs/2508.21294
(YouTube, Chinese w/o captions but graphical English subtitles) Blind patient treated with ZM-02 optogenetic gene therapy #RP
Sound event detection with audio-text models and heterogeneous temporal annotations
Manu Harju, Annamaria Mesaros
https://arxiv.org/abs/2508.20703
Learning Transferable Facial Emotion Representations from Large-Scale Semantically Rich Captions
Licai Sun, Xingxun Jiang, Haoyu Chen, Yante Li, Zheng Lian, Biu Liu, Yuan Zong, Wenming Zheng, Jukka M. Leppänen, Guoying Zhao
https://arxiv.org/abs/2507.21015
LOTUS: A Leaderboard for Detailed Image Captioning from Quality to Societal Bias and User Preferences
Yusuke Hirota, Boyi Li, Ryo Hachiuma, Yueh-Hua Wu, Boris Ivanovic, Yuta Nakashima, Marco Pavone, Yejin Choi, Yu-Chiang Frank Wang, Chao-Han Huck Yang
https://arxiv.org/abs/2507.19362
Jamendo-QA: A Large-Scale Music Question Answering Dataset
Junyoung Koh, Soo Yong Kim, Yongwon Choi, Gyu Hyeong Choi
https://arxiv.org/abs/2509.15662
One Patch to Caption Them All: A Unified Zero-Shot Captioning Framework
Lorenzo Bianchi, Giacomo Pacini, Fabio Carrara, Nicola Messina, Giuseppe Amato, Fabrizio Falchi
https://arxiv.org/abs/2510.02898
Text2Move: Text-to-moving sound generation via trajectory prediction and temporal alignment
Yunyi Liu, Shaofan Yang, Kai Li, Xu Li
https://arxiv.org/abs/2509.21919
MAGIC-Enhanced Keyword Prompting for Zero-Shot Audio Captioning with CLIP Models
Vijay Govindarajan, Pratik Patel, Sahil Tripathi, Md Azizul Hoque, Gautam Siddharth Kashyap
https://arxiv.org/abs/2509.12591
Aligning Audio Captions with Human Preferences
Kartik Hegde, Rehana Mahfuz, Yinyi Guo, Erik Visser
https://arxiv.org/abs/2509.14659
Long Story Short: Disentangling Compositionality and Long-Caption Understanding in VLMs
Israfel Salazar, Desmond Elliott, Yova Kementchedjhieva
https://arxiv.org/abs/2509.19207
JEPA-T: Joint-Embedding Predictive Architecture with Text Fusion for Image Generation
Siheng Wan, Zhengtao Yao, Zhengdao Li, Junhao Dong, Yanshu Li, Yikai Li, Linshan Li, Haoyan Xu, Yijiang Li, Zhikang Dong, Huacan Wang, Jifeng Shen
https://arxiv.org/abs/2510.00974
From the Ministry of Truth:
#resist #authoritarianism #fascism #news
LucidFlux: Caption-Free Universal Image Restoration via a Large-Scale Diffusion Transformer
Song Fei, Tian Ye, Lujia Wang, Lei Zhu
https://arxiv.org/abs/2509.22414
Towards Robust Speech Recognition for Jamaican Patois Music Transcription
Jordan Madden, Matthew Stone, Dimitri Johnson, Daniel Geddez
https://arxiv.org/abs/2507.16834
SynC: Synthetic Image Caption Dataset Refinement with One-to-many Mapping for Zero-shot Image Captioning
Si-Woo Kim, MinJu Jeon, Ye-Chan Kim, Soeun Lee, Taewhan Kim, Dong-Jin Kim
https://arxiv.org/abs/2507.18616
Enhancing Remote Sensing Vision-Language Models Through MLLM and LLM-Based High-Quality Image-Text Dataset Generation
Yiguo He, Junjie Zhu, Yiying Li, Xiaoyu Zhang, Chunping Qiu, Jun Wang, Qiangjuan Huang, Ke Yang
https://arxiv.org/abs/2507.16716